I am trying to run PySpark code in the itversity lab environment. My code uses pandas and numpy, but when I submit the job, it fails with an error saying the import of the pandas module failed.
So I added a small function that triggers the imports on each executor through an RDD:
from pyspark import SparkContext
import pandas as pd
import numpy as np

def import_dependencies(x):
    # force the imports on the executor to check they are available there
    import pandas as pd
    import numpy as np
    return x

sc = SparkContext()
int_rdd = sc.parallelize([1, 2, 3, 4])
int_rdd.map(import_dependencies).collect()
But even this failed with the same import error. How do I submit a PySpark job together with its dependencies?
Can anyone help me out?
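To show what I am aiming for: my understanding is that spark-submit can ship pure-Python dependencies to the executors as a zip passed via --py-files (packages with C extensions such as pandas and numpy generally cannot be shipped this way and need to be installed on the executors, or distributed as a packed environment). Here is a minimal sketch, using only the standard library, of how I would build such a zip; the directory and module names are just placeholders:

```python
import os
import zipfile

def build_py_files_zip(src_dir, zip_path):
    """Zip the .py files under src_dir for use with spark-submit --py-files.

    Paths inside the archive are stored relative to src_dir so that
    `import mymodule` resolves once Spark adds the zip to sys.path.
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                if name.endswith(".py"):
                    full = os.path.join(root, name)
                    zf.write(full, os.path.relpath(full, src_dir))
    return zip_path
```

I would then submit with something like `spark-submit --py-files deps.zip my_job.py`, but I am not sure this is the right approach for pandas/numpy in the lab environment.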