It looks like Spark 1.6 does not ship with a built-in CSV API. Could anyone help me? How do I install the external JAR files in my PyCharm workspace so that I can use read.csv in my current project? I am using Python 2.7.
read.csv is not working in PySpark with Spark 1.6 and Python 2.7
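# sc is an existing SparkContext; split every line on commas (quoted fields are not handled)
rdd = sc.textFile("/tmp/flight.csv").map(lambda x: x.split(","))
# the first row holds the column names
schema = rdd.first()
# drop the header row and keep only the data rows
data = rdd.filter(lambda x: x != schema)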
NOTE: this method does not infer the schema, so every column comes back as a string.
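Since everything comes back as a string, you will probably want to cast the columns yourself. Below is a minimal sketch in plain Python. The helper name cast_row, the three-column layout, and the sample rows are all made up for illustration; they are not taken from flight.csv, so adjust them to your file.

```python
def cast_row(row, casts):
    """Apply one converter per column; use None when a field cannot be converted."""
    out = []
    for value, cast in zip(row, casts):
        try:
            out.append(cast(value))
        except ValueError:
            # e.g. int("n/a"): keep the row but mark the field as missing
            out.append(None)
    return out

# Hypothetical layout: carrier (string), flight number (int), delay in minutes (float).
casts = [str, int, float]

# Stand-in for rows as they would come out of the split(",") step.
rows = [["AA", "101", "12.5"], ["DL", "n/a", "3"]]
typed = [cast_row(r, casts) for r in rows]
print(typed)
```

With the RDD above, the same helper should be usable as data.map(lambda r: cast_row(r, casts)), and the result can then be passed to sqlContext.createDataFrame together with the header list as column names. I have not run this against your file, so treat it as a starting point.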