read.csv is not working in PySpark with Spark 1.6 and Python 2.7


It looks like Spark 1.6 does not have a preloaded CSV API. Could anyone help me? How do I install the external JAR files in my PyCharm workspace so that I can use read.csv in my current project? I am using Python 2.7.


Try this.

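# Split each line on commas (assumes `sc` is an existing SparkContext).
rdd = sc.textFile("/tmp/flight.csv").map(lambda x: x.split(","))
# The first row holds the column names.
schema = rdd.first()
# Drop the header row, keeping only the data rows.
data = rdd.filter(lambda x: x != schema)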

NOTE: This method does not infer a schema, so every column comes back as a string. A plain split(",") will also break any field that contains a quoted comma.
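
If you do want header handling and schema inference on Spark 1.6, one option is the external Databricks spark-csv package, which is probably the JAR the question is about. The following is only a rough sketch, not a tested setup: the package coordinates (Scala 2.10 build, version 1.5.0) are an assumption about your Spark build, the helper names configure_spark_csv and read_csv are made up for illustration, and Spark needs network access to download the package the first time. In PyCharm, the environment variable has to be set before the SparkContext is created.

```python
import os

# Assumed coordinates: spark-csv built for Scala 2.10, version 1.5.0.
# Adjust these to match the Scala version of your Spark 1.6 build.
SPARK_CSV_PACKAGE = "com.databricks:spark-csv_2.10:1.5.0"


def configure_spark_csv():
    """Ask PySpark to fetch spark-csv at startup (hypothetical helper).

    Call this BEFORE creating the SparkContext, e.g. at the top of the
    script you run from PyCharm.
    """
    os.environ["PYSPARK_SUBMIT_ARGS"] = (
        "--packages %s pyspark-shell" % SPARK_CSV_PACKAGE
    )


def read_csv(sqlContext, path):
    """Load a CSV file into a DataFrame via spark-csv (hypothetical helper).

    `sqlContext` is an existing pyspark.sql.SQLContext.
    """
    return (
        sqlContext.read.format("com.databricks.spark.csv")
        .options(header="true", inferSchema="true")
        .load(path)
    )


# Typical use (requires a working Spark 1.6 installation):
#   configure_spark_csv()
#   from pyspark import SparkContext
#   from pyspark.sql import SQLContext
#   sc = SparkContext("local[*]", "csv-example")
#   df = read_csv(SQLContext(sc), "/tmp/flight.csv")
```

With header set to "true" the first row becomes the column names, and inferSchema asks the package to guess column types instead of leaving everything as strings.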