PySpark and PyCharm issue

pyspark
apache-spark

Basically, I am trying to read a CSV file and write it out as a Parquet file. On the edge node I am able to do that.

[shabigdata@gw03 ~]$ pyspark

# sc and sqlContext are predefined by the pyspark shell
t1 = sc.textFile("data001.txt")
t2 = t1.map(lambda x: x.split(","))
t3 = sqlContext.createDataFrame(t2)
t3.limit(5).show()
+--------+-----------------+----+-------------------+----+--------+
|      _1|               _2|  _3|                 _4|  _5|      _6|
+--------+-----------------+----+-------------------+----+--------+
|91900100| Aanonsen Hugh A |Misc|1998-01-17T00:00:00|2016|12392.02|
|91900101|Aaronson Julie F |Misc|1994-06-18T00:00:00|2016| 13537.5|
|91900102|   Abad Jr Joseph|Misc|2015-10-24T00:00:00|2016|84671.55|
|91900103|  Abad Fernando J|Misc|1992-03-28T00:00:00|2016|33980.45|
|91900104|      Abad Jack C|Misc|1990-07-01T00:00:00|2016|72738.07|
+--------+-----------------+----+-------------------+----+--------+

t3.write.parquet("outputpq2")

[shabigdata@gw03 ~]$ hadoop fs -ls hdfs://nn01.itversity.com:8020/user/shabigdata/outputpq1
Found 5 items
-rw-r--r-- 2 shabigdata hdfs 0 2018-10-18 20:27 hdfs://nn01.itversity.com:8020/user/shabigdata/outputpq1/_SUCCESS
-rw-r--r-- 2 shabigdata hdfs 567 2018-10-18 20:27 hdfs://nn01.itversity.com:8020/user/shabigdata/outputpq1/_common_metadata
-rw-r--r-- 2 shabigdata hdfs 2101 2018-10-18 20:27 hdfs://nn01.itversity.com:8020/user/shabigdata/outputpq1/_metadata
-rw-r--r-- 2 shabigdata hdfs 399821 2018-10-18 20:27 hdfs://nn01.itversity.com:8020/user/shabigdata/outputpq1/part-r-00000-5974cd57-7ec7-40dd-8fd3-5d38ae11cadb.gz.parquet
-rw-r--r-- 2 shabigdata hdfs 399716 2018-10-18 20:27 hdfs://nn01.itversity.com:8020/user/shabigdata/outputpq1/part-r-00001-5974cd57-7ec7-40dd-8fd3-5d38ae11cadb.gz.parquet

When I run this from the PyCharm IDE, I get an error. Please check the attached pycharm_IDE error log, and see the attached documents for my settings. I went through Durga's complete course, but unfortunately I have not been able to get the setup working.
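
For reference, here is a rough sketch of the same job written as a standalone script, the way it would have to look outside the pyspark shell. The shell creates sc and sqlContext automatically; a script run from an IDE has to build them itself. This is not a diagnosis of the attached error, only an illustration under assumptions: the function names, the app name "csv_to_parquet", and the local[*] master are all made up for the example, and SQLContext is used only to mirror the shell session.

```python
import sys


def split_fields(line):
    # Same logic as the shell session: a plain comma split (no quote handling).
    return line.split(",")


def csv_to_parquet(input_path, output_path):
    # Imported inside the function so the file still loads where pyspark is
    # not on the interpreter's path (hypothetical layout, not from the course).
    from pyspark import SparkConf, SparkContext
    from pyspark.sql import SQLContext

    # Assumption: local mode; the app name is an arbitrary label.
    conf = SparkConf().setAppName("csv_to_parquet").setMaster("local[*]")
    sc = SparkContext(conf=conf)
    try:
        sql_context = SQLContext(sc)
        rows = sc.textFile(input_path).map(split_fields)
        df = sql_context.createDataFrame(rows)
        df.write.parquet(output_path)
    finally:
        # Always release the context, even if the write fails.
        sc.stop()


# Runs only when both paths are passed on the command line.
if __name__ == "__main__" and len(sys.argv) == 3:
    csv_to_parquet(sys.argv[1], sys.argv[2])
```

With this layout the script would be started with two arguments, for example data001.txt and outputpq2, either from a PyCharm run configuration or through spark-submit.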