Setting Up Winutils for HDFS and Windows File System Integration

pyspark

#1

Having launched pyspark successfully, I got the following response when trying to validate it:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'SparkContext' object has no attribute 'txtFile'

Where could I have gone wrong? Kindly assist.


#2

Hi @Chris

Can you share the command you tried?


#3

sc.txtFile("C:\deckofcards.txt").first()

I am also getting an error from a PySpark application in PyCharm
after running this code:
from pyspark import SparkConf, SparkContext

sc = SparkContext(master="local", appName="Spark Demo")
print(sc.txtFile("C:\deckofcards.txt").first())

output:
  File "C:\Users\CHRIS\PycharmProjects\gettingstarted\venv\lib\os.py", line 425, in __getitem__
    return self.data[key.upper()]
KeyError: 'SPARK_HOME'

Kindly assist.
Thank you


#4

I am also stuck on creating the

c:/hadoop/bin directory on C: for winutils…

How do I go about the forward slash, please?


#5

Hi @Chris

To read a text file in Spark, the method is textFile, not txtFile. Change your print call as below and let us know:

print(sc.textFile("C:\deckofcards.txt").first())
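
On the forward-slash question and the path in the call above: as far as I know, Python on Windows accepts forward slashes in paths, and a raw string (r"...") keeps backslashes from being read as escape sequences. Here is a minimal sketch using only the standard library; the helper name make_dir is just for illustration, and c:/hadoop/bin is the winutils folder mentioned earlier in the thread.

```python
import os
from pathlib import PureWindowsPath

# Forward slashes and backslashes name the same Windows path.
forward = PureWindowsPath("c:/hadoop/bin")
backward = PureWindowsPath(r"c:\hadoop\bin")  # raw string: backslashes kept literally
same_path = (forward == backward)

# str() renders the path with backslashes, the way Windows tools display it.
as_windows = str(forward)


def make_dir(path):
    """Create the directory and any missing parents; do nothing if it exists."""
    os.makedirs(path, exist_ok=True)
    return os.path.isdir(path)
```

On Windows, calling make_dir("c:/hadoop/bin") should create the folder that winutils.exe is then copied into, so either slash style can be used.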

For the PyCharm error, please follow this blog post to set up the environment: Setup Spark Development Environment – PyCharm and Python
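
The KeyError: 'SPARK_HOME' suggests that this environment variable is not set for the PyCharm run. Until the setup from the blog is in place, one possible workaround is to set it from the script before the SparkContext is created. This is only a sketch under assumptions, not the blog's method: C:\spark is a hypothetical Spark install folder (use your own), C:\hadoop is assumed to be the folder containing bin\winutils.exe, and ensure_env and first_line are illustrative names.

```python
import os


def ensure_env(name, default, env=os.environ):
    """Set env[name] to default only when it is missing; return the value in effect."""
    return env.setdefault(name, default)


def first_line(path):
    """Read the first line of a text file with Spark."""
    # Assumed locations; adjust both to your machine.
    ensure_env("SPARK_HOME", r"C:\spark")    # hypothetical Spark install folder
    ensure_env("HADOOP_HOME", r"C:\hadoop")  # assumed parent of bin/winutils.exe
    # Imported here so the variables are set before PySpark starts up.
    from pyspark import SparkContext
    sc = SparkContext(master="local", appName="Spark Demo")
    try:
        return sc.textFile(path).first()
    finally:
        sc.stop()
```

With that in place, print(first_line(r"C:\deckofcards.txt")) would replace the last two lines of the original script. Setting the same variables in the PyCharm run configuration or in the Windows environment settings is the more permanent option.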

Regards,
Sunil Abhishek


#6

Thank you Sunil, it worked perfectly and the output was what I expected.
I still await your response to the PyCharm request, please.

Concerning the Ubuntu installation: I ran it as administrator, but the same error message continues. It was initially running after launch but later changed to the red error message.
Kindly assist, please.