Spark_demo program

pyspark

#1

In my first spark_demo program I am getting the error “WindowsError: [Error 2] The system cannot find the file specified”. Please suggest how to resolve this issue.


#2

Can you paste the entire error log as text? Can you also open Windows File Explorer and check whether the deckofcards.txt file exists on the C drive?

Please share the screenshot as well.

@BaLu_SaI please help Sudhanshu and bring the issue to closure.


#3

Good Morning,

Please find the details.

from pyspark import SparkConf, SparkContext
sc = SparkContext(master="local", appName="spark_demo")
print(sc.textFile(r"c:\deckofcards.txt").first())

Note: the file has all the required permissions. I also tried changing the file location, but I still get the same error.

Error log:

C:\Users\hp-pc\PycharmProjects\Spark_Project\venv\Scripts\python.exe C:/Users/hp-pc/PycharmProjects/Spark_Project/spark_demo.py
Traceback (most recent call last):
  File "C:/Users/hp-pc/PycharmProjects/Spark_Project/spark_demo.py", line 2, in <module>
    sc = SparkContext(master="local", appName="spark_demo")
  File "C:\spark-1.6.3-bin-hadoop2.6\python\pyspark\context.py", line 112, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway)
  File "C:\spark-1.6.3-bin-hadoop2.6\python\pyspark\context.py", line 245, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway()
  File "C:\spark-1.6.3-bin-hadoop2.6\python\pyspark\java_gateway.py", line 79, in launch_gateway
    proc = Popen(command, stdin=PIPE, env=env)
  File "C:\Python27\Lib\subprocess.py", line 390, in __init__
    errread, errwrite)
  File "C:\Python27\Lib\subprocess.py", line 640, in _execute_child
    startupinfo)
WindowsError: [Error 2] The system cannot find the file specified


#4

Where am I making a mistake?


#5

Your environment is not set up correctly. Please follow these instructions to set it up.

https://kaizen.itversity.com/setup-spark-development-environment-pycharm-and-python/
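
For anyone reading along: the traceback ends inside launch_gateway, where PySpark starts a helper process with Popen. In Spark 1.6 that process is the spark-submit script found under SPARK_HOME, so a wrong SPARK_HOME in the PyCharm run configuration is one plausible cause. That is an assumption, not something confirmed in this thread. The sketch below uses a hypothetical helper, check_spark_env, to list obviously broken settings. The set of variables it checks (SPARK_HOME, JAVA_HOME, HADOOP_HOME) is also an assumption, based on a typical Windows setup that uses winutils.

```python
import os


def check_spark_env(environ=None):
    """Return a list of readable problems found in Spark-related settings.

    `environ` defaults to os.environ; passing a dict makes the check testable.
    The variable names checked here are an assumption about the setup.
    """
    environ = os.environ if environ is None else environ
    problems = []

    # Each variable should be set and should point to an existing directory.
    for var in ("SPARK_HOME", "JAVA_HOME", "HADOOP_HOME"):
        value = environ.get(var)
        if not value:
            problems.append("%s is not set" % var)
        elif not os.path.isdir(value):
            problems.append("%s points to a missing directory: %s" % (var, value))

    # The launcher script is what Popen tries to start on Spark 1.6.
    spark_home = environ.get("SPARK_HOME")
    if spark_home and os.path.isdir(spark_home):
        script = "spark-submit.cmd" if os.name == "nt" else "spark-submit"
        launcher = os.path.join(spark_home, "bin", script)
        if not os.path.isfile(launcher):
            problems.append("launcher script not found: %s" % launcher)

    return problems


if __name__ == "__main__":
    for line in check_spark_env() or ["No obvious problems found"]:
        print(line)
```

Running this from inside PyCharm, with the same run configuration as the failing program, shows what the IDE process actually sees. That can differ from what the console sees.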


#6

I have installed all the components (Python, Java, PyCharm, Spark and winutils) as per the given instructions and also edited the system variables. On the console I am able to read the file through the Spark context, but when I run the same code in PyCharm it throws the error shown below: “WindowsError: [Error 2] The system cannot find the file specified”. I believe the path is correct, because I wrote a program that renames files in the same folder using the same style of path, and it works fine.
I have been given a one-month free subscription for lab access, but I am not able to access the data-master folder shown in the video.
C:\Users\hp-pc\PycharmProjects\gettingstarted\venv\Scripts\python.exe C:/Users/hp-pc/PycharmProjects/gettingstarted/sparkDemo.py
Traceback (most recent call last):
  File "C:/Users/hp-pc/PycharmProjects/gettingstarted/sparkDemo.py", line 2, in <module>
    sc = SparkContext(master="local", appName="Spark_demo")
  File "C:\spark-1.6.3-bin-hadoop2.6\python\pyspark\context.py", line 112, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway)
  File "C:\spark-1.6.3-bin-hadoop2.6\python\pyspark\context.py", line 245, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway()
  File "C:\spark-1.6.3-bin-hadoop2.6\python\pyspark\java_gateway.py", line 79, in launch_gateway
    proc = Popen(command, stdin=PIPE, env=env)
  File "C:\Python27\Lib\subprocess.py", line 390, in __init__
    errread, errwrite)
  File "C:\Python27\Lib\subprocess.py", line 640, in _execute_child
    startupinfo)
WindowsError: [Error 2] The system cannot find the file specified

Process finished with exit code 1
In the PyCharm IDE:

from pyspark import SparkConf, SparkContext
sc = SparkContext(master="local", appName="Spark_demo")
filePath = r"D:\Sudhanshu\Big_data\Udemy\deckofcards.txt"
print(sc.textFile(filePath).first())


#7

The same code ran successfully on the console:
sc.textFile(r"D:\Sudhanshu\Big_data\Udemy\deckofcards.txt").first()
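
A side note on the path itself, as a minimal sketch: backslashes in an ordinary Python string literal can be read as escape sequences. This particular path happens to work on Python 2.7, but a raw string or forward slashes avoids the question entirely. The path below is the one from this thread. The `risky` value is a made-up example for illustration only.

```python
import ntpath  # Windows path rules, importable on any platform

# A raw string keeps every backslash literally.
raw_path = r"D:\Sudhanshu\Big_data\Udemy\deckofcards.txt"

# Forward slashes need no escaping at all.
slash_path = "D:/Sudhanshu/Big_data/Udemy/deckofcards.txt"

# Under Windows path rules both spellings normalize to the same path.
same_path = ntpath.normpath(slash_path) == raw_path

# Hypothetical pitfall: "\t" and "\n" here become a tab and a newline.
risky = "C:\temp\new.txt"
has_hidden_escapes = "\t" in risky and "\n" in risky

print(same_path, has_hidden_escapes)
```

Since the error in this thread is raised while creating the SparkContext, before textFile is ever called, the file path is probably not the cause. The habit is still worth keeping.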


#8

Screenshot of the settings in PyCharm


#9


#10

@vinodnerella or @BaLu_SaI, please set up a screen share and try to resolve the issue.


#11

@sudhanshu_shekhar
Can you share your TeamViewer ID, or can you connect with us on Zoom?


#12

I have already shared all the screenshots of the error and the settings (Python and Spark). I do not know my TeamViewer ID; please let me know how to find it. As for connecting over Zoom, please send me the meeting link and I will join at the scheduled time. (Monday to Friday I am not available between 11:30 AM and 12 PM.)
Please let me know the time for the screen share to resolve the spark demo program issue.


#13

@sudhanshu_shekhar
Install the pyspark package and then run the program.
Go to File -> Settings -> Project Interpreter.
Click the install button and search for PySpark.
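
After installing, it can help to confirm which interpreter PyCharm is running and which copy of pyspark it imports. The traceback earlier in the thread shows pyspark being loaded from C:\spark-1.6.3-bin-hadoop2.6\python, and a copy installed through the Project Interpreter dialog may be a different one. A minimal sketch, where describe_module is a hypothetical helper name:

```python
import importlib
import sys


def describe_module(name):
    """Return the file a module is loaded from, or None if it cannot be imported."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return getattr(module, "__file__", None)


if __name__ == "__main__":
    # The interpreter actually running this script (venv or system Python).
    print("interpreter: %s" % sys.executable)
    # Where pyspark comes from, if it is importable at all.
    print("pyspark:     %s" % describe_module("pyspark"))
```

If the printed pyspark location is not the one you expect, the run configuration and the project interpreter are probably not using the same installation.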


#14

@Sunil_Itversity, the PySpark package has been installed in the PyCharm IDE, but I still have the same issue.


#15

@Sunil_Itversity, please send me the Zoom meeting link and I will join at the scheduled time. (Monday to Friday I am not available between 11:30 AM and 12 PM.)


#16

Hi @sudhanshu_shekhar, you can connect with me on Zoom now. Here is the ID: 971 175 474


#17

@Sunil_Itversity, I request you to please set up a Zoom meeting for today.


#18