Execute pyspark code on Cluster using PyCharm IDE

pyspark

#1

Hi,
Has anyone used the PyCharm IDE to run PySpark jobs on a remote cluster? If so, could you please share the steps?

Thanks.


#2

@sushantas

If I understand correctly, PyCharm is installed on your desktop and you want to connect from it to the itversity labs or some other remote cluster where Spark is installed. PyCharm calls this feature a remote interpreter. For it to work, the itversity labs would need to support it, and you would need the commercial (Professional) edition of PyCharm, since the Community edition lacks this feature.

A better alternative is Jupyter Notebook, which is widely used in industry. But even for that, the itversity labs would need to have it set up.


#3

@ravi.tejarockon

Hi Ravi,
Yes, that is my requirement. I have PyCharm Professional edition installed on my desktop. I want to write PySpark programs in the IDE and execute them on a Hadoop cluster running in an Ubuntu VM on my machine.

I know the option to deploy to a remote cluster is available only in the Professional edition of PyCharm, but I am facing issues with it: it is not able to locate the Spark installation on the Hadoop cluster in my VM.

It would be a great help if you could share some insights on this.


#4

Sure @sushantas

Please refer to this link:
Configuring Remote Interpreters via SSH
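
If the SSH interpreter connects but a script still cannot find Spark, one thing worth checking is whether SPARK_HOME and the PySpark libraries are visible to the remote Python process, because a process started over SSH may not load the same shell profile as an interactive login. Below is a minimal sketch of setting them from inside the script. The helper names (spark_python_paths, configure_spark, smoke_test) are made up for illustration, and the sketch assumes a standard Spark layout with python/ and python/lib/py4j-*-src.zip under the installation directory.

```python
import glob
import os
import sys


def spark_python_paths(spark_home):
    """Return the entries that must be on sys.path to import pyspark.

    Assumes the usual Spark layout: <spark_home>/python plus a py4j
    source zip under <spark_home>/python/lib.
    """
    python_dir = os.path.join(spark_home, "python")
    py4j_zips = sorted(glob.glob(os.path.join(python_dir, "lib", "py4j-*-src.zip")))
    return [python_dir] + py4j_zips


def configure_spark(spark_home):
    """Export SPARK_HOME and make pyspark importable in this process."""
    os.environ["SPARK_HOME"] = spark_home
    for path in spark_python_paths(spark_home):
        if path not in sys.path:
            sys.path.insert(0, path)


def smoke_test(spark_home, master="local[*]"):
    """Start a session and count a tiny dataset.

    Passing master="yarn" assumes HADOOP_CONF_DIR (or YARN_CONF_DIR)
    is also set for the remote interpreter.
    """
    configure_spark(spark_home)
    from pyspark.sql import SparkSession  # imported only after the paths are set

    spark = SparkSession.builder.master(master).appName("pycharm-remote-test").getOrCreate()
    try:
        return spark.range(5).count()
    finally:
        spark.stop()
```

You would call smoke_test with the actual Spark directory on the VM (for example smoke_test("/usr/local/spark"), where the path is only a placeholder) from a script run with the remote interpreter. The same variables can instead be set under Environment variables in the PyCharm run configuration.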


#5

@ravi.tejarockon

I have already followed these steps. They don’t help much, though.


#6

@sushantas

I can understand the difficulty of VM networking, especially combined with remote interpreters. If this VM is custom built (I mean, if you installed the VM yourself), then you can go with Vagrant. With Vagrant machines you can handle networking very well; with VMware Workstation or VirtualBox, it is going to be tough to configure.

Please go through this playlist from Durga Sir:
Setup Development Environment for Python & Spark


#7

@sushantas,

An update for you: I recently got VM networking between guest and host working using VMware Workstation Pro.

I enabled port mapping in the Virtual Network Editor and can now access the VM's applications from outside the VM.
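
To confirm that a mapped port is actually reachable from the host before pointing PyCharm at it, a quick socket check like the sketch below can help. The function names are illustrative, and the host and port numbers in the usage note are placeholders: use whatever host ports you mapped in the Virtual Network Editor.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


def check_ports(host, ports, timeout=3.0):
    """Map each port number to whether it accepted a connection."""
    return {port: port_is_open(host, port, timeout) for port in ports}
```

For example, check_ports("127.0.0.1", [2222, 8088, 4040]) would test a hypothetical mapping of host port 2222 to the guest's SSH port 22, plus the default ports of the YARN ResourceManager web UI (8088) and the Spark application UI (4040). The SSH port is the one the PyCharm remote interpreter needs.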

Let me know if you’re using VMware Workstation Pro or some other version.


#8

Hi Ravi,
That’s great. Could you please share the steps? I am using VMware Workstation Pro 14.1.


#9

@sushantas,

Please refer to the URL below for the settings:
https://www.vmware.com/support/ws5/doc/ws_net_nat_advanced.html