Issues configuring PySpark on Ubuntu



Hello everyone,

I’m trying to configure my environment for Spark and Python on Ubuntu 14.04 Desktop. I already have Spark 1.6.2 running with Python 3.5. My next step is to create Python scripts that use SparkContext so I can play with RDDs and so on.

My problem is basically that I have spent a long time trying to set up PySpark through a broad set of commands in Ubuntu.

However, I always get errors because my VM is not able to reach the server or the package. In addition, I tried to install pip with: sudo apt-get install pip -y, and apparently it worked, but now when I run:

pip install pyspark, I realize this package is only available for Spark 2.0 and later, so I really don’t know how to set up this final part.

I appreciate your support as always, team. Thanks.


Hi @Andres_Angel,

I understand your issue: you’re having trouble setting up a PySpark environment to practice with.

So what I suggest is to go with a pre-configured Spark cluster, for example Databricks (free option), the itversity labs (paid, but the whole cluster is set up for you), or some other option.
Since your main goal is to get good hands-on development practice, it’s better not to get sidetracked by administration activities, as that has its own depth.
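That said, if you still want to use your local Spark 1.6.2, the usual workaround for pre-2.0 versions (which have no PyPI package) is to point Python at the pyspark modules bundled inside the Spark download itself, via environment variables. A rough sketch, assuming Spark is unpacked at /opt/spark-1.6.2 (the path and the py4j zip version are assumptions; check what is actually in your $SPARK_HOME/python/lib):

```shell
# Sketch: make the pyspark that ships with the Spark 1.6.2 download importable,
# instead of installing it from pip. Add these lines to ~/.bashrc.
export SPARK_HOME=/opt/spark-1.6.2            # assumed install path -- adjust
export PYTHONPATH="$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.9-src.zip:$PYTHONPATH"
export PYSPARK_PYTHON=python3                 # use Python 3.5 for workers

# Then either run scripts through spark-submit:
#   $SPARK_HOME/bin/spark-submit my_script.py
# or start python3 and "import pyspark" directly.
```

With this in place, a plain python3 interpreter can create a SparkContext without any pip-installed package, which sidesteps the version mismatch entirely.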