Reading data from the local file system in Spark

Generally we load data into Spark from the HDFS file system, but we can also load data from the local file system. To do this:

(i) First, launch the Spark shell in local mode with the command below.
For Spark with Scala:

spark2-shell --master local --conf spark.ui.port=0

For Spark with Python:

pyspark2 --master local --conf spark.ui.port=0
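
Once the shell is up, you can confirm it is really running in local mode by checking the SparkContext's master setting. A minimal check from the pyspark2 shell (the value "local" comes from the --master option above):

# verify the session was started in local mode
print(spark.sparkContext.master)   # prints: local
print(spark.version)               # prints the Spark 2.x version in use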

(ii) After this, when passing the data path, use the file:/// prefix to point to the local file system, like below:

df = spark.read.csv("file:///home/username/.../")
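
For example, assuming a CSV file with a header row at a hypothetical local path such as /home/username/data/sales.csv, a complete read could look like this sketch:

# read a local CSV file; header and inferSchema are optional but commonly useful
df = spark.read.csv("file:///home/username/data/sales.csv", header=True, inferSchema=True)
df.printSchema()   # inspect the inferred column names and types
df.show(5)         # preview the first few rows

With --master local everything runs on a single machine, so the local path is always reachable; on a multi-node cluster, a file:/// path works only if the file exists at the same location on every worker node.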
