How to read data from the file system (CSV, JSON, etc.) into a DataFrame


#1

Most of what I have found by googling is about loading data from HDFS into a DataFrame. I wonder what the best practice is for loading data from the local file system.

For an RDD, I know I can do it this way:
val productsRaw = scala.io.Source.fromFile("/data/retail_db/products/part-00000")

For a DataFrame, how should I write the code?

Thank you very much.


#2

@paslechoix:

After the step above, create an RDD from the lines you read, then convert that RDD into a DataFrame using .toDF().
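A rough sketch of that approach, assuming a Spark 1.6 spark-shell where `sc` and `sqlContext` already exist. The column names and the comma-separated layout are assumptions for illustration, not something confirmed in this thread. Note that scala.io.Source reads the file on the driver only, so this suits files small enough to fit in driver memory.

```scala
// Sketch for spark-shell (Spark 1.6 style); `sc` and `sqlContext` are provided by the shell.
import sqlContext.implicits._ // brings .toDF() into scope

// Read the local file on the driver and collect its lines.
val source = scala.io.Source.fromFile("/data/retail_db/products/part-00000")
val lines = try source.getLines().toList finally source.close()

// Distribute the lines as an RDD[String].
val productsRDD = sc.parallelize(lines)

// Simplest conversion: one string column per line ("line" is a name chosen here).
val productsDF = productsRDD.toDF("line")

// ASSUMPTION: lines are comma-separated and the first field is an integer id.
// Adjust the parsing and the (hypothetical) column names to the real layout.
val parsedDF = productsRDD
  .map(_.split(","))
  .map(fields => (fields(0).toInt, fields(1)))
  .toDF("id", "second_field")

parsedDF.printSchema()
```

For larger files it may be better to skip scala.io.Source and let Spark read the file itself, for example with sc.textFile and a file:// path.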
Thanks
Venkat


#3

@paslechoix
You can read a text file directly to create a DataFrame:

sqlContext.read.text(path)  // path: an HDFS path, or a file:// URI for a local file
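Since the question was about the local file system rather than HDFS, here is a hedged sketch of the same reader with a file:// path. It assumes a Spark 1.6 spark-shell where `sqlContext` already exists. The JSON path is hypothetical, and on a cluster the local file has to be present at the same path on every worker node.

```scala
// Sketch for spark-shell (Spark 1.6 style); `sqlContext` is provided by the shell.

// The file:// scheme tells Spark to use the local file system
// instead of the cluster's default file system (often HDFS).
val productsText = sqlContext.read.text("file:///data/retail_db/products/part-00000")
productsText.printSchema() // a single string column named "value"

// JSON: expects one JSON object per line. The path below is hypothetical.
val ordersJson = sqlContext.read.json("file:///path/to/orders.json")
ordersJson.printSchema()

// CSV: on Spark 2.x and later the reader is built in
// (assumes a SparkSession named `spark`, as in the 2.x shell):
// val productsCsv = spark.read.option("inferSchema", "true").csv("file:///data/retail_db/products")
//
// On Spark 1.6, CSV needs the external spark-csv package on the classpath:
// val productsCsv = sqlContext.read.format("com.databricks.spark.csv").load("file:///data/retail_db/products")
```

Reading through the DataFrameReader keeps the work distributed, so it avoids pulling the whole file into driver memory first.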