Hello Sir, I have data like below:
1,"Ram"
2,"Raju"
3,"Santosh"
Simply put, I have data for multiple students.
I want to make a DataFrame out of this. This is dummy data, not read from any file.
I can't use a case class.
Why can't you use a case class?
You can use the parallelize method to create an RDD.
personsRDD = sc.parallelize([[1, "Ram"], [2, "Raju"], [3, "Santosh"]])  # sc is the SparkContext
Create a schema using StructType:
from pyspark.sql.types import StructType, StructField, IntegerType, StringType
schema = StructType([StructField("id", IntegerType(), True), StructField("name", StringType(), True)])
Create a DataFrame from personsRDD using the schema:
personsDataFrame = sqlContext.createDataFrame(personsRDD, schema)
A case class is limited to 22 fields, and I have to add field data in a loop, so a dynamic case class is not possible.
Ex:
for (i <- 0 until fieldSize) {
  rowList.add(field) // dynamic fields are added to each row; fieldSize is dynamic
}
// then each row is collected in a collection, e.g. a collection of multiple students
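If the number of fields is only known at run time, one option is to build the schema in a loop as well, the same way the rows are built, so no case class is needed. Below is a minimal sketch in PySpark (matching the answer above), not a definitive implementation. The helper names, the field_0, field_1, ... naming scheme, and the choice to type every column as a nullable string are assumptions made for illustration; the context passed to to_dataframe is assumed to be an existing SQLContext (or SparkSession).

```python
def build_field_names(field_size):
    # Hypothetical naming scheme: field_0, field_1, ...
    return ["field_{}".format(i) for i in range(field_size)]


def build_rows(records, field_size):
    # Each record becomes a plain list with exactly field_size values.
    # Missing values are padded with None; extra values are dropped.
    rows = []
    for record in records:
        row = []
        for i in range(field_size):
            row.append(record[i] if i < len(record) else None)
        rows.append(row)
    return rows


def to_dataframe(sql_context, field_names, rows):
    # pyspark is imported here so the helpers above work without it.
    from pyspark.sql.types import StructType, StructField, StringType

    # Assumption: every column is a nullable string, to keep the sketch simple.
    schema = StructType(
        [StructField(name, StringType(), True) for name in field_names]
    )
    # Convert values to strings so they match the declared column type.
    string_rows = [
        [None if value is None else str(value) for value in row]
        for row in rows
    ]
    return sql_context.createDataFrame(string_rows, schema)


students = [[1, "Ram"], [2, "Raju"], [3, "Santosh"]]
field_size = 2  # in the real job this would be computed at run time
field_names = build_field_names(field_size)
rows = build_rows(students, field_size)
```

With a live context, to_dataframe(sqlContext, field_names, rows) should return the DataFrame. If the real column types are known, the StringType() in the schema loop can be replaced per field.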