DataFrame from user generated data

#1

Hello Sir, I have data like the following:
1,"Ram"
2,"Raju"
3,"Santosh"
Simply put, I have data for multiple students.
I want to make a DataFrame out of this. This is dummy data, not read from any file.
I can't use a case class.


#2

Why can't you use a case class?


#3

You can use the parallelize method of the SparkContext (sc) to create an RDD. In PySpark:

from pyspark.sql.types import StructType, StructField, IntegerType, StringType

personsRDD = sc.parallelize([[1, "Ram"], [2, "Raju"], [3, "Santosh"]])

Create a schema using StructType:

schema = StructType([StructField("id", IntegerType(), True), StructField("name", StringType(), True)])

Create a DataFrame from personsRDD using the schema:

personsDataFrame = sqlContext.createDataFrame(personsRDD, schema)
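
If a SparkSession is available instead of a bare sqlContext, the RDD step can probably be skipped, since createDataFrame also accepts a local list of rows and a schema given as a DDL string. The following is only a minimal sketch under that assumption: `spark` is assumed to be an existing SparkSession, and `PERSONS`, `FIELDS`, `ddl_schema` and `persons_dataframe` are hypothetical names made up for illustration.

```python
# Sketch only: `spark` is assumed to be an existing SparkSession.
# PERSONS, FIELDS, ddl_schema and persons_dataframe are hypothetical names.

PERSONS = [(1, "Ram"), (2, "Raju"), (3, "Santosh")]
FIELDS = [("id", "int"), ("name", "string")]


def ddl_schema(fields):
    """Join (name, type) pairs into a DDL schema string, e.g. 'id int, name string'."""
    return ", ".join(f"{name} {dtype}" for name, dtype in fields)


def persons_dataframe(spark, rows=PERSONS, fields=FIELDS):
    """Build a DataFrame from in-memory rows; no RDD or case class is involved."""
    return spark.createDataFrame(rows, schema=ddl_schema(fields))
```

With a running session, `persons_dataframe(spark).show()` should print the three students. Because the schema is an ordinary string built from a list, fields can be added or removed without touching any class definition.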


#4

A case class is limited to 22 fields. I have to add the field data in a loop, so building a case class dynamically is not possible.

Ex:

for (i <- 0 until fieldSize) {
  rowList.add(field) // dynamic fields are added to each row; fieldSize is dynamic
}

// each row is then collected into a collection, e.g. a collection of multiple students
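
One possible way around the 22-field limit is to avoid case classes entirely and generate the StructType from the field count at runtime. The sketch below is in Python, to match the PySpark reply in #3. It rests on several assumptions: `spark` is an existing SparkSession, every value is stored as a string, missing values are padded with None, and the names `field_names`, `build_rows`, `to_dataframe` and the `field_<i>` column naming are all hypothetical.

```python
# Sketch only: all helper names and the field_<i> column naming are hypothetical.

def field_names(field_size):
    """Generate one column name per field: field_0, field_1, ..."""
    return [f"field_{i}" for i in range(field_size)]


def build_rows(records, field_size):
    """Build each row in a loop, mirroring the per-field iteration above."""
    rows = []
    for record in records:
        row = []
        for i in range(field_size):
            # Assumption: store every value as a string and pad short records with None.
            row.append(str(record[i]) if i < len(record) else None)
        rows.append(tuple(row))
    return rows


def to_dataframe(spark, records, field_size):
    """Create a DataFrame whose schema is generated from field_size at runtime."""
    # Imported here so the helpers above work without pyspark installed.
    from pyspark.sql.types import StructType, StructField, StringType

    schema = StructType(
        [StructField(name, StringType(), True) for name in field_names(field_size)]
    )
    return spark.createDataFrame(build_rows(records, field_size), schema)
```

Since the schema is just a list of StructField objects, it can hold more than 22 columns, and `field_size` can differ from one run to the next.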
