Creating a DataFrame from an Existing RDD

pyspark

#1

Hi All,
I am trying to create a DataFrame from an RDD (lineeven is an RDD).
lineeven.take(2)
[[u'1,1,957,1,299.98,299.98'], [u'2,2,1073,1,199.99,199.99']]

cSchema = StructType([StructField("order_id", StringType()),
    StructField("order_number", IntegerType()),
    StructField("order_item_id", IntegerType()),
    StructField("order_num", IntegerType()),
    StructField("order_quantity", IntegerType()),
    StructField("order_quality", IntegerType())])

linedf = spark.createDataFrame(lineeven, "cSchema")

Is this the correct way to do it?


#2

@Nishant_varma Can you share the error as well?
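
In the meantime, here is a rough sketch of one approach that may work, based only on the sample shown in the question. Two things stand out: each element of lineeven is a one-item list holding a single comma-separated string, so it probably needs to be split into six typed values first, and createDataFrame should receive the schema object itself rather than the string "cSchema". The helper names parse_line and build_dataframe are made up for illustration, and using DoubleType for the last two columns is an assumption based on values like 299.98.

```python
def parse_line(record):
    """Turn [u'1,1,957,1,299.98,299.98'] into a tuple of typed values."""
    # Each RDD element is a one-item list holding a comma-separated string.
    fields = record[0].split(",")
    # Assumption: order_id stays a string (as in the posted schema), the next
    # three values are integers, and the last two are decimals.
    return (fields[0], int(fields[1]), int(fields[2]), int(fields[3]),
            float(fields[4]), float(fields[5]))


def build_dataframe(spark, rdd):
    """Create a DataFrame from an RDD shaped like lineeven."""
    # Imported here so parse_line can be tried without a Spark installation.
    from pyspark.sql.types import (StructType, StructField, StringType,
                                   IntegerType, DoubleType)

    schema = StructType([
        StructField("order_id", StringType()),
        StructField("order_number", IntegerType()),
        StructField("order_item_id", IntegerType()),
        StructField("order_num", IntegerType()),
        StructField("order_quantity", DoubleType()),  # assumed decimal
        StructField("order_quality", DoubleType()),   # assumed decimal
    ])
    # Pass the schema object, not its name in quotes.
    return spark.createDataFrame(rdd.map(parse_line), schema)
```

With an existing SparkSession named spark, this could be called as linedf = build_dataframe(spark, lineeven), followed by linedf.printSchema() to check the column types.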