AttributeError: 'PipelinedRDD' object has no attribute 'toDf'



This is my code:
from pyspark.sql import Row
orders = sc.textFile("/public/retail_db/orders").map(lambda x:(Row(order_id=(int(x.split(",")[0])),order_date=(x.split(",")[1]),order_customer_id=(int(x.split(",")[2])),order_status=(x.split(",")[3])))).toDf()


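There is a typo in your code: the method is toDF(), not toDf(). Python attribute names are case-sensitive, so toDf is not found on the RDD and you get the AttributeError.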
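
For reference, here is a minimal sketch of the same load with the method name corrected. The helper names parse_order and load_orders and the COLUMNS list are my own, not from the original post, and I am assuming the file is plain comma-separated text with four fields per line. It also assumes a SparkSession or SQLContext already exists (as it does in the pyspark shell), since that is what makes toDF() available on RDDs.

```python
# Column names taken from the Row fields in the question.
COLUMNS = ["order_id", "order_date", "order_customer_id", "order_status"]


def parse_order(line):
    """Split one CSV line once and cast the two id fields to int."""
    fields = line.split(",")
    return (int(fields[0]), fields[1], int(fields[2]), fields[3])


def load_orders(sc, path="/public/retail_db/orders"):
    """Read the orders file and convert the parsed RDD to a DataFrame.

    Assumes `sc` is an active SparkContext and that a SparkSession or
    SQLContext has already been created.
    """
    # Note the capitalisation: toDF, not toDf.
    return sc.textFile(path).map(parse_order).toDF(COLUMNS)
```

In the pyspark shell you would call orders = load_orders(sc) and then orders.show() to check the result. Splitting the line once inside a named function also avoids calling x.split(",") four times per record.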


Thanks for pointing it out.