Reading data from a JSON file in PySpark: issue

Hi Team,

Why can't we use the first way of printing the data from the JSON file? Why do we need to register a temporary table if we can get the same output directly?

dep=sqlcontext.jsonFile("/user/adityagtiwari92/department.json")
for i in dep.collect():
    print(i)

Row(department_id=1, department_name=u'PCM')
Row(department_id=2, department_name=u'PCB')
Row(department_id=3, department_name=u'Commerce')
Row(department_id=4, department_name=u'Science')
Row(department_id=5, department_name=u'Arts')

dep.registerTempTable("depTempTable")
dept=sqlcontext.sql("select * from depTempTable")
for i in dept.collect():
    print(i)

Row(department_id=1, department_name=u'PCM')
Row(department_id=2, department_name=u'PCB')
Row(department_id=3, department_name=u'Commerce')
Row(department_id=4, department_name=u'Science')
Row(department_id=5, department_name=u'Arts')

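You can use the first way. If all you need is to print every row, no temp table is required. Registering the DataFrame as a temp table only gives it a name that SQL text can refer to. After that, you can query the dataset with traditional SQL and apply further transformations and actions to get the results the problem statement asks for.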