Pyspark - saveAsTextFile


Hi, I was trying to save a file back to HDFS after reading it in Spark.
When I view the output after saving it, I can see that it has been split into several files.
Can anyone help me understand why it is saved as many files?

In Durga sir's YouTube videos, only two files were created: _SUCCESS and part-*.
For me, I am getting the output below:

@Janaki_K - Could you please paste the commands you used?

dataRDD = sc.textFile("/user/kjanakijanu/sqoop_import/departments")
for line in dataRDD.collect():



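You might have used -m n or --num-mappers n in your Sqoop import, which would have created n part files in the departments directory. When Spark reads that directory it gets at least one partition per input file, and saveAsTextFile writes one part file per partition, so the saved output ends up split the same way.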
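If it helps, here is a minimal sketch of how you could check the partition count and control how many part files get written. It relies on the fact that saveAsTextFile writes one part file per RDD partition. The helper name save_with_part_count and its num_files parameter are made up for illustration; getNumPartitions, coalesce and saveAsTextFile are the standard RDD methods.

```python
def save_with_part_count(rdd, output_path, num_files=None):
    """Save an RDD as text and report partition counts before and after.

    saveAsTextFile writes one part-* file per partition, so the partition
    count at save time decides how many files appear in the output directory.
    (Hypothetical helper, not part of the PySpark API.)
    """
    before = rdd.getNumPartitions()
    if num_files is not None:
        # coalesce merges partitions without a full shuffle; it can only
        # reduce the partition count, never increase it.
        rdd = rdd.coalesce(num_files)
    rdd.saveAsTextFile(output_path)
    return before, rdd.getNumPartitions()
```

In the pyspark shell, where sc already exists, you could call it on the RDD from the question as save_with_part_count(dataRDD, output_path, num_files=1), with output_path being any HDFS directory that does not exist yet. With num_files=1 you should get a single part file next to _SUCCESS. Keep in mind that coalescing to one partition pushes all the data through a single task, so it is only sensible for small datasets like this one.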


Yes, now I understand. Thank you!