H1b data without a header


I am working with the H1B data and I need to remove the header and then save the remaining data on HDFS using Snappy compression.

I am having difficulty saving the data on HDFS with Snappy compression using Scala.

I know that in PySpark, the solution would be:

h1b_data.saveAsTextFile('/user/por160893/problem9/solution/', compressionCodecClass='org.apache.hadoop.io.compress.SnappyCodec')

However, in Scala, I do not know how to save a text file with Snappy compression. Sorry if this is a stupid question.

Can you please help me with this issue?

Many thanks,
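
For comparison, here is a minimal Scala sketch of what the equivalent might look like. It relies on the RDD overload saveAsTextFile(path, codec), which takes the codec as a class rather than a string. The object name, the dropHeader helper, and the input path are hypothetical and not from my actual job. It also assumes the header is the first line of the first partition, which should hold when reading a single file, and that the job is launched with spark-submit (in spark-shell, sc already exists and should be reused).

```scala
import org.apache.hadoop.io.compress.SnappyCodec
import org.apache.spark.{SparkConf, SparkContext}

object H1bSnappy {
  // Hypothetical helper: drops the first line of partition 0 only.
  // Assumes the header is the first line of the first partition.
  def dropHeader(index: Int, lines: Iterator[String]): Iterator[String] =
    if (index == 0) lines.drop(1) else lines

  def main(args: Array[String]): Unit = {
    // Master URL is expected to come from spark-submit.
    val sc = new SparkContext(new SparkConf().setAppName("h1b-snappy"))

    // Hypothetical input location; replace with the real H1B file path.
    val inputPath = "/path/to/h1b_data"

    val h1bData = sc.textFile(inputPath)
      .mapPartitionsWithIndex((idx, it) => dropHeader(idx, it))

    // The Scala API takes the codec class itself, not its name as a string.
    h1bData.saveAsTextFile("/user/por160893/problem9/solution/", classOf[SnappyCodec])

    sc.stop()
  }
}
```

The output directory must not already exist, otherwise the save fails, and the Snappy native libraries need to be available on the cluster.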