H1b data without a header

#1

Hello,

I am working on the H1B data and need to remove the header, then save the remaining data to HDFS with Snappy compression.

I am having difficulty saving the data to HDFS with Snappy compression using Scala.

I know in PySpark, the solution would be:

h1b_data.saveAsTextFile('/user/por160893/problem9/solution/', 'org.apache.hadoop.io.compress.SnappyCodec')

However, in Scala I do not know how to save a text file with Snappy compression. Sorry if this is a stupid question.
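For reference, here is a rough sketch of what I would expect the Scala version to look like, based on the RDD `saveAsTextFile(path, codec)` overload that takes a codec class. The input path, the app name, and the assumption that the header is the first line of the first partition are all placeholders of mine, and I have not confirmed this on the cluster:

```scala
import org.apache.hadoop.io.compress.SnappyCodec
import org.apache.spark.{SparkConf, SparkContext}

object SaveH1bWithoutHeader {
  def main(args: Array[String]): Unit = {
    // App name is a placeholder; in spark-shell, reuse the existing `sc` instead.
    val sc = new SparkContext(new SparkConf().setAppName("h1b-no-header"))

    // Hypothetical input path: replace with the real location of the H1B data.
    val raw = sc.textFile("/path/to/h1b_data")

    // Assumption: the header is the first line of the first partition.
    val h1bData = raw.mapPartitionsWithIndex { (idx, lines) =>
      if (idx == 0) lines.drop(1) else lines
    }

    // Scala takes the codec as a Class object rather than a class-name string.
    h1bData.saveAsTextFile("/user/por160893/problem9/solution/", classOf[SnappyCodec])

    sc.stop()
  }
}
```

The main difference from PySpark, as far as I can tell, is that the codec is passed as `classOf[SnappyCodec]` instead of a string.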

Can you please help me with this issue?

Many thanks,
Patrick
