Saving Text File in snappy compression


Is there any way to save data as a text file using Snappy compression? It fails when I try to save an RDD.



Try this…


This is wrong.
The correct syntax is as follows (but that is also not working):


It is working for me.

What error are you getting? Can you provide the details?


Are you using Scala or Python?


I am using Scala @dgadiraju


This is what I am getting when I try your piece of code.
:28: error: overloaded method value saveAsTextFile with alternatives:
(path: String,codec: Class[_ <:])Unit
(path: String)Unit
cannot be applied to (String, compressionCodecClass: String)>x.mkString(",")).saveAsTextFile("/user/cloudera/text-snappy",compressionCodecClass=“”)


@dgadiraju @connectsachit
With Scala, compressionCodecClass gives the error that Sachit mentioned, so I used classOf[] and it is working.
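A minimal sketch of the classOf approach, for anyone landing here later. It assumes the Snappy codec and its native library are available on the cluster. The sample rows and the output path are hypothetical, and `getOrCreate` simply reuses the `sc` that spark-shell already provides.

```scala
import org.apache.hadoop.io.compress.SnappyCodec
import org.apache.spark.{SparkConf, SparkContext}

// Reuses the existing context in spark-shell; falls back to local mode otherwise.
val conf = new SparkConf()
  .setAppName("snappy-text-save")
  .setIfMissing("spark.master", "local[*]")
val sc = SparkContext.getOrCreate(conf)

// Hypothetical sample data; replace with your own RDD.
val rows = sc.parallelize(Seq((1, "alpha"), (2, "beta")))

// Hypothetical output path. saveAsTextFile fails if the directory already exists,
// so a timestamp is appended here.
val outputPath = s"/tmp/text-snappy-example-${System.currentTimeMillis}"

// In Scala the second argument is a Class, not a String,
// which is why compressionCodecClass = "..." does not compile.
rows
  .map { case (id, name) => s"$id,$name" }
  .saveAsTextFile(outputPath, classOf[SnappyCodec])
```

The `compressionCodecClass` keyword with a string value belongs to the PySpark API; the Scala overload takes the codec class itself.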


Below is the output.
With Snappy

-rw-r--r-- 3 sarangdp1 hdfs 0 2018-04-30 07:21 snappyOp3/_SUCCESS
-rw-r--r-- 3 sarangdp1 hdfs 145212 2018-04-30 07:21 snappyOp3/part-00000.snappy
-rw-r--r-- 3 sarangdp1 hdfs 150121 2018-04-30 07:21 snappyOp3/part-00001.snappy

Without Snappy (normal save)
-rw-r--r-- 3 sarangdp1 hdfs 0 2018-04-30 07:23 normal/_SUCCESS
-rw-r--r-- 3 sarangdp1 hdfs 476627 2018-04-30 07:23 normal/part-00000
-rw-r--r-- 3 sarangdp1 hdfs 483957 2018-04-30 07:23 normal/part-00001


@sarang, thank you for responding to the question. Let us build a decent itversity community.


I have also tried the same code, but it shows an error for me:


Sachit, it looks like you are not running the code on the itversity labs. Your environment doesn't have the snappy jar set in the classpath.

If you are on an HDP cluster, check $HADOOP_HOME/lib. It should have the snappy jar and the $HADOOP_HOME/lib/native “”

  1. LD_LIBRARY_PATH and JAVA_LIBRARY_PATH contain the native directory path having the** files.
  2. LD_LIBRARY_PATH and JAVA_LIBRARY_PATH have been exported in the Spark environment(
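The two checks above can be roughly sketched from a Scala session on the driver. This is only an illustration, assuming Hadoop is on the classpath (as it is in spark-shell). It reports what the driver JVM sees, not what the executors see, and it does not confirm that the Snappy library itself is present in those directories.

```scala
import org.apache.hadoop.util.NativeCodeLoader

// Directories the JVM searches for native libraries.
val javaLibPath = Option(System.getProperty("java.library.path")).getOrElse("<not set>")

// Environment variables mentioned above, as seen by this JVM.
val ldLibPath  = sys.env.getOrElse("LD_LIBRARY_PATH", "<not set>")
val javaLibEnv = sys.env.getOrElse("JAVA_LIBRARY_PATH", "<not set>")

println(s"java.library.path = $javaLibPath")
println(s"LD_LIBRARY_PATH   = $ldLibPath")
println(s"JAVA_LIBRARY_PATH = $javaLibEnv")

// True only if the native Hadoop library was found and loaded by this JVM.
println(s"native hadoop loaded = ${NativeCodeLoader.isNativeCodeLoaded()}")
```

If the native directory is missing from these values, the environment has most likely not been exported for Spark.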

Error while saveAsTextFile with Snappy Codec

It seems you are running on Cloudera QuickStart VM.

Please go to /etc/hadoop/conf/core-site.xml and search for compression codecs to see if it includes Snappy.
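The same check can also be done from Spark instead of opening the file by hand. This is a sketch, assuming a spark-shell style session; `io.compression.codecs` is the Hadoop property that usually holds the codec list, and the variable names are only illustrative.

```scala
import org.apache.hadoop.io.compress.CompressionCodecFactory
import org.apache.spark.{SparkConf, SparkContext}

// Reuses the existing context in spark-shell; falls back to local mode otherwise.
val conf = new SparkConf()
  .setAppName("codec-check")
  .setIfMissing("spark.master", "local[*]")
val sc = SparkContext.getOrCreate(conf)

// Codec list as configured in core-site.xml; None if the property is not set.
val configured = Option(sc.hadoopConfiguration.get("io.compression.codecs"))
println(s"io.compression.codecs = ${configured.getOrElse("<not set>")}")

// Ask Hadoop's codec factory whether it can resolve a codec named "snappy".
val factory = new CompressionCodecFactory(sc.hadoopConfiguration)
val snappy = Option(factory.getCodecByName("snappy"))
println(s"snappy codec resolved = ${snappy.map(_.getClass.getName).getOrElse("<none>")}")
```

A missing property here does not by itself prove that Snappy is unavailable, since the codec class may still be on the classpath.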


It does not include any compression codec. But when I did a Sqoop import using Snappy compression, it worked. Saving with Spark SQL in Snappy also worked. Any reason for that?


Hello Durga Sir

I am also facing the same issue, and I am using PySpark on Cloudera. I checked /etc/hadoop/conf/core-site.xml and did not find any ‘compression’ or ‘compress’ keyword.

Please suggest a solution. If this happens in the exam, what would the workaround be?