.gz.parquet extension appears when using Snappy compression

Hello all,

I called sqlContext.setConf("spark.sql.parquet.compress.codec", "snappy") and then performed the write.parquet() action, but the output file has the extension ".gz.parquet".

Can anyone explain why this happened?

Regards,
Amit

Hello Amit,

The property is spark.sql.parquet.compression.codec, not spark.sql.parquet.compress.codec.

Check the name of the property you are setting. By default, Parquet compression is gzip.
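A quick way to confirm which codec was actually applied is to look at the tag that sits just before ".parquet" in the part-file names. Below is a minimal sketch in plain Python (no Spark needed), assuming the usual naming where "gz" means gzip and "snappy" means snappy. The helper name codec_from_filename, the tag table, and the sample file names are hypothetical and only for illustration.

```python
import re

# Tags that may appear before ".parquet" in a part-file name.
# Assumption: this table lists only a few common tags and is not exhaustive.
CODEC_TAGS = {"gz": "gzip", "snappy": "snappy", "lzo": "lzo"}


def codec_from_filename(name):
    """Return the codec implied by a Parquet part-file name.

    Returns None when the name has no tag before ".parquet" or the tag
    is not in CODEC_TAGS.
    """
    match = re.search(r"\.([A-Za-z0-9]+)\.parquet$", name)
    if match is None:
        return None  # e.g. "part-r-00000.parquet" carries no codec tag
    return CODEC_TAGS.get(match.group(1).lower())


# Hypothetical file names, shaped like the one described in the question.
print(codec_from_filename("part-r-00000.gz.parquet"))      # gzip
print(codec_from_filename("part-r-00000.snappy.parquet"))  # snappy
```

If this still reports gzip after setting the codec to snappy, the property name is the first thing to re-check, since the ".gz.parquet" output in the question suggests the misspelled key was not applied.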


Yes, Arun. With compression.codec it worked fine. Is gzip the default compression? If not, why was the file stored with the gz extension when I didn't specify it?

Yes, for Parquet files it is the default.

OK, thanks a lot. Got the solution.

Hi, in recent Spark versions (2.0 and later) the default Parquet compression codec is snappy; gzip was the default in the 1.x releases.
https://spark.apache.org/docs/latest/sql-programming-guide.html#parquet-files
