How to read a snappy-compressed parquet file?



Hi All,

How to read a snappy-compressed parquet file?

Thanks in advance


This demo is done on our state-of-the-art Big Data cluster with Hadoop, Spark, etc.

Here is sample code to generate data in parquet format with the snappy compression codec:

val orders = sqlContext.read.json("/public/retail_db_json/orders")
sqlContext.setConf("spark.sql.parquet.compression.codec", "snappy")
orders.write.parquet("orders_snappy")

Valid options for spark.sql.parquet.compression.codec are uncompressed, gzip, snappy, etc.; gzip is the default.
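In newer Spark versions (2.x+), the codec can also be set per write through the DataFrameWriter `compression` option instead of the global conf. A minimal sketch, assuming a spark-shell session where `spark` is the SparkSession (the output path is illustrative):

```scala
// Read the source JSON data
val orders = spark.read.json("/public/retail_db_json/orders")

// "compression" on the writer overrides spark.sql.parquet.compression.codec
// for this write only
orders.write
  .option("compression", "snappy")
  .parquet("/user/itversity/orders_snappy")
```

This is handy when different datasets in the same job need different codecs.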


[itversity@gw02 ~]$ hadoop fs -ls orders_snappy
Found 5 items
-rw-r--r-- 3 itversity hdfs 0 2018-04-10 11:33 orders_snappy/_SUCCESS
-rw-r--r-- 3 itversity hdfs 495 2018-04-10 11:33 orders_snappy/_common_metadata
-rw-r--r-- 3 itversity hdfs 1668 2018-04-10 11:33 orders_snappy/_metadata
-rw-r--r-- 3 itversity hdfs 266423 2018-04-10 11:33 orders_snappy/part-r-00000-3dc4646d-67ec-4d3d-8369-2551b6199b39.snappy.parquet
-rw-r--r-- 3 itversity hdfs 268441 2018-04-10 11:33 orders_snappy/part-r-00001-3dc4646d-67ec-4d3d-8369-2551b6199b39.snappy.parquet

How to read data from a snappy-compressed parquet file? Parquet files record the compression codec in their own metadata, so no special handling is needed; just read them as parquet:

sqlContext.read.parquet("/user/itversity/orders_snappy").show
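For reference, a sketch of the equivalent read in Spark 2.x+, assuming `spark` is the spark-shell SparkSession and the same HDFS path as above:

```scala
// The snappy codec is auto-detected from the Parquet file footers,
// so the read looks identical to reading uncompressed parquet
val orders = spark.read.parquet("/user/itversity/orders_snappy")
orders.show(5)
```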


Thank you for your reply, @dgadiraju sir!
My bad, I was not using the correct path in my case 🙂


You should be more elaborate when raising issues 🙂