Sqoop import snappy compression with parquet file format not working


#1

Hi,

I am trying the command below:

sqoop import \
-Dorg.apache.sqoop.splitter.allow_text_splitter=true \
--connect jdbc:mysql://ms.itversity.com:3306/retail_db \
--username retail_user --password itversity \
--table orders \
--target-dir /user/sameerrao20118/sqoop_import/retail_db/orders \
--as-avrodatafile \
--compress \
--compression-codec org.apache.hadoop.io.compress.SnappyCodec 

However, I don't see any files with a .snappy extension. Snappy compression only seems to work when I use the default file format or the text file format. Any idea why this is happening, or am I doing something wrong?

Regards,
Sameer


#2

Hi @Sameer_Rao,

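Avro files compressed with Snappy do not get a .snappy extension. Avro applies the codec to the data blocks inside the container file and records the codec name in the file header (the avro.codec metadata entry), so the part files keep the .avro extension whether or not they are compressed. Your command may therefore already be producing Snappy-compressed Avro files. If it is not, one workaround is to import the data using Sqoop with Snappy compression in text format first, then apply your transformations using Spark and save the result as an Avro file with Snappy compression.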
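
One way to check whether an imported Avro part file is actually compressed, regardless of its extension, is to read the avro.codec entry from the file header. The following is a minimal sketch in plain Python, assuming the part file has first been copied out of HDFS to the local disk (for example with hdfs dfs -get). The function names are my own and the file name passed on the command line is whatever your part file is called. It only parses the header as laid out in the Avro object container format and does not read any records.

```python
import sys

AVRO_MAGIC = b"Obj\x01"  # first four bytes of every Avro object container file


def _read_long(stream):
    """Decode one Avro long (zigzag-encoded, variable-length integer)."""
    shift = 0
    raw = 0
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("unexpected end of Avro header")
        raw |= (byte[0] & 0x7F) << shift
        if not byte[0] & 0x80:  # high bit clear means this is the last byte
            break
        shift += 7
    return (raw >> 1) ^ -(raw & 1)  # undo zigzag encoding


def _read_bytes(stream):
    """Read a length-prefixed byte string."""
    length = _read_long(stream)
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("unexpected end of Avro header")
    return data


def read_avro_metadata(stream):
    """Return the header metadata map of an Avro container file as a dict."""
    if stream.read(4) != AVRO_MAGIC:
        raise ValueError("not an Avro object container file")
    meta = {}
    while True:
        count = _read_long(stream)
        if count == 0:  # a zero count terminates the map
            break
        if count < 0:  # negative count: a byte-size long follows, skip it
            count = -count
            _read_long(stream)
        for _ in range(count):
            key = _read_bytes(stream).decode("utf-8")
            meta[key] = _read_bytes(stream)
    return meta


def avro_codec(path):
    """Return the codec name stored in the header, or "null" if none is set."""
    with open(path, "rb") as f:
        meta = read_avro_metadata(f)
    return meta.get("avro.codec", b"null").decode("utf-8")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_codec.py <local copy of an Avro part file>
    print(avro_codec(sys.argv[1]))
```

If this prints snappy, the import was compressed even though no .snappy file appears in the target directory. A result of null means the blocks are stored uncompressed.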

Follow this topic for more information.