Sqoop import: compression not working with parquet/avro


#1

sqoop import \
--connect jdbc:mysql://ms.itversity.com:3306/retail_db \
--username retail_user \
--password itversity \
--table orders \
--target-dir /user/kirantadisetti/sqoop_import \
--num-mappers 5 \
--as-parquetfile \
--compress \
--compression-codec org.apache.hadoop.io.compress.SnappyCodec

Compression is not working: the files are created without compression when I import as Avro/Parquet.

I tested the same options with text files, and compression works there.

[kirantadisetti@gw02 conf]$ hadoop fs -ls /user/kirantadisetti/sqoop_import
Found 7 items
drwxr-xr-x - kirantadisetti hdfs 0 2018-10-09 22:49 /user/kirantadisetti/sqoop_import/.metadata
drwxr-xr-x - kirantadisetti hdfs 0 2018-10-09 22:50 /user/kirantadisetti/sqoop_import/.signals
-rw-r--r-- 2 kirantadisetti hdfs 114337 2018-10-09 22:50 /user/kirantadisetti/sqoop_import/1f99d2f6-56ee-4e81-8c1b-1edb01db2ae5.parquet
-rw-r--r-- 2 kirantadisetti hdfs 114446 2018-10-09 22:50 /user/kirantadisetti/sqoop_import/2c8dd335-7546-4a3c-81ae-b4dbc0d9d2b5.parquet
-rw-r--r-- 2 kirantadisetti hdfs 114206 2018-10-09 22:50 /user/kirantadisetti/sqoop_import/407a989b-d775-401f-8568-0afccaad09f9.parquet
-rw-r--r-- 2 kirantadisetti hdfs 114286 2018-10-09 22:50 /user/kirantadisetti/sqoop_import/47b2e660-d71e-4df4-833e-90248c128809.parquet
-rw-r--r-- 2 kirantadisetti hdfs 118492 2018-10-09 22:50 /user/kirantadisetti/sqoop_import/996a8b4c-31b4-430b-9be9-bf857d64f428.parquet
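
Note that the file names and sizes alone don't show whether the codec was applied. One way to check is to read the metadata recorded inside each Parquet file with parquet-tools, if it is available on the gateway (the jar path below is an assumption; it varies by distribution):

hadoop jar /path/to/parquet-tools.jar meta /user/kirantadisetti/sqoop_import/1f99d2f6-56ee-4e81-8c1b-1edb01db2ae5.parquet

The per-column-chunk lines in the output report the codec in use (e.g. SNAPPY vs. UNCOMPRESSED).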


#2

@Kiran_Tadisetti You can find the compression information in the dataset metadata that Sqoop writes alongside the Parquet files (.metadata/descriptor.properties):

hadoop fs -cat /user/sseashu1/sqoop_import_c/.metadata/descriptor.properties
#Dataset descriptor for sqoop_import_c
#Wed Oct 10 01:56:55 EDT 2018
location=hdfs\://nn01.itversity.com\:8020/user/sseashu1/sqoop_import_c
version=1
compressionType=snappy
format=parquet
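
The compressionType=snappy entry shows the import was compressed even though the .parquet file names don't indicate it. For an Avro import (--as-avrodatafile) there is no dataset descriptor; the codec is recorded in each file's header instead. One way to check it with avro-tools (the jar path and part-file name below are assumptions):

hadoop fs -get <avro-target-dir>/part-m-00000.avro .
java -jar /path/to/avro-tools.jar getmeta part-m-00000.avro

getmeta prints the header key/value pairs, including avro.codec (e.g. snappy).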