Sqoop Import-all-tables in Hive

Hello All,

I ran following sqoop command to import all tables to Hive.

sqoop import-all-tables --connect “jdbc:mysql://quickstart.cloudera:3306/retail_db” --username=retail_dba --password=cloudera --hive-import --hive-overwrite --create-hive-table --compress --compression-codec org.apache.hadoop.io.compress.SnappyCodec --outdir java_files

As I did not specify number of mappers, default 4 mappers were used.

Now when i browse the cloudera directory and hive warehouse directory i get different count of files.

hive> dfs -ls /user/cloudera/categories;
Found 5 items
-rw-r–r-- 1 cloudera cloudera 0 2016-12-24 00:53 /user/cloudera/categories/_SUCCESS
-rw-r–r-- 1 cloudera cloudera 273 2016-12-24 00:53 /user/cloudera/categories/part-m-00000.snappy
-rw-r–r-- 1 cloudera cloudera 216 2016-12-24 00:53 /user/cloudera/categories/part-m-00001.snappy
-rw-r–r-- 1 cloudera cloudera 226 2016-12-24 00:53 /user/cloudera/categories/part-m-00002.snappy
-rw-r–r-- 1 cloudera cloudera 228 2016-12-24 00:53 /user/cloudera/categories/part-m-00003.snappy

hive> dfs -ls /user/hive/warehouse/categories;
Found 1 items
-rwxrwxrwx 1 cloudera cloudera 866 2016-12-04 00:18 /user/hive/warehouse/categories/part-m-00000.snappy

Can someone plz explain the logic behind this.

In case of hive-import, the only HDFS location where the tables would be imported is /user/hive/warehouse. Moreover, the files imported cannot be present in 2 different HDFS locations.I guess you are checking the incorrect location - /user/cloudera/categories for comparison.

Thanks Buddy. Is there a way to override the default location?

Also looks like Sqoop import to Hive uses /user/cloudera directory for intermediate storage. Because when i ran same command again it gave me error as directory already exists.

@rahulabvp - You can use --hive-home to override the default location.

if the directly already exist then use hadoop fs -rm -R to delete the directory

You can go-through below post its useful…