Sqoop hive-import


#1

Hi,

I am trying to execute the below sqoop import-all-tables command, but I am getting the error shown below. I am using the itversity big data lab. I understand that sqoop will load into the Hive default path /apps/hive/warehouse, and the categories table is already available in that default location. Why is it throwing an error even though I give --hive-overwrite in the sqoop command?

sqoop import-all-tables \
  --connect "jdbc:mysql://nn01.itversity.com:3306/retail_db" \
  --username retail_dba --password itversity \
  --hive-import --hive-overwrite --create-hive-table \
  --compress --compression-codec org.apache.hadoop.io.compress.SnappyCodec \
  --outdir java-files

           Bytes Written=943

17/03/06 17:32:48 INFO mapreduce.ImportJobBase: Transferred 943 bytes in 20.3954 seconds (46.2358 bytes/sec)
17/03/06 17:32:48 INFO mapreduce.ImportJobBase: Retrieved 58 records.
17/03/06 17:32:48 INFO mapreduce.ImportJobBase: Publishing Hive/Hcat import job data to Listeners
17/03/06 17:32:48 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM categories AS t LIMIT 1
17/03/06 17:32:48 INFO hive.HiveImport: Loading uploaded data into Hive

Logging initialized using configuration in jar:file:/usr/hdp/2.5.0.0-1245/hive/lib/hive-common-1.2.1000.2.5.0.0-1245.jar!/hive-log4j.properties
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. AlreadyExistsException(message:Table categories already exists)

Thanks,
Vimala


#2

I added the following option, which created the tables in my own database in Hive:
--hive-database rkarra07 \
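
For reference, a sketch of the full command with that option added (same connection details as in #1; rkarra07 is my personal Hive database). Note that --create-hive-table succeeds here because the tables do not yet exist in that database:

sqoop import-all-tables \
  --connect "jdbc:mysql://nn01.itversity.com:3306/retail_db" \
  --username retail_dba --password itversity \
  --hive-import --hive-overwrite --create-hive-table \
  --hive-database rkarra07 \
  --compress --compression-codec org.apache.hadoop.io.compress.SnappyCodec \
  --outdir java-files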


#3

Thanks for your suggestion. Yes, I too am able to import the data into my own database by adding the line above. But I am trying to understand the purpose of --hive-overwrite. Could anyone please help me understand how it works?


#4

--hive-overwrite simply overwrites the existing data in the table you are importing into. It does not drop the table if it exists and re-create it, as you are expecting.
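
To make the distinction concrete, here is roughly what Sqoop asks Hive to run in each case (illustrative statements, not the exact generated DDL; the column list is elided and the HDFS path is hypothetical):

-- with --create-hive-table, Sqoop emits a plain CREATE TABLE,
-- which fails with AlreadyExistsException if the table is there
CREATE TABLE categories ( ... );

-- without that flag, Sqoop uses the non-failing form instead
CREATE TABLE IF NOT EXISTS categories ( ... );

-- --hive-overwrite only affects the load step: OVERWRITE replaces
-- the existing rows rather than appending new files
LOAD DATA INPATH '/user/<you>/categories' OVERWRITE INTO TABLE categories;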


#5

@shaan0056 thanks for the reply. But my expectation is when the table is already exists, and we when we give --hive-overwrite, only the data must be copied without throwing DDL already exists exception. Please give me any scenario that best suits for -hive-overwrite.


#6

Try removing --create-hive-table and run the sqoop command again.
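
That is, something along these lines (the command from #1 with only --create-hive-table removed):

sqoop import-all-tables \
  --connect "jdbc:mysql://nn01.itversity.com:3306/retail_db" \
  --username retail_dba --password itversity \
  --hive-import --hive-overwrite \
  --compress --compression-codec org.apache.hadoop.io.compress.SnappyCodec \
  --outdir java-files

Without --create-hive-table, Sqoop issues CREATE TABLE IF NOT EXISTS, so an existing table is reused and --hive-overwrite can replace its data.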


#7

Step 1:
sqoop import-all-tables \
  --connect "jdbc:mysql://nn01.itversity.com:3306/retail_db" \
  --username retail_dba --password itversity \
  --hive-import --hive-home "dir-name" \
  --compress --compression-codec org.apache.hadoop.io.compress.SnappyCodec \
  --outdir java-files

Step 2:
Validate that the data was imported into the mentioned directory.

Step 3:
Insert a few records into the departments table in MySQL.

Step 4:
Execute the following command:
sqoop import-all-tables \
  --connect "jdbc:mysql://nn01.itversity.com:3306/retail_db" \
  --username retail_dba --password itversity \
  --hive-import --hive-home "dir-name" --hive-overwrite \
  --compress --compression-codec org.apache.hadoop.io.compress.SnappyCodec \
  --outdir java-files

You can try this with a single table using sqoop import, rather than sqoop import-all-tables, for quicker results.
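
For example, a single-table version of step 4 might look like this (a sketch assuming the departments table; everything else is unchanged from the commands above):

sqoop import \
  --connect "jdbc:mysql://nn01.itversity.com:3306/retail_db" \
  --username retail_dba --password itversity \
  --table departments \
  --hive-import --hive-overwrite \
  --compress --compression-codec org.apache.hadoop.io.compress.SnappyCodec \
  --outdir java-files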


#8

@shaan0056 Yes, I removed --create-hive-table and it worked for me. I now understand that having both --hive-overwrite and --create-hive-table together is meaningless. Thank you.

@Padma Thank you so much for the suggestion. I will surely try them.


#9

Hi All,

Trying to import one table to Hive and getting the below error:

sqoop import
—connect jdbc:mysql://ms.itversity.com:3306/retail_db \
–username retail_user
–password itversity
-–split-by department_id
–table departments
-–hive-database bhuvasr1
-–hive-import
-–create-hive-table
-–hive-table bhuvasr1.departmentshive -m 1

19/03/10 13:33:01 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.6.5.0-292
19/03/10 13:33:01 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
19/03/10 13:33:01 ERROR tool.BaseSqoopTool: Error parsing arguments for import:
19/03/10 13:33:01 ERROR tool.BaseSqoopTool: Unrecognized argument: -–split-by
19/03/10 13:33:01 ERROR tool.BaseSqoopTool: Unrecognized argument: department_id
19/03/10 13:33:01 ERROR tool.BaseSqoopTool: Unrecognized argument: -–table
19/03/10 13:33:01 ERROR tool.BaseSqoopTool: Unrecognized argument: departments
19/03/10 13:33:01 ERROR tool.BaseSqoopTool: Unrecognized argument: -–hive-database
19/03/10 13:33:01 ERROR tool.BaseSqoopTool: Unrecognized argument: bhuvasr1
19/03/10 13:33:01 ERROR tool.BaseSqoopTool: Unrecognized argument: -–hive-import
19/03/10 13:33:01 ERROR tool.BaseSqoopTool: Unrecognized argument: -–create-hive-table
19/03/10 13:33:01 ERROR tool.BaseSqoopTool: Unrecognized argument: -–hive-table
19/03/10 13:33:01 ERROR tool.BaseSqoopTool: Unrecognized argument: bhuvasr1.departmentshive
19/03/10 13:33:01 ERROR tool.BaseSqoopTool: Unrecognized argument: -m
19/03/10 13:33:01 ERROR tool.BaseSqoopTool: Unrecognized argument: 1


#10

@Srikanth You have given incorrect syntax.

You have to use a double hyphen (--) before each option; your command was pasted with en/em dashes (– and —), which Sqoop does not recognize:

sqoop import \
--connect jdbc:mysql://ms.itversity.com:3306/retail_db \
--username retail_user \
--password itversity \
--split-by department_id \
--table departments \
--hive-database annu \
--hive-import \
--create-hive-table \
--hive-table departmentshive -m 1
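
Once the import finishes, you can sanity-check the result from the shell (assuming the annu database from the command above):

hive -e "USE annu; SHOW TABLES; SELECT COUNT(*) FROM departmentshive;"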


#11

Thank you, it worked.

Could you please share real-time syntax examples for Sqoop?


#12

@Srikanth Please follow the URL below for Sqoop examples.


#13

Thank you so much.

Your list of commands is missing the sqoop job and sqoop incremental import commands :slight_smile:
and also the sqoop export commands.


#14

@Srikanth Open the link below; it has the sqoop job and incremental import commands: https://github.com/dgadiraju/code/blob/master/hadoop/edw/cloudera/sqoop/sqoop_demo.txt
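
For quick reference, hedged sketches of all three (the job name, target directories, and the export table here are hypothetical; adapt them to your own lab database and paths):

# incremental import saved as a reusable sqoop job
# (sqoop may re-prompt for the password at exec time unless the
# metastore is configured to store it)
sqoop job --create departments_incr \
  -- import \
  --connect jdbc:mysql://ms.itversity.com:3306/retail_db \
  --username retail_user --password itversity \
  --table departments \
  --target-dir /user/<your-user>/departments \
  --incremental append --check-column department_id --last-value 0

# run the saved job; sqoop remembers the new --last-value after each run
sqoop job --exec departments_incr

# export the imported files back out to a MySQL table
sqoop export \
  --connect jdbc:mysql://ms.itversity.com:3306/retail_db \
  --username retail_user --password itversity \
  --table departments_export \
  --export-dir /user/<your-user>/departments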