Sqoop import running error for split-by


#1

Kindly help. I am not getting the same output as the instructor.

[olanrewajuremi2000@gw03 ~]$ sqoop import \
  --connect jdbc:mysql://ms.itversity.com:3306/retail_db \
  --username retail_user \
  --password itversity \
  --table order_items_nopk \
  --warehouse-dir /user/olanrewajuremi2000/sqoop_import/retail_db \
  --split-by order_item_order_id
Warning: /usr/hdp/2.5.0.0-1245/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
18/07/24 19:11:17 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.5.0.0-1245
18/07/24 19:11:17 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
18/07/24 19:11:17 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
18/07/24 19:11:17 INFO tool.CodeGenTool: Beginning code generation
18/07/24 19:11:18 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM order_items_nopk AS t LIMIT 1
18/07/24 19:11:18 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM order_items_nopk AS t LIMIT 1
18/07/24 19:11:18 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/2.5.0.0-1245/hadoop-mapreduce
Note: /tmp/sqoop-olanrewajuremi2000/compile/f9420cef526531b9cce73b6ec8072996/order_items_nopk.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
18/07/24 19:11:19 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-olanrewajuremi2000/compile/f9420cef526531b9cce73b6ec8072996/order_items_nopk.jar
18/07/24 19:11:19 WARN manager.MySQLManager: It looks like you are importing from mysql.
18/07/24 19:11:19 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
18/07/24 19:11:19 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
18/07/24 19:11:19 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
18/07/24 19:11:19 INFO mapreduce.ImportJobBase: Beginning import of order_items_nopk
18/07/24 19:11:21 INFO impl.TimelineClientImpl: Timeline service address: http://rm01.itversity.com:8188/ws/v1/timeline/
18/07/24 19:11:21 INFO client.RMProxy: Connecting to ResourceManager at rm01.itversity.com/172.16.1.106:8050
18/07/24 19:11:21 INFO client.AHSProxy: Connecting to Application History server at rm01.itversity.com/172.16.1.106:10200
18/07/24 19:11:21 ERROR tool.ImportTool: Encountered IOException running import job: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://nn01.itversity.com:8020/user/olanrewajuremi2000/sqoop_import/retail_db/order_items_nopk already exists
at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:146)
at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:266)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:139)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
at org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:200)
at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:173)
at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:270)
at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:692)
at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:127)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:507)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:615)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:225)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
at org.apache.sqoop.Sqoop.main(Sqoop.java:243)


#2

Run the following on the command line to remove the existing output directory:

hdfs dfs -rm -R /user/olanrewajuremi2000/sqoop_import/retail_db/order_items_nopk

Then run the Sqoop import again.
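
If it helps, here is a slightly more careful version of the same cleanup, as a sketch only. It builds the target path from the --warehouse-dir value and the table name (the two parts visible in the error message), lists what is there, and then removes it. The RUN variable is just a dry-run convention I am assuming for illustration, not anything from Hadoop or Sqoop: it defaults to echo, so the commands are printed instead of executed until you set RUN="".

```shell
# Dry run by default: commands are printed, not executed.
# Set RUN="" in the environment to actually run them.
RUN=${RUN-echo}

# Sqoop writes each table to <warehouse-dir>/<table>.
WAREHOUSE_DIR=/user/olanrewajuremi2000/sqoop_import/retail_db
TABLE=order_items_nopk
TARGET_DIR="$WAREHOUSE_DIR/$TABLE"

# Look at what is already in the target directory.
$RUN hdfs dfs -ls "$TARGET_DIR"

# Remove it recursively; -f keeps rm quiet if the path is already gone.
$RUN hdfs dfs -rm -R -f "$TARGET_DIR"
```

Only remove the directory if you do not need the files in it.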


#3

It runs now.

Kindly explain what happened.
Could it be the cleanup you did, or …


#4

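@Chris The error message says it clearly: the output path already exists on HDFS. Sqoop writes each table to <warehouse-dir>/<table name>, and the MapReduce job behind the import refuses to start if that directory is already there, most likely because it was left behind by an earlier run of the same import.

Solution

  • You can delete the path if the files in it are not required.
  • You can point the import at a new path.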
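
Both options can also be handled from the Sqoop command itself. The sketch below wraps the import from post #1 in a small helper function, import_nopk, which is a name made up for this example. Option 1 adds --delete-target-dir, which as far as I know is available in Sqoop 1.4.x and asks Sqoop to remove the existing target directory before importing. Option 2 uses a different warehouse directory; retail_db_v2 is a hypothetical path. I also swapped --password for -P, as the warning in the log suggests, so Sqoop prompts for the password. RUN is an assumed dry-run switch that defaults to echo, so nothing is executed until you set RUN="".

```shell
# Dry run by default: commands are printed, not executed.
# Set RUN="" in the environment to actually run them.
RUN=${RUN-echo}

# Hypothetical helper: $1 is the warehouse directory,
# any remaining arguments are passed through to sqoop import.
import_nopk() {
  warehouse_dir=$1
  shift
  $RUN sqoop import \
    --connect jdbc:mysql://ms.itversity.com:3306/retail_db \
    --username retail_user \
    -P \
    --table order_items_nopk \
    --warehouse-dir "$warehouse_dir" \
    --split-by order_item_order_id \
    "$@"
}

# Option 1: same path, let Sqoop delete the old output first.
import_nopk /user/olanrewajuremi2000/sqoop_import/retail_db --delete-target-dir

# Option 2: a new path that does not exist yet (hypothetical name).
# import_nopk /user/olanrewajuremi2000/sqoop_import/retail_db_v2
```

Use only one of the two options. With Option 1 the old files are gone after the import starts, so choose Option 2 if you still need them.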