Sqoop import error message giving incorrect path

The output directory name I provide on the sqoop import command is different from the one shown in the error message:

Input: --outdir hdfs://nn01.itversity.com:8020/user/acrajesh/test

Error: 17/03/26 21:26:28 ERROR tool.ImportTool: Encountered IOException running import job: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://nn01.itversity.com:8020/user/acrajesh/categories already exists

[acrajesh@gw01 ~]$ sqoop import --connect "jdbc:mysql://nn01.itversity.com:3306/retail_db" --table=categories --username=retail_dba --password=itversity --outdir hdfs://nn01.itversity.com:8020/user/acrajesh/test
Warning: /usr/hdp/2.5.0.0-1245/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
17/03/26 21:26:25 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.5.0.0-1245
17/03/26 21:26:25 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
17/03/26 21:26:25 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
17/03/26 21:26:25 INFO tool.CodeGenTool: Beginning code generation
17/03/26 21:26:25 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM categories AS t LIMIT 1
17/03/26 21:26:25 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM categories AS t LIMIT 1
17/03/26 21:26:25 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/2.5.0.0-1245/hadoop-mapreduce
Note: /tmp/sqoop-acrajesh/compile/dbc445c038084e40909d17a9b95d611e/categories.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
17/03/26 21:26:27 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-acrajesh/compile/dbc445c038084e40909d17a9b95d611e/categories.jar
17/03/26 21:26:27 WARN manager.MySQLManager: It looks like you are importing from mysql.
17/03/26 21:26:27 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
17/03/26 21:26:27 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
17/03/26 21:26:27 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
17/03/26 21:26:27 INFO mapreduce.ImportJobBase: Beginning import of categories
17/03/26 21:26:28 INFO impl.TimelineClientImpl: Timeline service address: http://rm01.itversity.com:8188/ws/v1/timeline/
17/03/26 21:26:28 INFO client.RMProxy: Connecting to ResourceManager at rm01.itversity.com/172.16.1.106:8050
17/03/26 21:26:28 INFO client.AHSProxy: Connecting to Application History server at rm01.itversity.com/172.16.1.106:10200
17/03/26 21:26:28 ERROR tool.ImportTool: Encountered IOException running import job: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://nn01.itversity.com:8020/user/acrajesh/categories already exists
at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:146)
at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:266)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:139)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
at org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:200)
at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:173)
at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:270)
at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:692)
at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:127)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:507)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:615)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)

@acrajesh

The --outdir option controls where Sqoop writes its generated Java code (the categories.java/categories.jar you see in the log), not the HDFS output location; the HDFS output directory is set with --target-dir. Since you didn't specify --target-dir, the import defaults to a directory named after the table under your HDFS home, i.e. hdfs://nn01.itversity.com:8020/user/acrajesh/categories, which already exists from an earlier run.

Remove the categories folder as shown below and re-run your sqoop import:
hadoop fs -rm -R hdfs://nn01.itversity.com:8020/user/acrajesh/categories
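For completeness, here is a sketch of the import with an explicit --target-dir, reusing the connection details from your command above. The /user/acrajesh/test HDFS path and the local /tmp/sqoop-codegen directory are only illustrative; substitute your own.

```shell
# Import the categories table into an explicit HDFS directory.
# --target-dir sets the HDFS output location (it must not already exist);
# --outdir only controls where the generated Java code lands on the local disk.
sqoop import \
  --connect "jdbc:mysql://nn01.itversity.com:3306/retail_db" \
  --username retail_dba \
  --password itversity \
  --table categories \
  --target-dir hdfs://nn01.itversity.com:8020/user/acrajesh/test \
  --outdir /tmp/sqoop-codegen
```

As the log also warns, passing --password on the command line is insecure; you can use -P to be prompted for it instead.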
