Sqoop export fails for by default updateonly mode


#1

I have tried to export the data from HDFS to MYSQL table using below command, it failed.
sqoop export --connect “jdbc:mysql://quickstart.cloudera:3306/retail_db” --username retail_dba --password cloudera --table departments --export-dir /user/cloudera/sqoop_import/departments_export --batch --outdir java_files
was checking by default ‘updateonly’ mode.
pls see the below job logs:
[cloudera@quickstart ~]$ sqoop export --connect “jdbc:mysql://quickstart.cloudera:3306/retail_db” --username retail_dba --password cloudera --table departments --export-dir /user/cloudera/sqoop_import/departments_export --batch --outdir java_files
Warning: /usr/lib/sqoop/…/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
17/03/23 08:46:46 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.8.0
17/03/23 08:46:46 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
17/03/23 08:46:46 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
17/03/23 08:46:46 INFO tool.CodeGenTool: Beginning code generation
17/03/23 08:46:47 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM departments AS t LIMIT 1
17/03/23 08:46:47 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM departments AS t LIMIT 1
17/03/23 08:46:47 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
Note: /tmp/sqoop-cloudera/compile/12d30f6563082380922643875204c2a9/departments.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
17/03/23 08:46:50 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-cloudera/compile/12d30f6563082380922643875204c2a9/departments.jar
17/03/23 08:46:50 INFO mapreduce.ExportJobBase: Beginning export of departments
17/03/23 08:46:50 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
17/03/23 08:46:51 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
17/03/23 08:46:51 INFO Configuration.deprecation: mapred.map.max.attempts is deprecated. Instead, use mapreduce.map.maxattempts
17/03/23 08:46:52 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
17/03/23 08:46:52 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
17/03/23 08:46:52 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
17/03/23 08:46:52 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
17/03/23 08:46:57 INFO input.FileInputFormat: Total input paths to process : 1
17/03/23 08:46:57 INFO input.FileInputFormat: Total input paths to process : 1
17/03/23 08:46:57 INFO mapreduce.JobSubmitter: number of splits:4
17/03/23 08:46:57 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
17/03/23 08:46:57 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1490282974269_0002
17/03/23 08:46:58 INFO impl.YarnClientImpl: Submitted application application_1490282974269_0002
17/03/23 08:46:58 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1490282974269_0002/
17/03/23 08:46:58 INFO mapreduce.Job: Running job: job_1490282974269_0002
17/03/23 08:47:08 INFO mapreduce.Job: Job job_1490282974269_0002 running in uber mode : false
17/03/23 08:47:08 INFO mapreduce.Job: map 0% reduce 0%
17/03/23 08:47:44 INFO mapreduce.Job: map 100% reduce 0%
17/03/23 08:47:45 INFO mapreduce.Job: Job job_1490282974269_0002 failed with state FAILED due to: Task failed task_1490282974269_0002_m_000001
Job failed as tasks failed. failedMaps:1 failedReduces:0

17/03/23 08:47:46 INFO mapreduce.Job: Counters: 9
Job Counters
Failed map tasks=1
Killed map tasks=3
Launched map tasks=4
Data-local map tasks=4
Total time spent by all maps in occupied slots (ms)=128106
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=128106
Total vcore-seconds taken by all map tasks=128106
Total megabyte-seconds taken by all map tasks=131180544
17/03/23 08:47:46 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
17/03/23 08:47:46 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 53.5378 seconds (0 bytes/sec)
17/03/23 08:47:46 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
17/03/23 08:47:46 INFO mapreduce.ExportJobBase: Exported 0 records.
17/03/23 08:47:46 ERROR tool.ExportTool: Error during export: Export job failed!

input file is there with below data
7,Fan Shop
100,testing export

please help me to find what goes wrong.


#2

try --update-key, --input-fields-terminated-by
–input-lines-terminated-by sqoop parameters and check
http://archive-primary.cloudera.com/cdh5/cdh/5/sqoop-1.4.5-cdh5.3.2/SqoopUserGuide.html#_literal_sqoop_export_literal


#3

I am encountering the same issue.

I have executed the following command:
sqoop export with updateonly mode is getting timeout and failing.
sqoop export
–options-file connect.txt
–table orders_csm
–export-dir retail_db/orders
–update-key order_id
–update-mode updateonly

Output
18/10/05 01:49:17 INFO mapreduce.Job: map 0% reduce 0%
18/10/05 01:49:28 INFO mapreduce.Job: map 1% reduce 0%
18/10/05 01:49:40 INFO mapreduce.Job: map 2% reduce 0%
18/10/05 01:50:56 INFO mapreduce.Job: map 3% reduce 0%
18/10/05 01:52:06 INFO mapreduce.Job: map 4% reduce 0%
18/10/05 01:53:22 INFO mapreduce.Job: map 5% reduce 0%
18/10/05 01:54:32 INFO mapreduce.Job: map 6% reduce 0%


#4

More observations.

in the prevous example i have used a table with no primary key and it took forever to export using updateonly mode and finally it countered timeout.
In the following expample I have used a mysql table with primary key. I took long to complete but it was faster and didn’t enocunter failure.
Can anyone help me to know why the updateonly mode is very slow?

18/10/05 02:11:20 INFO mapreduce.Job: map 0% reduce 0%
18/10/05 02:11:31 INFO mapreduce.Job: map 14% reduce 0%
18/10/05 02:11:34 INFO mapreduce.Job: map 22% reduce 0%
18/10/05 02:11:37 INFO mapreduce.Job: map 27% reduce 0%
18/10/05 02:11:40 INFO mapreduce.Job: map 34% reduce 0%
18/10/05 02:11:43 INFO mapreduce.Job: map 42% reduce 0%
18/10/05 02:11:46 INFO mapreduce.Job: map 47% reduce 0%
18/10/05 02:11:47 INFO mapreduce.Job: map 49% reduce 0%
18/10/05 02:11:49 INFO mapreduce.Job: map 54% reduce 0%
18/10/05 02:11:50 INFO mapreduce.Job: map 57% reduce 0%
18/10/05 02:11:52 INFO mapreduce.Job: map 58% reduce 0%
18/10/05 02:11:53 INFO mapreduce.Job: map 61% reduce 0%
18/10/05 02:11:56 INFO mapreduce.Job: map 65% reduce 0%
18/10/05 02:11:59 INFO mapreduce.Job: map 70% reduce 0%
18/10/05 02:12:02 INFO mapreduce.Job: map 74% reduce 0%
18/10/05 02:12:05 INFO mapreduce.Job: map 78% reduce 0%
18/10/05 02:12:08 INFO mapreduce.Job: map 82% reduce 0%
18/10/05 02:12:11 INFO mapreduce.Job: map 86% reduce 0%
18/10/05 02:12:14 INFO mapreduce.Job: map 89% reduce 0%
18/10/05 02:12:17 INFO mapreduce.Job: map 91% reduce 0%
18/10/05 02:12:20 INFO mapreduce.Job: map 92% reduce 0%
18/10/05 02:12:23 INFO mapreduce.Job: map 93% reduce 0%
18/10/05 02:12:27 INFO mapreduce.Job: map 94% reduce 0%
18/10/05 02:12:31 INFO mapreduce.Job: map 95% reduce 0%
18/10/05 02:12:34 INFO mapreduce.Job: map 96% reduce 0%
18/10/05 02:12:37 INFO mapreduce.Job: map 97% reduce 0%
18/10/05 02:12:40 INFO mapreduce.Job: map 98% reduce 0%
18/10/05 02:12:43 INFO mapreduce.Job: map 99% reduce 0%
18/10/05 02:12:45 INFO mapreduce.Job: map 100% reduce 0%
18/10/05 02:12:46 INFO mapreduce.Job: Job job_1537636876515_1992 completed successfully