Where is the YARN log and how do I access it? Thanks


#1

Hello,

I am learning sqoop export and here is my command:

sqoop export -m 1 \
--connect jdbc:mysql://ms.itversity.com:3306/retail_export \
--username= \
--password= \
--table dep_export \
--export-dir departments_new \
--input-lines-terminated-by '\n' \
--input-fields-terminated-by '|'

This command is meant to export the data on HDFS to MySQL, but it failed with the following details:

Error: java.io.IOException: Can't export data, please check failed map task logs
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:122)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
    at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.RuntimeException: Can't parse input data: '10,physicss,2018-01-31 22:24:18.0'
    at dep_export.__loadFromFields(dep_export.java:316)
    at dep_export.parse(dep_export.java:254)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:89)
    ... 10 more
Caused by: java.lang.NumberFormatException: For input string: "10,physicss,2018-01-31 22:24:18.0"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:580)
    at java.lang.Integer.valueOf(Integer.java:766)
    at dep_export.__loadFromFields(dep_export.java:303)
    ... 12 more

I want to check the YARN log and see what the real error is. I am on gw03.itversity.com, which is also a YARN client.

From the config it seems the logs are located at:


I don't see those directories on the host.
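
Note: container logs live on the NodeManager hosts, not on a gateway, so those directories will not exist locally. Assuming log aggregation is enabled on the cluster, a finished job's container logs can usually be read from the gateway with the yarn CLI; the application id below is a placeholder (it is the job id with the job_ prefix replaced by application_):

# fetch all aggregated container logs for one application (placeholder id)
yarn logs -applicationId application_<application_id> > app.log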

I also tried to look at the job history, and I was not able to reach the site.

Can anyone help sort it out?

Thank you very much.


#2

@paslechoix This is the job history server UI link: http://rm01.itversity.com:19888/jobhistory


#3

Thank you. There isn't much on the Job History server, and it failed again just now. Can anyone help?


http://rm01.itversity.com:19888/jobhistory/job/job_1517228278761_16138


#4

While trying to retrieve and examine the real log file through FileZilla, I was not able to connect to the job history server with my lab credentials. Can you tell me which credentials I need to use and what parameters to set in FileZilla?

The reason I want to access the log through FileZilla is that the log is too big to display in a browser:

mapred-mapred-historyserver-rm01.itversity.com.log 242651315 bytes Feb 26, 2018 7:14:33 AM
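
If the goal is just one job's diagnostics rather than the whole daemon log, the MapReduce history server also exposes a REST API that returns small JSON payloads; a sketch, using the job id from the URL in post #3:

# list the job's tasks and their states as JSON
curl "http://rm01.itversity.com:19888/ws/v1/history/mapreduce/jobs/job_1517228278761_16138/tasks"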

Thank you very much.


#5

@paslechoix You can connect only to your assigned gateway. If you want the log file, you can go to the job history server, or you can use the URL that was printed when the job ran. Simply copy and paste that URL to check your log.
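
For reference, that URL is printed in the console output when the job is submitted; it looks roughly like this (the application id here is a placeholder):

INFO mapreduce.Job: The url to track the job: http://rm01.itversity.com:8088/proxy/application_<application_id>/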


#6

Thank you. As I indicated in the previous post, the log is huge (242651315 bytes), which is why I would prefer to download it first.


#7

From the error log it is clear that you are using an incorrect field terminator. Use
--input-fields-terminated-by ','
instead of
--input-fields-terminated-by '|'
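
A quick way to see why: with '|' as the terminator, the whole comma-separated line becomes a single field, which Sqoop then tries to parse as the INT column. A small illustration of the field splitting with awk (not Sqoop itself):

# wrong terminator: the entire line is one field
echo '10,physicss,2018-01-31 22:24:18.0' | awk -F'|' '{print NF}'   # prints 1
# matching terminator: three fields, the first of which parses as an integer
echo '10,physicss,2018-01-31 22:24:18.0' | awk -F',' '{print NF}'   # prints 3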


#8

Thank you for your reply.
The reason I used the option
--input-fields-terminated-by '|'
is that I wanted pipe-separated (PSV) output instead of CSV. I thought it really should not matter, but I used your comma for the following test anyway, and it still failed.

Let me go through this again:

I have a folder in HDFS that contains data, and I want to export it to a MySQL table.
HDFS contents:
[paslechoix@gw03 ~]$ hdfs dfs -cat departments_new1/*
2,Fitness,2018-02-25 18:28:34.0
3,Footwear,2018-02-25 18:28:34.0
4,Apparel,2018-02-25 18:28:34.0
5,Golf,2018-02-25 18:28:34.0
6,Outdoors,2018-02-25 18:28:34.0
7,Fan Shop,2018-02-25 18:28:34.0
2,Fitness,2018-02-25 18:28:34.0
3,Footwear,2018-02-25 18:28:34.0
4,Apparel,2018-02-25 18:28:34.0
5,Golf,2018-02-25 18:28:34.0
6,Outdoors,2018-02-25 18:28:34.0
7,Fan Shop,2018-02-25 18:28:34.0
10,gyshics,2018-02-25 18:28:34.0
11,chemistry,2018-02-25 18:28:34.0
12,math,2018-02-25 18:28:34.0
13,science,2018-02-25 18:28:34.0
14,engineering,2018-02-25 18:28:34.0
111,TBD,2018-02-25 18:37:14.0
113,Pharma,2018-02-25 18:37:14.0
114,TBD,2018-02-25 18:37:16.0
2,Fitness,2018-02-25 18:28:34.0,999
3,Footwear,2018-02-25 18:28:34.0,999
4,Apparel,2018-02-25 18:28:34.0,999
5,Golf,2018-02-25 18:28:34.0,999
6,Outdoors,2018-02-25 18:28:34.0,999
7,Fan Shop,2018-02-25 18:28:34.0,999
10,gyshics,2018-02-25 18:28:34.0,999
11,chemistry,2018-02-25 18:28:34.0,999
12,math,2018-02-25 18:28:34.0,999
13,science,2018-02-25 18:28:34.0,999
14,engineering,2018-02-25 18:28:34.0,999
111,TBD,2018-02-25 18:37:14.0,999
113,Pharma,2018-02-25 18:37:14.0,999
114,TBD,2018-02-25 18:37:16.0,999
2,Fitness,2018-02-25 18:28:34.0,999
3,Footwear,2018-02-25 18:28:34.0,999
4,Apparel,2018-02-25 18:28:34.0,999
5,Golf,2018-02-25 18:28:34.0,999
6,Outdoors,2018-02-25 18:28:34.0,999
7,Fan Shop,2018-02-25 18:28:34.0,999
10,gyshics,2018-02-25 18:28:34.0,999
11,chemistry,2018-02-25 18:28:34.0,999
12,math,2018-02-25 18:28:34.0,999
13,science,2018-02-25 18:28:34.0,999
14,engineering,2018-02-25 18:28:34.0,999
111,TBD,2018-02-25 18:37:14.0,999
113,Pharma,2018-02-25 18:37:14.0,999
114,TBD,2018-02-25 18:37:16.0,999
113,Pharma,2018-02-25 19:02:48.0,999
114,TBD,2018-02-25 19:02:54.0,999
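
Note that this listing mixes rows with three comma-separated fields and rows with four (those ending in ,999). A quick sanity check before exporting is to confirm every line has the field count the target table expects, for example:

# tally how many lines have each number of comma-separated fields
hdfs dfs -cat departments_new1/* | awk -F',' '{print NF}' | sort | uniq -c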

Export command:

sqoop export -m 1 \
--connect jdbc:mysql://ms.itversity.com:3306/retail_export \
--username=retail_user \
--password=itversity \
--table dep_export \
--export-dir departments_new1 \
--input-lines-terminated-by '\n' \
--input-fields-terminated-by ','

Result:
18/02/28 07:14:14 INFO mapreduce.Job: Running job: job_1517228278761_22227
18/02/28 07:14:23 INFO mapreduce.Job: Job job_1517228278761_22227 running in uber mode : false
18/02/28 07:14:23 INFO mapreduce.Job: map 0% reduce 0%
18/02/28 07:14:30 INFO mapreduce.Job: map 100% reduce 0%
18/02/28 07:14:31 INFO mapreduce.Job: Job job_1517228278761_22227 failed with state FAILED due to: Task failed task_1517228278761_22227_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0

18/02/28 07:14:31 INFO mapreduce.Job: Counters: 8
Job Counters
Failed map tasks=1
Launched map tasks=1
Other local map tasks=1
Total time spent by all maps in occupied slots (ms)=9218
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=4609
Total vcore-milliseconds taken by all map tasks=4609
Total megabyte-milliseconds taken by all map tasks=9439232
18/02/28 07:14:31 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
18/02/28 07:14:31 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 43.8293 seconds (0 bytes/sec)
18/02/28 07:14:31 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
18/02/28 07:14:31 INFO mapreduce.ExportJobBase: Exported 0 records.
18/02/28 07:14:31 ERROR mapreduce.ExportJobBase: Export job failed!
18/02/28 07:14:31 ERROR tool.ExportTool: Error during export: Export job failed!

Any more ideas?

Thank you very much for your help.
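
For what it's worth, once this run has finished, the failing map task's actual exception can usually be dumped from the gateway by job id with the mapred CLI (assuming log aggregation is enabled):

# dump the aggregated container logs for the failed job
mapred job -logs job_1517228278761_22227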


#9

Hi,
An export can fail for any of the following reasons. Check the file content to see if there are unnecessary blank rows.
http://sqoop.apache.org/docs/1.4.4/SqoopUserGuide.html#_exports_and_transactions

- Loss of connectivity from the Hadoop cluster to the database (either due to hardware fault, or server software crashes)
- Attempting to INSERT a row which violates a consistency constraint (for example, inserting a duplicate primary key value)
- Attempting to parse an incomplete or malformed record from the HDFS source data
- Attempting to parse records using incorrect delimiters
- Capacity issues (such as insufficient RAM or disk space)

I followed the steps below and it worked for me.

Create a departments.txt file and add the content.

[root@quickstart spark]# hadoop fs -mkdir /user/cloudera/problem12/departments_new/
[root@quickstart spark]# hadoop fs -put departments.txt /user/cloudera/problem12/departments_new/
[root@quickstart spark]# hadoop fs -ls /user/cloudera/problem12/departments_new/
Found 1 items
-rw-r--r-- 1 root cloudera 1734 2018-02-28 07:18 /user/cloudera/problem12/departments_new/departments.txt

sqoop export --connect jdbc:mysql://quickstart:3306/retail_db --username retail_dba --password cloudera \
--table departments_new \
--export-dir '/user/cloudera/problem12/departments_new' \
--input-fields-terminated-by ',' \
--input-lines-terminated-by '\n'
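
A quick way to confirm the export landed is a row count on the target table (assuming the mysql client is available, using the same credentials as the sqoop command):

# count the rows that arrived in MySQL after the export
mysql -h quickstart -u retail_dba -pcloudera retail_db -e 'SELECT COUNT(*) FROM departments_new;'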