MapReduce map task gets stuck at 67%



I have created a MapReduce word-count job in Java in Eclipse which reads data (approximately 7,453,215 records, ~670 MB) from SQL Server and stores the result back to SQL Server. I have created an HDInsight cluster on Azure with 2 head nodes and 3 worker nodes; each node has 4 cores and 14 GB RAM. The job runs successfully on my local machine, but when I submit the jar to the HDInsight cluster on Azure, it stops at 67% of the map phase.

Here is the log:

17/12/01 13:23:20 INFO client.AHSProxy: Connecting to Application History server at headnodehost/
17/12/01 13:23:21 INFO client.RequestHedgingRMFailoverProxyProvider: Looking for the active RM in [rm1, rm2]...
17/12/01 13:23:21 INFO client.RequestHedgingRMFailoverProxyProvider: Found active RM [rm2]
17/12/01 13:23:21 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
17/12/01 13:23:36 INFO mapreduce.JobSubmitter: number of splits:2
17/12/01 13:23:37 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1512119994740_0011
17/12/01 13:23:37 INFO impl.YarnClientImpl: Submitted application application_1512119994740_0011
17/12/01 13:23:37 INFO mapreduce.Job: The url to track the job:
17/12/01 13:23:37 INFO mapreduce.Job: Running job: job_1512119994740_0011
17/12/01 13:23:47 INFO mapreduce.Job: Job job_1512119994740_0011 running in uber mode : false

17/12/01 13:23:47 INFO mapreduce.Job: map 0% reduce 0%
17/12/01 13:24:00 INFO mapreduce.Job: map 33% reduce 0%
17/12/01 13:24:06 INFO mapreduce.Job: map 67% reduce 0%
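The JobResourceUploader warning in the log ("Implement the Tool interface and execute your application with ToolRunner") can be addressed with a small driver change. A minimal sketch, assuming the same job setup as in my Main class below; the class name WordCountTool is my own placeholder:

```java
package com.hadoop;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Sketch: a Tool-based driver so that generic options (e.g. -D key=value)
// are parsed by ToolRunner before the job is configured.
public class WordCountTool extends Configured implements Tool {
  @Override
  public int run(String[] args) throws Exception {
    // getConf() carries any -D options parsed by ToolRunner
    Job job = Job.getInstance(getConf());
    job.setJarByClass(WordCountTool.class);
    // ... same input/output setup as in the Main class below ...
    return job.waitForCompletion(true) ? 0 : 1;
  }

  public static void main(String[] args) throws Exception {
    System.exit(ToolRunner.run(new Configuration(), new WordCountTool(), args));
  }
}
```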

And I am getting this error:

2017-12-01 12:56:41,303 INFO [RMCommunicator Allocator] After Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:1 AssignedMaps:0 AssignedReds:0 CompletedMaps:2 CompletedReds:0 ContAlloc:2 ContRel:0 HostLocal:0 RackLocal:0
2017-12-01 12:56:41,304 INFO [AsyncDispatcher event handler] Diagnostics report from attempt_1512119994740_0008_m_000001_0: Container killed by the ApplicationMaster. Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
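Exit code 143 means the container received SIGTERM (128 + 15) from the ApplicationMaster; on YARN this is commonly a container killed for exceeding its memory allowance, though it can also be a pre-emption or a speculative duplicate being cancelled. One thing worth trying, assuming the default container sizes, is raising the map container memory at submission time. Note the `-D` flags are only honoured once the driver goes through ToolRunner, as the warning in the log suggests; the jar name here is a placeholder and the memory values are examples, not recommendations:

```shell
hadoop jar WordCount.jar com.hadoop.Main \
  -D mapreduce.map.memory.mb=3072 \
  -D mapreduce.map.java.opts=-Xmx2400m
```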


When you write from the database to HDFS, does it work?


I am not writing anything to HDFS. I connect to the SQL Server database, process the data, and store the output back in the SQL Server database; all of these steps are performed by a single MapReduce job. I write this MapReduce job in Java in Eclipse, then create a runnable jar and execute it on the HDInsight cluster on Azure.

Following is my Main Class:

package com.hadoop;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.db.DBConfiguration;
import org.apache.hadoop.mapreduce.lib.db.DBOutputFormat;
import org.apache.hadoop.mapreduce.lib.db.DataDrivenDBInputFormat;

public class Main {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Configure the JDBC connection before setting up the job
    // (driver class, URL, user, and password below are placeholders).
    DBConfiguration.configureDB(conf,
        "com.microsoft.sqlserver.jdbc.SQLServerDriver",
        "jdbc:sqlserver://<server>:1433;databaseName=<db>",
        "<user>", "<password>");
    System.out.println("connected to db");

    Job job = Job.getInstance(conf);
    job.setJarByClass(Main.class);
    // Input: rows read with the data-driven query, split on the Id bounds.
    DataDrivenDBInputFormat.setInput(job, DBInputWritable.class,
        "SELECT * FROM Flight WHERE Id IS NOT NULL",
        "SELECT MIN(Id), MAX(Id) FROM Flight");
    System.out.println("execute the query");
    // Mapper/Reducer classes and their setMapperClass/setReducerClass
    // calls are omitted here, as in the original snippet.
    // Output: write (FlightNum, cnt) back to SQL Server
    // (the output table name is a placeholder).
    DBOutputFormat.setOutput(job, "<output_table>",
        new String[] {"FlightNum", "cnt"});
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
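The DBInputWritable class referenced above is not shown; for a DataDrivenDBInputFormat input class it needs to implement both Writable and DBWritable. A minimal sketch, assuming the Flight table exposes Id and FlightNum columns (the field names are guesses from the queries above, not confirmed by the question):

```java
package com.hadoop;

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.lib.db.DBWritable;

// Sketch: value class for rows read from the Flight table.
// Field names are assumptions based on the queries in the question.
public class DBInputWritable implements Writable, DBWritable {
  private int id;
  private String flightNum;

  @Override
  public void readFields(ResultSet rs) throws SQLException {
    // Populate fields from one row of the input query's result set
    id = rs.getInt("Id");
    flightNum = rs.getString("FlightNum");
  }

  @Override
  public void write(PreparedStatement ps) throws SQLException {
    ps.setInt(1, id);
    ps.setString(2, flightNum);
  }

  @Override
  public void readFields(DataInput in) throws IOException {
    id = in.readInt();
    flightNum = in.readUTF();
  }

  @Override
  public void write(DataOutput out) throws IOException {
    out.writeInt(id);
    out.writeUTF(flightNum);
  }

  public String getFlightNum() {
    return flightNum;
  }
}
```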