**FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask**


#1

Hello Everyone

While triggering the insert query below, I am getting the following error message:

insert into table orders_sequence select * from orders_another;

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

My objective here is to enable compression in Hive and check the file size after compression. The problem is that after enabling compression, the insert command no longer works. I have attached a screenshot below, where you can find the detailed steps I triggered to enable compression and the error that occurred after the insert command.
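For reference, to compare file sizes before and after compression, the dfs command can be run from inside the Hive CLI. The path below is my table's location (as shown by describe formatted later in this post) and will differ on other clusters:

```sql
-- Path is from my cluster; substitute your own table location.
dfs -du -h hdfs://storage.castrading.com:9000/home/hduser/hive/datawarehouse/ujjwal.db/orders_sequence;
```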

hive> desc orders_sequence;
OK
order_id string
order_date string
order_customer_id int
order_status varchar(45)
Time taken: 0.115 seconds, Fetched: 4 row(s)

hive> desc orders_another;
OK
order_id string
order_date string
order_customer_id int
order_status varchar(45)
Time taken: 0.113 seconds, Fetched: 4 row(s)
hive>

hive> select * from orders_another limit 5;
OK
1 2013-07-25 00:00:00.0 11599 CLOSED
2 2013-07-25 00:00:00.0 256 PENDING_PAYMENT
3 2013-07-25 00:00:00.0 12111 COMPLETE
4 2013-07-25 00:00:00.0 8827 CLOSED
5 2013-07-25 00:00:00.0 11318 COMPLETE
Time taken: 0.223 seconds, Fetched: 5 row(s)
hive>

Note: The issue occurs after the commands below are triggered, where I simply enabled MapReduce output compression, changed the codec to Snappy, and enabled compression. After this, when I run the insert command shown below, it fails with the FAILED error highlighted with a red circle.

hive> SET mapreduce.output.fileoutputformat.compress.codec;
mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.DefaultCodec
hive> SET mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec;
hive> SET mapreduce.output.fileoutputformat.compress.codec;
mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec
hive> set hive.exec.compress.output=true;
hive> set hive.exec.compress.output;
hive.exec.compress.output=true
hive> insert into table orders_sequence
> select * from orders_another;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = hduser_20170910145414_73c534b0-c49d-43d9-940a-92e64d5caf5c
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Job running in-process (local Hadoop)
2017-09-10 14:54:16,223 Stage-1 map = 0%, reduce = 0%
Ended Job = job_local221817572_0002 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec

I also checked the following settings in Hive:

hive> set mapred.reduce.tasks;
mapred.reduce.tasks=-1

hive> set hive.exec.reducers.max;
hive.exec.reducers.max=1009

hive> set hive.exec.reducers.bytes.per.reducer;
hive.exec.reducers.bytes.per.reducer=256000000

hive> set mapred.tasktracker.reduce.tasks.maximum;
mapred.tasktracker.reduce.tasks.maximum=2
hive>

hive> set hive.auto.convert.join;
hive.auto.convert.join=true

hive> set hive.auto.convert.join=false;

hive> set hive.auto.convert.join;
hive.auto.convert.join=false

hive> set hive.auto.convert.join=true;

hive> set hive.exec.dynamic.partition;
hive.exec.dynamic.partition=true

hive> set hive.exec.dynamic.partition.mode;
hive.exec.dynamic.partition.mode=strict

hive> set hive.exec.dynamic.partition.mode=nonstrict;

hive> set hive.exec.dynamic.partition.mode;
hive.exec.dynamic.partition.mode=nonstrict
hive>
Any help will be highly appreciated.


#2

Can you check and post the log of the failed job here?


#3

Hi Shivendra
There is no log file created for the command/job mentioned below; I could not find a log path after triggering it. Or do you mean job_local221817572_0002? Please advise if I am mistaken.

hive> insert into table orders_sequence
    > select * from orders_another;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = hduser_20170910145414_73c534b0-c49d-43d9-940a-92e64d5caf5c
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Job running in-process (local Hadoop)
2017-09-10 14:54:16,223 Stage-1 map = 0%, reduce = 0%
Ended Job = job_local221817572_0002 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec


#5

It worked for me. I set the parameters below:
hive.exec.compress.output=false
hive.exec.orc.default.compress=SNAPPY

The parameter mapreduce.output.fileoutputformat.compress.codec should be used for real MapReduce jobs; in this case it is a Hive job.
The parameter mapreduce.output.fileoutputformat.compress.codec in MapReduce is equivalent to hive.exec.orc.default.compress in Hive (please correct me if I am wrong).

I created the table below:
create external table products_snappy (
product_id int,
product_category_id int,
product_name string,
product_description string,
product_price double,
product_image string)
row format delimited fields terminated by ','
location 'hdfs://nn01.itversity.com:8020/user/USER_ID/sqoop/products_snappy';

insert into table products_snappy select * from products_ext;

dfs -ls hdfs://nn01.itversity.com:8020/user/USER_ID/sqoop/products_snappy;
-rwxr-xr-x 3 USER_ID hdfs 27957 2017-09-11 11:48 hdfs://nn01.itversity.com:8020/user

dfs -ls hdfs://nn01.itversity.com:8020/user/USER_ID/sqoop/products;
-rw-r--r-- 3 USER_ID hdfs 173993 2017-09-11 11:13 hdfs://nn01.itversity.com:8020/user
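Note that hive.exec.orc.default.compress only applies to ORC tables, so for a table stored as ORC the codec can also be requested explicitly per table. A sketch, with an illustrative table name and the same column list as above:

```sql
-- Sketch: ORC table with Snappy compression requested per table.
-- Table name is illustrative; products_ext is the source table used above.
create table products_orc_snappy (
  product_id int,
  product_category_id int,
  product_name string,
  product_description string,
  product_price double,
  product_image string)
stored as orc
tblproperties ("orc.compress"="SNAPPY");

insert into table products_orc_snappy select * from products_ext;
```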


#4

Yes. You can go to the JobTracker to check the log of the last-run job. It should be under the job logs.
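Since the job ID above starts with job_local, the query ran with the local job runner, so it may not appear in the JobTracker UI; in that case the Hive client log on the local machine is the place to look. A sketch, assuming the default log4j settings (hive.log.dir may differ in your hive-log4j.properties):

```shell
# Default Hive CLI log file; the location is configurable via hive.log.dir
tail -n 100 /tmp/$USER/hive.log

# Alternatively, re-run the query with logging sent to the console
hive --hiveconf hive.root.logger=INFO,console \
     -e "insert into table orders_sequence select * from orders_another;"
```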


#6

Hi Shivendra,

If you check this URL, Durga has explained compression in this video. You can go directly to 50:10.

From the video, what I understood is that in order to enable compression, I have to set hive.exec.compress.output to true and also change the default codec to SnappyCodec.

So, as suggested in the video, I created a table as follows:

hive> create table orders_sequence (
order_id string,
order_date string,
order_customer_id int,
order_status varchar(45))
row format delimited fields terminated by '|'
stored as sequencefile;
OK
Time taken: 1.286 seconds
hive>

And then I changed the default codec and enabled compressed output as follows:

hive> SET mapreduce.output.fileoutputformat.compress.codec;
mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.DefaultCodec

As you can observe above, it is showing DefaultCodec; if you want to change it to SnappyCodec, you can do so in the following way:
hive> SET mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec;

hive> SET mapreduce.output.fileoutputformat.compress.codec;
mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec
hive>

hive> set hive.exec.compress.output=true;
hive> set hive.exec.compress.output;

And then I inserted the records into the table as follows:

hive> insert into table orders_sequence
    > select * from orders_another;

So, with compression enabled, the table and file details are as follows:

hive> describe formatted orders_sequence;
OK

col_name data_type comment

order_id string
order_date string
order_customer_id int
order_status varchar(45)

Detailed Table Information

Database: ujjwal
Owner: hduser
CreateTime: Mon Sep 11 15:00:43 SGT 2017
LastAccessTime: UNKNOWN
Retention: 0
Location: hdfs://storage.castrading.com:9000/home/hduser/hive/datawarehouse/ujjwal.db/orders_sequence
Table Type: MANAGED_TABLE
Table Parameters:
COLUMN_STATS_ACCURATE {"BASIC_STATS":"true"}
numFiles 1
numRows 68883
org.compress SNAPPY
rawDataSize 2931061
totalSize 3864387
transient_lastDdlTime 1505186756

Storage Information

SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat: org.apache.hadoop.mapred.SequenceFileInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
Compressed: No
Num Buckets: -1
Bucket Columns: []
Sort Columns: []
Storage Desc Params:
field.delim |
serialization.format |
Time taken: 0.335 seconds, Fetched: 35 row(s)
hive>

But only after setting hive.exec.compress.output back to FALSE, when I re-run the following insert command, I again get FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. What could be the reason?

hive> insert into table orders_sequence select * from orders_another;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = hduser_20170910145414_73c534b0-c49d-43d9-940a-92e64d5caf5c
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Job running in-process (local Hadoop)
2017-09-10 14:54:16,223 Stage-1 map = 0%, reduce = 0%
Ended Job = job_local221817572_0002 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec


#7

Hi bro,

I am also facing the same issue. Could you please share the root cause and the solution if you have found it?

Thanks