Data in incorrect format in hive table Avro

hive
avro

#1

I have imported an RDBMS table as avro and created an hive external table on top of that data.
My code looks like this:
sqoop import
–connect jdbc:mysql://nn01.itversity.com:3306/retail_db --username retail_dba --password itversity
–table orders --target-dir /user/phanimandava/orders2
–as-avrodatafile --compress --compression-codec org.apache.hadoop.io.compress.SnappyCodec

hdfs dfs -rm orders.avsc
hdfs dfs -put orders.avsc

CREATE EXTERNAL TABLE ORDERS2 STORED AS AVRO LOCATION ‘/user/phanimandava/orders’ TBLPROPERTIES(‘avro.schema.url’=’/user/phanimandava/orders.avsc’);

It displays BigInt values when tried to retrieve order_date.
When issues describe formatted command, it shows:
order_id int
order_date bigint
order_customer_id int
order_status string

I’m wondering if it is the way how avro data will be imported and how .avsc schema describes the data?

Can anyone please help me out how to read right from avsc and apply correct schema to data.


#2

@itversity @vinodnerella can you please have a look at this one?


#3

@PhaniMandava , i was running through a similar issue today ,

I was trying to create a table on top of orders data stored in avro format itself.

create external table sameerrao20118.order_sqoop
(order_id int,
order_date BIGINT,
order_customer_id int,
order_status string)
stored as avro
location “/user/sameerrao20118/retail_stage.db/orders” ;

It seems Avro file format is strict on data types and uses type conversion in several cases. for eg if you would see the data type of orders from your parquet file isnt the data type of Date ’ Long ’ ?

it seems Long is not recognized by hive /parquet and it converts it to BIGINT , you would notice that other data types have remained unchanged .

Also try creating table on top of it with different data type than the ones menitoned below and you would see the difference.

create external table sameerrao20118.order_sqoop
(order_id int,
order_date BIGINT,
order_customer_id int,
order_status string)
stored as avro
location “/user/sameerrao20118/retail_stage.db/orders” ;