Avro Schema evolution from long to string

#1

Hi,
I have an issue with Avro schema evolution. I have an ‘orders’ Avro table in Hive with the schema below:

{
  "type" : "record",
  "name" : "orders",
  "doc" : "Sqoop import of orders",
  "fields" : [ {
    "name" : "order_id",
    "type" : [ "null", "int" ],
    "default" : null,
    "columnName" : "order_id",
    "sqlType" : "4"
  }, {
    "name" : "order_date",
    "type" : [ "null", "long" ],
    "default" : null,
    "columnName" : "order_date",
    "sqlType" : "93"
  }, {
    "name" : "order_customer_id",
    "type" : [ "null", "double" ],
    "default" : null,
    "columnName" : "order_customer_id",
    "sqlType" : "12"
  }, {
    "name" : "order_status",
    "type" : [ "null", "string" ],
    "default" : null,
    "columnName" : "order_status",
    "sqlType" : "12"
  } ],
  "tableName" : "orders"
}

Now a new file has been added to the existing ‘orders_avro’ Hive table, with the schema below:

{
  "type" : "record",
  "name" : "orders",
  "doc" : "Sqoop import of orders",
  "fields" : [ {
    "name" : "order_id",
    "type" : "int",
    "columnName" : "order_id",
    "sqlType" : "4"
  }, {
    "name" : "order_date",
    "type" : "string",
    "columnName" : "order_date",
    "sqlType" : "12"
  }, {
    "name" : "order_customer_id",
    "type" : "double",
    "columnName" : "order_customer_id",
    "sqlType" : "8"
  }, {
    "name" : "order_status",
    "type" : "string",
    "columnName" : "order_status",
    "sqlType" : "12"
  } ],
  "tableName" : "orders"
}

The key difference between the two schemas is that order_date has changed from ‘long’ to ‘string’. Because of this, I get the following error: “Failed with exception java.io.IOException:org.apache.avro.AvroTypeException: Found string, expecting long”
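
For context, the Avro specification allows only a fixed set of type promotions when data written with one schema is read with another (int to long/float/double, long to float/double, float to double, string to bytes, bytes to string). A change between long and string is not in that set, which would explain the exception in either direction. The following is a minimal sketch in plain Python (standard library only, not the Avro library) that checks two schemas field by field against those rules. The helper names are hypothetical, the schemas are trimmed to name and type, only primitive types are handled, and null branches are ignored for simplicity.

```python
import json

# Promotions permitted by the Avro specification during schema resolution:
# writer type -> reader types it may be read as.
PROMOTIONS = {
    "int": {"long", "float", "double"},
    "long": {"float", "double"},
    "float": {"double"},
    "string": {"bytes"},
    "bytes": {"string"},
}


def branches(avro_type):
    """Return the non-null primitive branches of a field type as a set."""
    if isinstance(avro_type, list):
        return {t for t in avro_type if t != "null"}
    return {avro_type}


def readable_as(writer_type, reader_type):
    """True if every writer branch matches or promotes to a reader branch."""
    readers = branches(reader_type)
    return all(
        w in readers or bool(PROMOTIONS.get(w, set()) & readers)
        for w in branches(writer_type)
    )


def incompatible_fields(writer_schema, reader_schema):
    """Names of fields whose written type cannot be read with the reader type."""
    reader_types = {f["name"]: f["type"] for f in reader_schema["fields"]}
    return [
        f["name"]
        for f in writer_schema["fields"]
        if f["name"] in reader_types
        and not readable_as(f["type"], reader_types[f["name"]])
    ]


# Trimmed copies of the two schemas from this post (name and type only).
old_schema = json.loads("""
{"type": "record", "name": "orders", "fields": [
  {"name": "order_id", "type": ["null", "int"]},
  {"name": "order_date", "type": ["null", "long"]},
  {"name": "order_customer_id", "type": ["null", "double"]},
  {"name": "order_status", "type": ["null", "string"]}]}
""")
new_schema = json.loads("""
{"type": "record", "name": "orders", "fields": [
  {"name": "order_id", "type": "int"},
  {"name": "order_date", "type": "string"},
  {"name": "order_customer_id", "type": "double"},
  {"name": "order_status", "type": "string"}]}
""")

# New file (writer) read with the old table schema (reader), and the reverse.
print(incompatible_fields(new_schema, old_schema))  # ['order_date']
print(incompatible_fields(old_schema, new_schema))  # ['order_date']
```

Both directions flag only order_date, matching “Found string, expecting long” when the new file is read with the old schema.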

Can anybody help me resolve this issue? @pramodvspk, I hope you can help with this.

Thanks in advance.

#2

What does ‘desc orders’ show for the order_date column?
Can you reload the table definition with the schema change, or do you have files with different Avro schemas being loaded at the same time?

#3

Hey @mak21008,

This is the schema of the orders table:

order_id int
order_date bigint
order_customer_id double
order_status string

I reloaded the table with the type of ‘order_date’ changed to ‘string’, and it now reads the new file without any issues. But when I try to read rows from the ‘orders’ table with order_id between 1 and 100, i.e. the first records of the table, it shows the error “Failed with exception java.io.IOException:org.apache.avro.AvroTypeException: Found long, expecting string”

In the initial load the ‘order_date’ column is ‘long’, and in the new file it is ‘string’, so my requirement is to evolve the Avro schema so that the table can read both data types smoothly.
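
One direction that might be worth testing is a reader schema in which order_date is a union that accepts both types. Under the Avro resolution rules, a reader union is matched against the writer's type, so [ "null", "long", "string" ] should resolve for files written with either schema. Whether Hive's AvroSerDe handles such a column the way I need is an assumption I have not verified (a union beyond [ "null", T ] may surface in Hive as a uniontype rather than a plain column). A minimal Python sketch that derives such a schema, with a hypothetical helper name and a trimmed schema:

```python
import json


def widen_field(schema, field_name, extra_type):
    """Return a copy of schema in which field_name also accepts extra_type."""
    widened = json.loads(json.dumps(schema))  # deep copy via JSON round trip
    for field in widened["fields"]:
        if field["name"] == field_name:
            current = field["type"]
            # Normalise a bare type such as "long" into a one-branch list.
            union = list(current) if isinstance(current, list) else [current]
            if extra_type not in union:
                union.append(extra_type)
            field["type"] = union
    return widened


# Trimmed version of the original table schema (two fields only).
original = {
    "type": "record",
    "name": "orders",
    "fields": [
        {"name": "order_id", "type": ["null", "int"], "default": None},
        {"name": "order_date", "type": ["null", "long"], "default": None},
    ],
}

reader_schema = widen_field(original, "order_date", "string")
print(json.dumps(reader_schema["fields"][1]))
```

Keeping "null" as the first branch means the existing null default stays valid for the widened field.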

Let me know if any additional details are required to solve this issue.

#4

You can look here and do some searching. I have seen a similar exception, but for a different error.
https://community.hortonworks.com/questions/10120/can-hive-avro-tables-support-changing-schemas.html

#5

Hi @ravi.tejarockon, I am not sure about the cause of the error. I will try to replicate it and let you know.
