Creating a partitioned Avro Hive table with and without a schema file

Hi @itversity Sir,

Can you please confirm whether the method below for creating a dynamically partitioned Hive table in Avro file format is correct and acceptable for the CCA175 certification?

Creating a partitioned Hive table in Avro format without a schema file:

create table orders_test_avro(order_id int, order_date string, customer_id int) PARTITIONED BY (order_status string) stored as avro;

Loading the data with dynamic partitioning:

insert into orders_test_avro PARTITION(order_status) select * from orders;

The data loaded into the partitions successfully, and I can query it without any issues. The data is stored in Avro format.
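
For reference, here is a minimal sketch of the same load with the dynamic-partition settings made explicit and the columns listed by name instead of `select *`. Hive takes the dynamic partition value from the last column of the select list, so naming the columns avoids depending on the column order of the source table. The column names of `orders` used here (`order_id`, `order_date`, `order_customer_id`, `order_status`) are an assumption and may need adjusting.

```sql
-- Enable dynamic partitioning for this session (standard Hive properties).
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

-- Source column names are assumed; adjust them to match the `orders` table.
-- The partition column (order_status) must be the last column selected.
insert into table orders_test_avro partition (order_status)
select order_id, order_date, order_customer_id, order_status
from orders;

-- List the partitions that the insert created.
show partitions orders_test_avro;
```

If the session is left in strict mode, Hive rejects an insert in which every partition column is dynamic, which is why the second `set` is included.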

P.S.: The ‘# Storage Information’ section of the table description is the same for both tables, including the SerDe information.

Please find the formatted descriptions of both tables in the next post.

(Continued from the post above)
Table created with my own procedure (no schema file):
hive> desc formatted orders_test_avro;
OK

# col_name              data_type               comment

order_id                int
order_date              string
customer_id             int

# Partition Information
# col_name              data_type               comment

order_status            string

# Detailed Table Information
Database:               retaildb_org
Owner:                  c968cff1ae084ddb5c658b2ce83350
CreateTime:             Fri Feb 03 07:24:27 UTC 2017
LastAccessTime:         UNKNOWN
Protect Mode:           None
Retention:              0
Location:               hdfs://ip-172-31-53-48.ec2.internal:8020/apps/hive/warehouse/retaildb_org.db/orders_test_avro
Table Type:             MANAGED_TABLE
Table Parameters:
        transient_lastDdlTime   1486106667

# Storage Information
SerDe Library:          org.apache.hadoop.hive.serde2.avro.AvroSerDe
InputFormat:            org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat
OutputFormat:           org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
Compressed:             No
Num Buckets:            -1
Bucket Columns:         []
Sort Columns:           []
Storage Desc Params:
        serialization.format    1
Time taken: 0.07 seconds, Fetched: 33 row(s)

Dynamically partitioned table in Avro format, created using your GitHub code:

hive> desc formatted orders_avro_part;
OK

# col_name              data_type               comment

order_id                int
order_date              bigint
order_customer_id       int
order_status            string

# Partition Information
# col_name              data_type               comment

order_month             string

# Detailed Table Information
Database:               retaildb_org
Owner:                  c968cff1ae084ddb5c658b2ce83350
CreateTime:             Fri Feb 03 05:42:14 UTC 2017
LastAccessTime:         UNKNOWN
Protect Mode:           None
Retention:              0
Location:               hdfs://ip-172-31-53-48.ec2.internal:8020/apps/hive/warehouse/retaildb_org.db/orders_avro_part
Table Type:             MANAGED_TABLE
Table Parameters:
        avro.schema.url         itversity/avro_files/orders.avsc
        transient_lastDdlTime   1486100534

# Storage Information
SerDe Library:          org.apache.hadoop.hive.serde2.avro.AvroSerDe
InputFormat:            org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat
OutputFormat:           org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
Compressed:             No
Num Buckets:            -1
Bucket Columns:         []
Sort Columns:           []
Storage Desc Params:
        serialization.format    1
Time taken: 0.119 seconds, Fetched: 35 row(s)
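
Apart from the columns and the partition key, the only difference between the two descriptions is the `avro.schema.url` entry under Table Parameters. A minimal sketch of DDL that would give a table with that property is shown here, assuming the schema path exactly as it appears in the description above. It is not necessarily the statement used in the GitHub code.

```sql
-- Sketch only: no column list is given because the columns are read from
-- the Avro schema file referenced by avro.schema.url.
-- The path is copied from the table description; it may need to be a fully
-- qualified HDFS URI in another environment.
create table orders_avro_part
partitioned by (order_month string)
stored as avro
tblproperties ('avro.schema.url' = 'itversity/avro_files/orders.avsc');
```

With this form the column names and types follow the .avsc file, whereas in the first table they come from the column list in the `create table` statement.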