Json file loading to Hive

I have a JSON file in the below format and need to load data into HIVE:

{
“sourceFiles”: [
{
“compressType”: “”,
“ingestMetadata”: “xxxxxxxxxxxxxx.xml”,
“fileCompressed”: “n”
},
{
“compressType”: “”,
“ingestMetadata”: “xxxxxxxxxxxxxx.xml”,
“fileCompressed”: “n”
}

]

}

Hi,

Do you mean to say the JSON file represents the schema of your data and you are planning to create an avro table in hive and load data to that table?

Hi Santy,

My source file is in JSON format and I need to create table out of this - stored as clause can be anything … Let’s say text file.
I need to extract the headers and load data. For further analysis.

Please understand the data is not in single line, if so we can load complete data into a string and then use lateral view to extract / provide structure . This is in multi line format.

Thank you for your response - please let me know if this is not informative / confusing - i can provide more information.

@ramesh.devarakonda
You can create a Hive Table as

CREATE TABLE sourcefiles
(
compresstype STRING,
ingestmetadata STRING,
filecompressed STRING
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe’
STORED AS TEXTFILE;

you can load the source .json by using the way you like i.e external table or load data

please make sure that HIVE Table column names are matching exactly with the .json file

In your case, the input .json has Array Of (Structures) so you also need to create a Hive table with that structure.

Hello Ramesh,
I suggest you to forget about the .json file and use avro data and get them to local directory and by using avro -tools extract the schema out of it and and then place the avro schema back into HDFS and then create table;
CREATE EXTERNAL TABLE xxxxxxxx
LOCATION 'hdfs:///user/hive/warehouse/------------'
TBLPROPERTIES (‘avro.schema.url’=‘hdfs://quickstart.cloudera/user/cloudera/file.avsc’);

The client supplied file is in .JSON file, this is a marker file generated by a third party tool.
Not sure how can I ignore the JSON and switch to Avro ??
Is there a way to convert JSON to Avro ??

@email2dgk,

could you please give some syntax help?
Are you referring to hive struct < Array >,?

I have given file sample , if you can give me some syntax help- it would be great I am trying but not sure if I am on monotonous lines may be… Mean while I will check your solution steps too

Hello ramesh, you should always used to check what kind of data they gave as a part of create table command.if the data is in avro format while creating a table you should specifie as stored as avro if you say as text file for avro then you will be getting null values in the data and also you should check the delimiters in the data mostly they will specifie if they dont specifie just use cat command or tail command on that particular data and look at the delimiters.So your solution should be like this
create external table()
row format delimited fields terminated by ‘,’ —assuming data is delimited by ,
stored as avrofile — assuming the data in avro format
LOCATION 'hdfs:///user/hive/warehouse/------------'
TBLPROPERTIES (‘avro.schema.url’=‘hdfs://quickstart.cloudera/user/cloudera/file.avsc’);

Ramesh / Sowmya

have you been able to load json data to hive ?

JSON data loading using Hive / Scala is best use case to learn. Could you please post how to steps ?

-Srinivas

@Srinivas Follow the below link…you will get a clear idea on it.


You need to add jsonserde jar to hive inorder to work on it.

I’m trying this json file to load in hive to practce.
Can you tell me how to add jar file to use JSONSerde.

@itversity