Load .DAT file into HDFS


#1

Hi @dgadiraju sir,

    I want to upload a .DAT file into HDFS and Hive. How can this be done? Could you please advise?

Thanks
Ramkumar V


#2

Your question is abstract.

You could put any type of file in HDFS using the following command:

hdfs dfs -copyFromLocal <your_local_path> <hdfspath>

However, if you want to load the contents of the .dat file into Hive, you first need to know the type and format of the data in the .dat file.
If the data is plain text with fields separated by commas, you can create a Hive table matching the schema of the data and use the following command to load the data into the Hive table:

LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)]

LOCAL (written without the brackets, which only mark it as optional) is required when the data file is on your local filesystem rather than in HDFS.
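
As a rough sketch of the comma-separated case: the table name `sample_dat`, its three columns, and the HDFS path below are all placeholders I made up for illustration, so adjust them to match your actual data.

```sql
-- Hypothetical table: column names and types are assumptions, not taken from the real file.
CREATE TABLE sample_dat (
  id     INT,
  name   STRING,
  amount DOUBLE
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- The file is assumed to be in HDFS already (hypothetical path), so LOCAL is omitted.
-- Without LOCAL, Hive moves the file from this path into the table's directory.
LOAD DATA INPATH '/user/hypothetical/input/sample.dat' INTO TABLE sample_dat;
```

If the file is still on the local machine, the same statement with `LOAD DATA LOCAL INPATH` and a local path should copy it in instead.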


#3

Thanks gurpreet,

   This is not generic textual data. If it were generic text separated by commas, we could move it to Hadoop directly. I have attached the .dat file. Could you please advise how we can load it? Thanks in advance.


#5

For that, you need to understand and use regular expressions:

Your .dat file is separated by special control characters such as ETX, SOH, EOT, ACK, BS, etc. You will find the ASCII codes for these characters at the following link:

https://ascii.cl/

Using \u or \x escapes in a regular expression, you can write a pattern that matches these characters and pass it through SERDEPROPERTIES in the Hive CREATE TABLE statement.

https://community.hortonworks.com/articles/58591/using-regular-expressions-to-extract-fields-for-hi.html
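
A minimal sketch of what that could look like, assuming (purely for illustration) that each line holds three fields separated by the ETX character (ASCII code 3). The table name, column names, and field count are hypothetical, so check the actual bytes in your file before adapting it.

```sql
-- Hypothetical table: three STRING columns, one per capture group in the regex.
-- All columns are kept as STRING here; cast them in later queries if needed.
CREATE TABLE dat_regex (
  field1 STRING,
  field2 STRING,
  field3 STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  -- Backslashes are doubled so the regex engine receives \x03 (ETX).
  "input.regex" = "([^\\x03]*)\\x03([^\\x03]*)\\x03([^\\x03]*)"
)
STORED AS TEXTFILE;

-- If every field is separated by the same single character, a regex may not be
-- needed at all. An octal escape in a delimited row format can be enough:
--   ROW FORMAT DELIMITED FIELDS TERMINATED BY '\003'
```

The regex has to match the whole line, so rows that do not fit the pattern usually come back as NULL columns. That makes a quick `SELECT * ... LIMIT 10` a useful check after loading.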


#6

Thank you very much, gurpreet.