Avro data file extension is not avsc

I have followed the below video

  1. In the video, the Avro data file has the .avsc extension, but when I tried it the file was created with the .avro extension. I also checked http://www.cloudera.com/documentation/enterprise/latest/topics/cdh_ig_avro_usage.html, which states that the file extension should be .avro.

  2. The video says that once the avrodatafile import is done, we can see an output file on the local machine, but I couldn’t find any such file. Even if a new file is generated on the local file system, how can it be generated there without writing any code for it?

I am practicing on Cloudera Manager.

Can anyone please answer these questions?

Thank you

The data of the table you are importing will be in files with the .avro extension, at the location you specified in your import statement.
The .avsc file is the schema file for that table. It will be in your home directory, or in the directory from which you imported the table into HDFS.

Thank you N_Chakote. The avrodatafile extension is clear to me now, but I still don’t understand one thing about .avsc. I searched my local file system but couldn’t find a file of that kind. If you don’t mind, could you please explain it a little more?

Thank you,
Chandana

@chandana204 Most likely it will be in the directory from which you executed the sqoop script.

@chandana204 The .avsc file holds the underlying schema of the table. It will not be present in the LFS (Local File System) but in HDFS. You can get the .avsc file location by running "desc formatted Your_Database_name.Your_Table_Name" in hive or beeline. Hope this helps!

@ashok_singamaneni The question here is that @chandana204 is unable to find the .avsc file generated by the sqoop import.

Thank you all for the responses.

If I imported the file into HDFS, how will I get the .avsc file in hive/beeline? Sorry to keep asking the same question again and again.
Below is the exact process I followed to find the .avsc file.

sqoop import --connect "jdbc:mysql://quickstart.cloudera/retail_db" --username retail_dba --password cloudera --table MyFamily --target-dir /user/cloudera/MyFamily_Avrodatafile --as-avrodatafile --num-mappers 1 --outdir java_files

I found the output file under /user/cloudera/MyFamily_Avrodatafile/part-m-00000.avro

I can get up to this point, but I’m not sure where to find the .avsc file.
Could you please tell me where, given the above command, I can find the .avsc file?

Thank you,
Chandana

@chandana204: Sorry for the confusion in my previous answer. Can you tell us from which local directory you ran the sqoop import command? You will find the .avsc file in that same directory. After the sqoop import finishes, just run ls -ltr in that directory and you will find the .avsc file.

I understand that point in theory, but in practice I am not sure.
Could you please look at my previous comment? I described the procedure I followed there. Please clarify my doubt with respect to the command I posted in it.

@chandana204 Before submitting the sqoop import, please execute the command "pwd" and run your sqoop import from that same location. You will then see a file with the .avsc extension in that directory. The .avro files will be imported into HDFS under the target directory. Please note that the .avsc file will not be moved into HDFS, so you have to copy it there manually.
If you want to extract an Avro schema from the .avro data file itself, you can copy the file from HDFS to the local file system and run avro-tools on it.
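
For reference, a minimal sketch of the avro-tools route. It assumes the hdfs client and an avro-tools wrapper script are both on the PATH (on some installs avro-tools is only shipped as a jar and has to be run with java -jar instead). The function name extract_avsc is just illustrative; the HDFS path in the example call is the one from the import command in this thread.

```shell
# Copy one Avro data file out of HDFS and write its embedded schema as JSON.
#   $1: HDFS path of the .avro data file
#   $2: local file to write the schema (.avsc) to
# Assumes `hdfs` and `avro-tools` are available on the PATH.
extract_avsc() {
    hdfs_file=$1
    schema_out=$2
    workdir=$(mktemp -d) || return 1
    local_copy="$workdir/$(basename "$hdfs_file")"

    # Fetch the data file, then print the schema stored in its header.
    hdfs dfs -get "$hdfs_file" "$local_copy" &&
        avro-tools getschema "$local_copy" > "$schema_out"
    status=$?

    rm -rf "$workdir"
    return "$status"
}

# Example call (not executed here):
# extract_avsc /user/cloudera/MyFamily_Avrodatafile/part-m-00000.avro MyFamily.avsc
```

This works because every Avro data file carries its schema in the file header, so the schema can be recovered even when the separate .avsc file cannot be found.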

@chandana204: I think the location where the .avsc file is created doesn’t depend on the command you ran, but on the path from which you executed the sqoop command.

OK. My pwd location is /home/cloudera and I am running the sqoop import from that same location. I searched there as well, but no such file shows up.

@chandana204 :slight_smile: I am not sure, but try running the sqoop import as the root user. Let’s see.

Same as before. :confused:

@chandana204 You know what… there is a similar post: ".avsc file not found after sqoop import using --as-avrodatafile"

@chandana204,
Also refer to this link: https://community.cloudera.com/t5/Hadoop-101-Training-Quickstart/Error-in-exercises-1-and-2-No-avsc-generated-and-Path-does-not/td-p/35121

I have not faced this problem.
Also, is the MyFamily table you mentioned a table in retail_db? I haven’t come across it.

If it is a valid table, then try it this way:
Create another directory, say "sqoop_import".
cd into sqoop_import; your working directory should now be /home/cloudera/sqoop_import.
Try running your sqoop import from this directory. The .avsc file should be present there.

:joy: At last I found the .avsc file under /tmp/sqoop-cloudera/compile. However, the file is named after the input file name.
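
For anyone landing here with the same problem, the candidate locations that came up in this thread can be checked in one pass. This is only a sketch: find_avsc is a made-up helper name, and which directory actually receives the schema file may differ between Sqoop versions and setups.

```shell
# List every .avsc file found under the given directories (searched recursively).
# Directories that do not exist are skipped silently.
find_avsc() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            find "$dir" -type f -name '*.avsc' 2>/dev/null
        fi
    done
}

# Example call with the locations discussed in this thread (not executed here):
# the directory the import was run from, the --outdir value, and the compile dir.
# find_avsc /home/cloudera /home/cloudera/java_files /tmp/sqoop-cloudera/compile
```

The search is recursive, so a schema file sitting in a subdirectory of the compile directory is found as well.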

@Avinash_Parida I created the MyFamily table in retail_db myself.

Thank you all for your great support.