Why does Sqoop import create .avsc and .java files?

Hi Friends,

When I run sqoop import-all-tables with the Avro file format, it creates .avsc and .java files. What is the purpose of these files, and why are they written to the local file system instead of HDFS? Please explain.


During a Sqoop import in Avro format, three types of files are created:

  1. .avsc - the Avro schema file, which describes the record structure used for the import in Avro format
  2. .java - the generated Java source for the record class that Sqoop compiles and runs in the background as part of the import job
  3. .avro - the actual Avro data file
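
To make the .avsc item above more concrete: an Avro schema file is plain JSON, so it can be inspected with any JSON parser. The sketch below uses a made-up schema for a hypothetical `employees` table (the table name, field names, and types are assumptions for illustration, not output from a real import) and lists the record name and its fields.

```python
import json

# Hypothetical contents of an employees.avsc file. A real file generated by
# Sqoop depends on the columns of the source table.
AVSC_TEXT = """
{
  "type": "record",
  "name": "employees",
  "fields": [
    {"name": "id", "type": ["null", "int"], "default": null},
    {"name": "name", "type": ["null", "string"], "default": null}
  ]
}
"""


def describe_schema(avsc_text):
    """Return the record name and the list of field names from an Avro schema."""
    schema = json.loads(avsc_text)
    return schema["name"], [field["name"] for field in schema["fields"]]


record_name, field_names = describe_schema(AVSC_TEXT)
print(record_name, field_names)
```

Nothing in this file is table data. It only describes the shape of each record, which is why it sits alongside the generated code rather than with the imported files.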

The .avro files contain the imported data and are stored in the HDFS location specified by --warehouse-dir. The .avsc and .java files, however, are written to the local file system of the machine where Sqoop runs rather than to HDFS. This is because they are tools Sqoop needs to perform the import, not the imported data itself: Sqoop uses the schema file to write the data in Avro format and the Java class to do the actual import. Only the imported data is stored in the HDFS location.
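
As a rough sketch of how the two locations are controlled, the snippet below assembles a sqoop import-all-tables command line in Python without running it. The JDBC URL, user name, and directory paths are placeholders. The flags `--as-avrodatafile`, `--warehouse-dir`, `--outdir`, and `--bindir` are standard Sqoop 1 options. My understanding is that in Sqoop 1.4.x the .avsc file ends up next to the generated .java file in the `--outdir` directory, but treat that as an assumption and check it on your version.

```python
import shlex


def build_import_command(jdbc_url, username, warehouse_dir, local_codegen_dir):
    """Assemble (but do not run) a sqoop import-all-tables command line."""
    return [
        "sqoop", "import-all-tables",
        "--connect", jdbc_url,
        "--username", username,
        "-P",                              # prompt for the password
        "--as-avrodatafile",               # write the data as Avro files
        "--warehouse-dir", warehouse_dir,  # HDFS parent directory for the data
        "--outdir", local_codegen_dir,     # local directory for generated source
        "--bindir", local_codegen_dir,     # local directory for compiled classes
    ]


# All values below are placeholders, not settings from a real cluster.
cmd = build_import_command(
    "jdbc:mysql://dbhost/sales",
    "sqoop_user",
    "/user/hive/warehouse/sales",
    "/tmp/sqoop-codegen",
)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Pointing `--outdir` and `--bindir` at a scratch directory keeps the generated files out of the current working directory, which is where they are written by default.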