Extraneous Information - CCA Question



I am working on a question where I need to import certain columns from a table into HDFS.

The table in question is in MySQL. However, the question begins with:

"Data is available on local file system /data/…" etc., with no mention of MySQL, only the specifications of the output, such as compression and fields-terminated-by.

Is it OK to just use a Sqoop import to solve this and ignore the extraneous information about the data in the local directory?



If the table already exists in MySQL, you can use sqoop import directly. If it does not, export the data from HDFS to MySQL first, and then import it back into HDFS with the required output format.
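
For the direct-import case, a minimal sketch of what the command could look like is below. Every value in it is a placeholder assumed for illustration (the JDBC URL, user, table, column list, and target directory are hypothetical), and the tab delimiter and Snappy codec are only guesses at what the question might ask for, so substitute whatever the question actually specifies. The script prints the command for review and only runs it if RUN_SQOOP=1 is set.

```shell
# Hypothetical values: none of these come from the original question.
DB_URL="jdbc:mysql://localhost/exampledb"   # assumed JDBC connection string
DB_USER="example_user"                      # assumed MySQL user
TABLE="example_table"                       # assumed source table
COLUMNS="col_a,col_b"                       # assumed columns to import
TARGET_DIR="/user/example/output"           # assumed HDFS output directory

# Build the sqoop argument list as positional parameters.
# -P prompts for the password instead of putting it on the command line.
# The delimiter and codec are assumptions; use what the question requires.
set -- import \
  --connect "$DB_URL" \
  --username "$DB_USER" \
  -P \
  --table "$TABLE" \
  --columns "$COLUMNS" \
  --target-dir "$TARGET_DIR" \
  --fields-terminated-by '\t' \
  --compress \
  --compression-codec org.apache.hadoop.io.compress.SnappyCodec \
  --num-mappers 1

# Print the command so it can be checked before running.
printf 'sqoop'
printf ' %s' "$@"
printf '\n'

# Run it only when explicitly requested, on a host where sqoop is installed.
if [ "${RUN_SQOOP:-0}" = "1" ]; then
  sqoop "$@"
fi
```

If the table is not in MySQL yet, the local files would first have to be copied into HDFS (for example with hdfs dfs -put) and loaded with sqoop export before an import like this one could run.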