Let us understand the execution flow of a Sqoop import.
- Get the metadata of the table by running a simple query.
- Build the POJO class with appropriate getters and setters.
- Compile the POJO class into a JAR file.
- Run the boundary values query (or a custom boundary query) to get the minimum and maximum values of the split column (by default, the primary key column).
- Compute the range of the split column: max - min.
- Divide the range by the number of mappers to get the split size, then compute the split boundaries.
- Submit a MapReduce job. The number of mappers is 4 by default.
- Each map task runs a SELECT query on the source table, with a WHERE condition based on its split, to read the data.
- The data is written to files in the specified target location.
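
The split computation in the steps above can be sketched in a few lines of Python. This is only an illustration for an integer split column, not Sqoop's actual implementation: the function names `compute_splits` and `where_clauses` and the column name `id` are hypothetical, and the exact boundaries Sqoop generates may differ.

```python
def compute_splits(min_val, max_val, num_mappers=4):
    """Divide the range [min_val, max_val] into contiguous (lo, hi) splits.

    Simplified sketch for an integer split column; not Sqoop's own code.
    """
    if num_mappers < 1:
        raise ValueError("num_mappers must be at least 1")
    if max_val < min_val:
        raise ValueError("max_val must not be smaller than min_val")

    span = max_val - min_val  # the range: max - min
    # Never create more splits than the range can support.
    n = min(num_mappers, max(span, 1))
    # Lower bound of each split, followed by the overall maximum.
    bounds = [min_val + (span * i) // n for i in range(n)] + [max_val]
    return list(zip(bounds[:-1], bounds[1:]))


def where_clauses(column, splits):
    """Build one WHERE condition per split (one per map task)."""
    clauses = []
    last = len(splits) - 1
    for i, (lo, hi) in enumerate(splits):
        # Only the last split includes its upper bound, so no row is read twice.
        upper_op = "<=" if i == last else "<"
        clauses.append(f"{column} >= {lo} AND {column} {upper_op} {hi}")
    return clauses


if __name__ == "__main__":
    # Assumed boundary query result: min = 1, max = 100, default 4 mappers.
    for clause in where_clauses("id", compute_splits(1, 100)):
        print(clause)
```

Each printed condition corresponds to the WHERE clause one map task would append to its SELECT query. Using half-open ranges for all but the last split keeps the splits from overlapping while still covering the maximum value.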