Performance issue when importing data from an RDBMS into Hive with Sqoop

Hi everyone,

I am currently facing a performance issue while importing about 20 GB of data from a MySQL server into Hive with Sqoop. The import is very slow: sometimes it runs for 5 to 6 hours, and occasionally for as long as 2 days.
We are using Cloudera on a 4-node cluster, and the cluster size is 16 GB.

Could you please suggest a solution, or tell me where I should look to improve performance when importing data from a regular RDBMS into Hive with Sqoop?

Could you let us know the following details?

  1. How many mappers are you using?
  2. Are any other jobs or apps running in the cluster while this job is running?
  3. Is the data in the MySQL table skewed on the primary key?
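
One rough way to check for skew is to count rows per equal-width range of the candidate split column, since Sqoop by default divides the column's MIN..MAX range evenly among the mappers. The sketch below only builds and prints such a query. The table name `orders`, the column `order_id`, and the host and user in the commented `mysql` call are hypothetical placeholders, not values from this thread.

```shell
#!/usr/bin/env bash
# Hypothetical names: replace with your own table and numeric split column.
TABLE="orders"
SPLIT_COL="order_id"
BUCKETS=4   # use the same value as the number of mappers

# Count rows in each of BUCKETS equal-width ranges between MIN and MAX.
# Very uneven counts suggest that some mappers will do most of the work.
SKEW_SQL="SELECT FLOOR((t.${SPLIT_COL} - b.lo) * ${BUCKETS} / (b.hi - b.lo + 1)) AS bucket,
       COUNT(*) AS row_count
FROM ${TABLE} t,
     (SELECT MIN(${SPLIT_COL}) AS lo, MAX(${SPLIT_COL}) AS hi FROM ${TABLE}) b
GROUP BY bucket
ORDER BY bucket;"

echo "$SKEW_SQL"

# To run it against MySQL (hypothetical host, user, and database):
# mysql -h mysql-host.example.com -u sqoop_user -p sales_db -e "$SKEW_SQL"
```

If one bucket holds most of the rows, a different split column or a custom boundary query is probably worth trying.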

Hi,
Please see the details below:
1. We are using 4 mappers.
2. Yes, mostly Hive, Sqoop, Pig, and MapReduce (mapper and reducer) jobs are running (NameNode and DataNode).
3. No, there is no primary key.

Please suggest where I need to work on this.

  1. Make sure that no other jobs are running while this job runs, so that the maximum resources are allocated to it.
  2. If there is no primary key, which column is the data split by? Use --boundary-query if --split-by alone does not give the desired results.
  3. Use the --direct option, which for MySQL usually gives the best possible transfer speed.
  4. Check that you have good network bandwidth between the MySQL server and the cluster.

Can you please post your Sqoop command?

Are you doing an incremental import or a single-table import?