Sqoop import - split by


#1

Things to remember for Sqoop split by:

  • the column should be indexed; otherwise performance will be poor even for medium-sized tables
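  • values in the field should be sparse, i.e. spread across the column's range rather than clustered, so that each split gets a similar number of rows
  • ideally it is sequence-generated or evenly incremented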
  • it should not have null values (rows with NULL in the split column fall outside every split range and are typically not imported)
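  • data can be split on non-numeric fields such as text, but then we need to pass
    -Dorg.apache.sqoop.splitter.allow_text_splitter=true
    as a generic argument, immediately after "sqoop import" and before the tool-specific options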
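
To see why the distribution of values matters, here is a rough Python sketch of range-based splitting. It assumes a simplified model in which the MIN and MAX of the split column are divided into equal-width ranges, one per mapper; Sqoop's real boundary arithmetic differs in detail, and the function names split_ranges and rows_per_split are made up for this illustration.

```python
def split_ranges(lo, hi, num_splits):
    """Divide the closed integer range [lo, hi] into num_splits contiguous ranges.

    Simplified model of range-based splitting, not Sqoop's actual code.
    """
    size = hi - lo + 1
    # Boundary i sits at lo + i/num_splits of the way through the range.
    bounds = [lo + (size * i) // num_splits for i in range(num_splits + 1)]
    return [(bounds[i], bounds[i + 1] - 1) for i in range(num_splits)]


def rows_per_split(values, ranges):
    """Count how many values land in each range.

    None (standing in for SQL NULL) matches no range, so it is never counted.
    """
    return [
        sum(1 for v in values if v is not None and a <= v <= b)
        for a, b in ranges
    ]


# Sequence-generated ids 1..1000: every split gets the same share.
even_ids = list(range(1, 1001))
print(rows_per_split(even_ids, split_ranges(1, 1000, 4)))
# [250, 250, 250, 250]

# Same row count, but three outliers stretch the range to 4,000,000.
skewed_ids = list(range(1, 998)) + [1_000_000, 2_000_000, 4_000_000]
print(rows_per_split(skewed_ids, split_ranges(1, 4_000_000, 4)))
# [998, 1, 0, 1]
```

In the skewed case one split does almost all the work while the other three sit nearly idle, which is why a sequence-generated or evenly incremented column makes a better split column.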