Incremental update in sqoop

cca-175
hive
sqoop
#1

I am trying to update a Hive table based on the records in a MySQL table.

MySQL table (table name: delimiter_test):

+---------------+-----------------+
| department_id | department_name |
+---------------+-----------------+
|             2 | Fitness         |
|             3 | Footwear        |
|             4 | Apparel         |
|             5 | Golf            |
|             6 | Outdoors        |
|             7 | Fan Shop        |
|             8 | Test            |
+---------------+-----------------+

Hive table (table name: my_test):

2 Fitness
3 Footwear
4 Apparel
5 Golf
6 Outdoors
7 Fan Shop

I am trying to use Sqoop to import the last record in the MySQL table (department_id 8) into the Hive table, using Sqoop's incremental import.

My Sqoop command:

sqoop import --connect "jdbc:mysql://quickstart.cloudera:3306/retail_db" --username xxx --password xxx --table delimiter_test --hive-import --hive-table my_test --split-by department_id --check-column department_id --incremental append --last-value 7

I am not getting any errors, but the extra record from the MySQL table (department_id 8) is imported into the Hive table only when the number of mappers (-m) is 6; when it is 10, the record is imported twice.
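
Since the result changes with the mapper count, one experiment worth trying is the same import with a single mapper (-m 1), given that only one new row is expected. This is a minimal sketch, not a confirmed fix: build_import_cmd is a hypothetical helper name, it only prints the command so it can be reviewed before running, and the connection details are copied from the command above.

```shell
# Hypothetical helper (not part of Sqoop): prints the import command
# for review instead of running it.
# $1 = highest department_id already present in the Hive table.
build_import_cmd() {
  echo sqoop import \
    --connect "jdbc:mysql://quickstart.cloudera:3306/retail_db" \
    --username xxx --password xxx \
    --table delimiter_test \
    --hive-import --hive-table my_test \
    --check-column department_id \
    --incremental append \
    --last-value "$1" \
    -m 1   # assumption: one mapper, so --split-by is not needed
}

build_import_cmd 7
```

If the printed command looks right, it can be pasted into the shell to run it. Comparing the Hive row count before and after would show whether the row still arrives twice.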

Please suggest where I am going wrong.


#2

I have changed the category to Apache Sqoop.

The Certifications category is primarily for logging issues related to certifications in general, such as challenges, fees, and exam structure.
