Sqoop Import Fails on TINYINT(1) Data Type

#1

Hello Friends

I am new to Big Data and Sqoop. I have just started learning this wonderful new technology.

I am getting an error while loading data from MySQL into HDFS. I am trying to move a whole database that contains 20 tables; all of them were imported except one, so I dug into that table. I found that one of its columns has the data type TINYINT(1). Here is the command:

sqoop import \
  --connect jdbc:mysql://localhost.localdomain:3306/databasename \
  --username root \
  -P \
  --table sales_orders

I read the following about TINYINT(1) on the Apache Sqoop website:

MySQL: Import of TINYINT(1) from MySQL behaves strangely

Problem: Sqoop is treating TINYINT(1) columns as booleans, which is for example causing issues with HIVE import. This is because by default the MySQL JDBC connector maps the TINYINT(1) to java.sql.Types.BIT, which Sqoop by default maps to Boolean.

Solution: A more clean solution is to force MySQL JDBC Connector to stop converting TINYINT(1) to java.sql.Types.BIT by adding tinyInt1isBit=false into your JDBC path (to create something like jdbc:mysql://localhost/test?tinyInt1isBit=false). Another solution would be to explicitly override the column mapping for the datatype TINYINT(1) column. For example, if the column name is foo, then pass the following option to Sqoop during import: --map-column-hive foo=tinyint. In the case of non-Hive imports to HDFS, use --map-column-java foo=integer.
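
For reference, applying that first suggestion to my command above would look roughly like this (same connection details as before; only the ?tinyInt1isBit=false part is new):

sqoop import \
  --connect "jdbc:mysql://localhost.localdomain:3306/databasename?tinyInt1isBit=false" \
  --username root \
  -P \
  --table sales_orders

The other suggested route would be something like --map-column-java foo=Integer, with foo replaced by the name of the actual TINYINT(1) column.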

I tried the above solution as well, but that did not work either.

Can anyone help me out in this?

#2

Can you paste the command that includes --map-column-hive, and also the output of DESCRIBE sales_orders?

#3

Hello

I sorted it out myself. The issue was that the MySQL table I was trying to import into HDFS has a column named ‘default’, which is a reserved keyword in MySQL, and the import was failing just because of that column name. So instead of relying on --table tablename alone, I wrote the Sqoop import command with the column names listed explicitly, i.e. --columns column1,column2,column3,Default (I listed all the columns and wrote ‘Default’ instead of ‘default’, because ‘default’ gave me an error but ‘Default’ did not). That got the data into HDFS; a rough sketch of the final command is below.

And the TINYINT(1) issue was taken care of by adding tinyInt1isBit=false to the JDBC connection string.
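
Putting both fixes together, the working command looked roughly like this (column1, column2, column3 stand in for the other column names of sales_orders, which I am not listing here):

sqoop import \
  --connect "jdbc:mysql://localhost.localdomain:3306/databasename?tinyInt1isBit=false" \
  --username root \
  -P \
  --table sales_orders \
  --columns column1,column2,column3,Default

The tinyInt1isBit=false parameter keeps the TINYINT(1) column from being read as BIT/Boolean, and the explicit --columns list lets me refer to the problem column as Default instead of default.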

Anyway, thank you for rescuing me. I am following the ITVERSITY videos; they are INCREDIBLE.
