Database nulls in sqoop

sqoop

#1

Instructions
Export h1b data from hdfs to MySQL Database

Data Description
h1b data with ascii character “\001” as delimiter is available in HDFS

h1b data information:

HDFS Location: /public/h1b/h1b_data_to_be_exported
Fields:
ID, CASE_STATUS, EMPLOYER_NAME, SOC_NAME, JOB_TITLE, FULL_TIME_POSITION, PREVAILING_WAGE, YEAR, WORKSITE, LONGITUDE, LATITUDE
Number of records: 3002373
Output Requirements
Export data to MySQL Database
MySQL database is running on ms.itversity.com
User: h1b_user
Password: itversity
Database Name: h1b_export
Table Name: h1b_data_whoami
Nulls are represented as: NA
After export nulls should not be stored as NA in database. It should be represented as database null
Create table command:

CREATE TABLE h1b_data_jkmasurkar (
ID INT,
CASE_STATUS VARCHAR(50),
EMPLOYER_NAME VARCHAR(100),
SOC_NAME VARCHAR(100),
JOB_TITLE VARCHAR(100),
FULL_TIME_POSITION VARCHAR(50),
PREVAILING_WAGE FLOAT,
YEAR INT,
WORKSITE VARCHAR(50),
LONGITUDE VARCHAR(50),
LATITUDE VARCHAR(50));

Replace whoami with your OS user name
Above create table command can be run using
Login using mysql -u h1b_user -h ms.itversity.com -p
When prompted enter password itversity
Switch to database using use h1b_export
Run above create table command by replacing whoami with your OS user name
End of Problem

sqoop export
–connect jdbc:mysql://ms.itversity.com:3306/h1b_export
–username h1b_user
–password itversity
–export-dir /public/h1b/h1b_data_to_be_exported
–table h1b_data_jkmasurkar
–input-fields-terminated-by “\001”
–input-null-string “NULL”

I did not understand below part of the question

After export nulls should not be stored as NA in database. It should be represented as database null

Please help


#2

During Sqoop import NULL strings in MySQL were represented as ‘NA’. Now while moving data back to MySQL you will need to put

–input-null-string “NA”

so that all NA strings are converted to NULL while moving data back to MySQL


#3

I tried it but it doesn’t seem to be working,

please review my solution

sqoop export
–connect jdbc:mysql://ms.itversity.com:3306/h1b_export
–username h1b_user
–password itversity
–table h1b_data_jkmasurkar2
–null-string “NA”
–null-non-string “NA”
–export-dir /public/h1b/h1b_data_to_be_exported
–fields-terminated-by “\001” \


#4

@jkmasurkar Try below command.

sqoop export
–connect jdbc:mysql://ms.itversity.com:3306/h1b_export
–username h1b_user
–password itversity
–table h1b_data_
–export-dir /public/h1b/h1b_data_to_be_exported
–null-string “NA”
–null-non-string “NA”
–fields-terminated-by “\001”
–num-mappers 2
–batch
–outdir java_files


#5

Hello @jkmasurkar - Did solution provided by @annapurna work ?