Let us understand how we can load the data into databases using utilities provided.
- Most of the databases provide data loading utilities.
- One of the most common ways of getting data into database tables is by using data loading utilities provided by the underlying database technology.
- We can load delimited files into a database using these utilities.
- Here are the steps we can follow to load the delimited data into the table:
- Make sure files are available on the server from which we are trying to load.
- Ensure the database and table are created for the data to be loaded.
- Run the relevant command to load the data into the table.
- Make sure to validate by running queries.
Let us see a demo by loading a sample file into the table in Postgres database.
Loading Data
We can use COPY Command using psql
to copy the data into the table.
- Make sure the database is created along with the user with the right permissions. Also, the user who wants to use
COPY
command needs to have pg_read_server_files role assigned. - Connect to the Postgres Database on pg.itversity.com as the super user postgres and run the relevant grant command to grant the permissions.
docker-compose exec pg.itversity.com psql -U postgres
- Run this command to grant the required permissions to run the COPY Command to the user itversity_sms_user.
GRANT pg_read_server_files TO itversity_sms_user;
\q
- Connect to pg.itversity.com using bash and run
mkdir
command to create the required folder or directory structure.
docker-compose exec pg.itversity.com mkdir -p /data/sms_db
docker-compose exec pg.itversity.com ls -ltr /data
-
You can find users.csv file under data/sms_db as part of the repository directory data-engineering-spark. You just have to copy the file to pg.itversity.com under /data/sms_db. The file contains the following data.
-
Copy users.csv into pg.itversity.com.
docker cp data/sms_db/users.csv data-engineering-spark-pg.itversity.com-1:/data/sms_db
docker-compose exec pg.itversity.com ls -ltr /data/sms_db
- Connect to the Database.
docker-compose exec pg.itversity.com psql -U itversity_sms_user -d itversity_sms_db
- Create the
users
table.
CREATE TABLE users (
user_id SERIAL PRIMARY KEY,
user_first_name VARCHAR(30) NOT NULL,
user_last_name VARCHAR(30) NOT NULL,
user_email_id VARCHAR(50) NOT NULL,
user_email_validated BOOLEAN DEFAULT FALSE,
user_password VARCHAR(200),
user_role VARCHAR(1) NOT NULL DEFAULT 'U',
is_active BOOLEAN DEFAULT FALSE,
created_dt DATE DEFAULT CURRENT_DATE
);
- Use the COPY command to load the data
COPY users(user_first_name, user_last_name, user_email_id, user_role, created_dt)
FROM '/data/sms_db/users.csv'
DELIMITER ','
CSV HEADER;
- Validate by running queries
SELECT * FROM users;