Exercise 30 - Develop an Oozie workflow to compute daily revenue

Description

  • Get data from a MySQL database, join and aggregate it, then load the output back into a MySQL database

Problem Statement

  • Use Oozie to define a workflow in which
    • Data for orders and order_items is imported using Sqoop in parallel
    • Hive external tables are created for orders and order_items (use your own Hive database)
    • A new table is created to hold daily revenue (join orders and order_items, then sum order_item_subtotal from order_items for each day in orders.order_date)
    • Results are exported back to a table in retail_export (you have to create your own table in the retail_export MySQL database)
    • Use the resources below to get started, then add more actions to join the data and export it. Update the Hive table script to create the new table that holds the aggregated data, and create a new action for the Sqoop export.
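
The workflow described above could be laid out roughly as follows. This is a minimal sketch, not a tested definition. The action names, the daily_revenue.hql script name, the daily_revenue table name, and every ${...} property (jobTracker, nameNode, sourceJdbcUrl, exportJdbcUrl, dbUser, dbPassword, importDir, revenueDir, hiveDb) are assumptions that would have to be defined in job.properties. It also assumes the Hive script writes comma-delimited text to ${revenueDir}, so the Sqoop export can rely on its default field delimiter.

```xml
<workflow-app name="daily-revenue-wf" xmlns="uri:oozie:workflow:0.5">
    <start to="fork-imports"/>

    <!-- Run both Sqoop imports in parallel -->
    <fork name="fork-imports">
        <path start="import-orders"/>
        <path start="import-order-items"/>
    </fork>

    <action name="import-orders">
        <sqoop xmlns="uri:oozie:sqoop-action:0.4">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <command>import --connect ${sourceJdbcUrl} --username ${dbUser} --password ${dbPassword} --table orders --target-dir ${importDir}/orders --delete-target-dir -m 1</command>
        </sqoop>
        <ok to="join-imports"/>
        <error to="fail"/>
    </action>

    <action name="import-order-items">
        <sqoop xmlns="uri:oozie:sqoop-action:0.4">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <command>import --connect ${sourceJdbcUrl} --username ${dbUser} --password ${dbPassword} --table order_items --target-dir ${importDir}/order_items --delete-target-dir -m 1</command>
        </sqoop>
        <ok to="join-imports"/>
        <error to="fail"/>
    </action>

    <!-- Continue only after both imports have succeeded -->
    <join name="join-imports" to="hive-daily-revenue"/>

    <!-- Hypothetical script: creates the external tables and the daily revenue table -->
    <action name="hive-daily-revenue">
        <hive xmlns="uri:oozie:hive-action:0.5">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <script>daily_revenue.hql</script>
            <param>hiveDb=${hiveDb}</param>
            <param>importDir=${importDir}</param>
            <param>revenueDir=${revenueDir}</param>
        </hive>
        <ok to="export-daily-revenue"/>
        <error to="fail"/>
    </action>

    <!-- Export the aggregated data to a table you created in retail_export -->
    <action name="export-daily-revenue">
        <sqoop xmlns="uri:oozie:sqoop-action:0.4">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <command>export --connect ${exportJdbcUrl} --username ${dbUser} --password ${dbPassword} --table daily_revenue --export-dir ${revenueDir} -m 1</command>
        </sqoop>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>Workflow failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

The fork and join nodes are what make the two imports run in parallel. The Hive action starts only after both imports succeed, and the export runs only after the Hive action completes.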

Resources:

Additional information

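  • Make sure the MySQL JDBC jar file is copied to the Sqoop directory of the Oozie sharelib, then run sharelibupdate so that Oozie picks it up (the lib_<timestamp> directory name differs from cluster to cluster)
sudo -u oozie hadoop fs -put /usr/share/java/mysql-connector-java.jar /user/oozie/share/lib/lib_20161114194124/sqoop
sudo -u oozie oozie admin -oozie http://nn01.itversity.com:11000/oozie -sharelibupdate
  • Set the following property in job.properties so that the workflow uses the sharelib
oozie.use.system.libpath=true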