Programming Essentials Python - Overview of Pandas Libraries - Joining Data Frames

Let us understand how to join Data Frames using Pandas.

Explanation for the video

Put a place holder for the video here with text so that I can replace as part of the automation

Key Concepts Explanation

In this section, we will break down the key concepts related to joining Data Frames using Pandas.

Key Concept 1

To join Data Frames, we can use the .join() method by setting the index of the Data Frames with the respective columns to be joined on. Here is an example:

orders.set_index('order_id').join(order_items.set_index('order_item_order_id'))

Key Concept 2

By specifying the join type as inner, we can merge the Data Frames based on the common index values. Here is an example:

orders.set_index('order_id').join(order_items.set_index('order_item_order_id'), how='inner')

Hands-On Tasks

In this section, we will provide practical tasks for you to apply the concepts learned.

Task 1

Compute Daily Revenue by joining orders and order_items, considering only COMPLETE and CLOSED orders.

orders_considered = orders.query("order_status in ('COMPLETE', 'CLOSED')")
orders_considered.set_index('order_id').join(order_items.set_index('order_item_order_id'), how='inner').groupby('order_date')['order_item_subtotal'].agg(['sum']).rename(columns={'sum': 'revenue'})

Task 2

Find orders with no corresponding order items by performing a left join.

orders.set_index('order_id').join(order_items.set_index('order_item_order_id')).query('order_item_id.isna()')

Task 3

Compute Daily Product Revenue using orders.order_date and order_items.order_item_product_id, considering only COMPLETE and CLOSED orders.

orders_considered = orders.query("order_status in ('COMPLETE', 'CLOSED')")
orders_considered.set_index('order_id').join(order_items.set_index('order_item_order_id'), how='inner').groupby(['order_date', 'order_item_product_id'])['order_item_subtotal'].agg(['sum']).rename(columns={'sum': 'revenue'}).reset_index()

Conclusion

In this article, we have covered the basics of joining Data Frames using Pandas. We encourage you to practice these tasks and engage with the community for further learning.

Watch the video tutorial here