Filter and Aggregate on Join Results using SQL

Understanding how to combine SQL joins, filtering, and aggregation functions is crucial for extracting actionable insights from complex relational databases. The provided database schema includes several interconnected tables: departments, categories, products, customers, orders, and order_items. These tables represent a typical e-commerce or retail business model, capturing everything from inventory to customer transactions. Below are simplified yet comprehensive examples demonstrating how you can navigate and analyze this schema to derive meaningful business intelligence.

Example 1: Total Revenue Per Department

This query aims to calculate the total revenue generated by each department, demonstrating how products sold contribute to their respective departments’ financial performance.

SQL Query:

SELECT d.department_name,
       ROUND(SUM(oi.order_item_subtotal), 2) AS total_revenue
FROM departments d
JOIN categories c ON d.department_id = c.category_department_id
JOIN products p ON c.category_id = p.product_category_id
JOIN order_items oi ON p.product_id = oi.order_item_product_id
GROUP BY d.department_name
ORDER BY total_revenue DESC;

Query Breakdown:

  • FROM departments d: Initiates the query with the departments table.
  • JOIN categories c ON d.department_id = c.category_department_id: Links departments to their respective categories.
  • JOIN products p ON c.category_id = p.product_category_id: Further joins categories to their products.
  • JOIN order_items oi ON p.product_id = oi.order_item_product_id: Connects products to order items to access sales data.
  • GROUP BY d.department_name: Groups the results by department.
  • ROUND(SUM(oi.order_item_subtotal), 2): Calculates the total revenue for each department, rounded to two decimal places.

Example 2: Customer Order History

This query provides a summary of each customer’s order history, revealing the total number of orders placed and the total amount spent.

SQL Query:

SELECT c.customer_fname,
       COUNT(o.order_id) AS total_orders,
       ROUND(SUM(oi.order_item_subtotal), 2) AS total_spent
FROM customers c
JOIN orders o ON c.customer_id = o.order_customer_id
JOIN order_items oi ON o.order_id = oi.order_item_order_id
GROUP BY c.customer_id
ORDER BY total_spent DESC;

Query Breakdown:

  • Joins integrate customer information with their orders and the corresponding order items.
  • Aggregation functions count the total orders and sum the total spent for each customer.
  • Results are grouped by customer, showcasing individual spending and activity.

Example 3: Product Sales in Categories

The focus here is on assessing product performance within each category, identifying top-performing products and categories based on sales.

SQL Query:

SELECT cat.category_name,
       ROUND(SUM(oi.order_item_subtotal), 2) AS total_sales
FROM categories cat
JOIN products prod ON cat.category_id = prod.product_category_id
JOIN order_items oi ON prod.product_id = oi.order_item_product_id
GROUP BY cat.category_name, prod.product_name
ORDER BY cat.category_name, total_sales DESC;

Query Breakdown:

  • The query connects products to their categories and order items to assess sales figures.
  • Aggregated sales figures are calculated for each product within its category.
  • The output is sorted to highlight the most successful products and categories.

Through these examples, we see how combining SQL joins, filters, and aggregation functions can yield comprehensive insights into various aspects of business operations, from departmental revenues and customer behaviors to product performance. These analyses are vital for informed decision-making and strategic planning in any data-driven organization.

Keep in mind that regular practice is key to mastering SQL. Engage actively with SQL by experimenting across diverse scenarios, delve into various data sets, explore a range of functions and clauses, and embrace different challenges to deepen your understanding.

Furthermore, enhance your learning by diving into SQL, Python, Hadoop, Spark, and more with our cutting-edge Big Data Cluster. When you sign up for our labs, you not only gain access to state-of-the-art technology but also receive dedicated support from our team. This support is integral for navigating Data Engineering courses, helping you overcome any challenges and deepening your expertise in the field.