Performing basic aggregations using GROUP BY


Let us get an overview of GROUP BY and perform basic aggregations such as SUM, MIN, MAX etc using GROUP BY…

  • It is primarily used for performing aggregate type of operations based on a key.
  • When GROUP BY is used, SELECT clause can only have those columns/expressions specified in GROUP BY clause and then aggregate functions.
  • We need to have key defined in GROUP BY Clause.
  • Same key should be specified in SELECT clause along with aggregate function.

Let us see aggregations using GROUP BY in action.

SELECT * FROM order_items LIMIT 10;

SELECT order_item_order_id,
  sum(order_item_subtotal) AS order_revenue,
  min(order_item_subtotal) AS min_order_item_subtotal,
  max(order_item_subtotal) AS max_order_item_subtotal,
  avg(order_item_subtotal) AS avg_order_item_subtotal,
  count(order_item_subtotal) AS cnt_order_item_subtotal
FROM order_items
GROUP BY order_item_order_id;

Practice hive on state of the art Big Data cluster -
You can sign up for our courses on Udemy using $10 coupons - Udemy Coupons - Big Data Courses