What's wrong with this Hive Query



Hi All,

I am trying to achieve below:
“Calculate total revenue earned from each customer along with revenue earned per product”. So the output will be (Customer, TotalRevenue, Product, RevenueFromProduct). I am using Analytical function ‘Over’ to achieve this. Below is the query I have written to achieve this result:


customers.customer_id Customer,
sum(order_items.order_item_subtotal) TotalRevenue,
order_items.order_item_product_id Product,
sum(order_items.order_item_subtotal) over(partition by order_items.order_item_product_id) RevenueFromProduct

from orders join order_items on orders.order_id = order_items.order_item_order_id join customers on orders.order_customer_id = customers.customer_id group by customer_id;

When I run it on my Hive cluster, it gives following error. Can someone please suggest what is wrong with the query?

FAILED: SemanticException Failed to breakup Windowing invocations into Groups. At least 1 group must only depend on input columns. Also check for circular dependencies.
Underlying error: org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:189 Expression not in GROUP BY key ‘order_item_product_id’

Practice hive on state of the art Big Data cluster - https://labs.itversity.com


Hi All - Can someone suggest what is wrong with the query?

Thanks in advance!