Can we do multiple aggregations in one pass over the same RDD,
something similar to the SQL below?
SELECT ORDER_ID, COUNT(ORDER_ITEM_ID), MIN(ORDER_ITEM_ID_SUBTOTAL), MAX(ORDER_ITEM_ID_SUBTOTAL), AVG(ORDER_ITEM_PRICE)
FROM ORDER_ITEMS GROUP BY ORDER_ID
The examples we see for aggregateByKey() do two aggregations, SUM and COUNT. I would like some guidance on whether we can do several aggregations (COUNT, MIN, MAX and AVG) on the same key in a single call. Could you please explain with a full example?
Kind Regards,
Lakshminarayanan
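
For reference, here is a minimal sketch of how the query above might map onto aggregateByKey(), under some assumptions. The idea is to carry one tuple accumulator per key, (count, min subtotal, max subtotal, sum of price), and to derive AVG at the end as sum / count. The sample rows, the record layout (order_id, (order_item_id, subtotal, price)), and the helper names seq_op, comb_op, finalize and aggregate_by_key_local are all made up for illustration. The local helper only imitates, in plain Python, how Spark folds values within a partition and then merges partitions, so the logic can be checked without a cluster.

```python
import math

# Accumulator layout (an assumption for this sketch):
# (count, min_subtotal, max_subtotal, sum_price)
ZERO = (0, math.inf, -math.inf, 0.0)

def seq_op(acc, item):
    """Fold one value into the accumulator (runs within a partition)."""
    count, lo, hi, price_sum = acc
    _order_item_id, subtotal, price = item
    return (count + 1, min(lo, subtotal), max(hi, subtotal), price_sum + price)

def comb_op(a, b):
    """Merge two accumulators built on different partitions."""
    return (a[0] + b[0], min(a[1], b[1]), max(a[2], b[2]), a[3] + b[3])

def finalize(acc):
    """Turn (count, min, max, sum) into (count, min, max, avg)."""
    count, lo, hi, price_sum = acc
    return (count, lo, hi, price_sum / count)

def aggregate_by_key_local(partitions, zero, seq, comb):
    """Hypothetical stand-in for aggregateByKey over a list of partitions."""
    merged = {}
    for part in partitions:
        local = {}
        for key, value in part:
            local[key] = seq(local.get(key, zero), value)
        for key, acc in local.items():
            merged[key] = comb(merged[key], acc) if key in merged else acc
    return merged

# Made-up rows: (order_id, (order_item_id, subtotal, price)), in two partitions.
partitions = [
    [(1, (101, 200.0, 100.0)), (2, (201, 50.0, 50.0))],
    [(1, (102, 300.0, 150.0)), (1, (103, 100.0, 50.0))],
]

aggregated = aggregate_by_key_local(partitions, ZERO, seq_op, comb_op)
result = {order_id: finalize(acc) for order_id, acc in aggregated.items()}

for order_id in sorted(result):
    print(order_id, result[order_id])
```

With PySpark, the same three functions should be usable directly on a pair RDD of that shape, for example order_items.aggregateByKey(ZERO, seq_op, comb_op).mapValues(finalize), where order_items is an assumed RDD name. Keeping the sum and count in the accumulator, rather than a running average, is what lets the per-partition results be merged correctly.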