Spark SQL Data Frames: LAST_VALUE windowing function solution



Found the solution.

We need to add an explicit window frame to the LAST_VALUE expression:

last_value(oi.order_item_subtotal) over (partition by o.order_id order by oi.order_item_subtotal desc rows between unbounded preceding and unbounded following) lv_revenue

With this change, all five records show 59.99 as the last value, as expected.
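The reason the frame clause is needed: when a window has an ORDER BY but no explicit frame, the default frame ends at the current row, so LAST_VALUE returns the current row's own value rather than the last value of the partition. A minimal plain-Python sketch of the two frames (no Spark needed; the subtotal values below are hypothetical):

```python
# Simulate LAST_VALUE under the two window frames, for one order partition
# sorted descending by subtotal (hypothetical sample values).
subtotals = [299.98, 199.99, 150.00, 129.99, 59.99]

# Default frame with ORDER BY: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.
# LAST_VALUE only sees rows up to the current one, so each row gets itself back.
default_frame = [subtotals[:i + 1][-1] for i in range(len(subtotals))]
print(default_frame)  # [299.98, 199.99, 150.0, 129.99, 59.99]

# Explicit frame: ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.
# LAST_VALUE sees the whole partition, so every row gets the true last value.
full_frame = [subtotals[-1] for _ in subtotals]
print(full_frame)  # [59.99, 59.99, 59.99, 59.99, 59.99]
```

This is exactly the behavior described above: without the explicit frame, every row "last value" is its own subtotal; with it, all rows show 59.99.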

My complete query is below; a screenshot of the output is attached.

select * from (
  select
    o.order_id,
    o.order_date,
    o.order_status,
    oi.order_item_subtotal,
    round(sum(oi.order_item_subtotal) over (partition by o.order_id), 2) order_revenue,
    round(avg(oi.order_item_subtotal) over (partition by o.order_id), 2) avg_revenue,
    round(oi.order_item_subtotal / sum(oi.order_item_subtotal) over (partition by o.order_id), 2) per_revenue,
    lead(oi.order_item_subtotal) over (partition by o.order_id order by oi.order_item_subtotal desc) lead_revenue,
    lag(oi.order_item_subtotal) over (partition by o.order_id order by oi.order_item_subtotal desc) lag_revenue,
    first_value(oi.order_item_subtotal) over (partition by o.order_id order by oi.order_item_subtotal desc) fv_revenue,
    last_value(oi.order_item_subtotal) over (partition by o.order_id order by oi.order_item_subtotal desc rows between unbounded preceding and unbounded following) lv_revenue
  from orders o
  join order_items oi
    on o.order_id = oi.order_item_order_id
  where o.order_status in ('COMPLETE', 'CLOSED')
) q
order by order_date, order_revenue desc;
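For reference, here is a plain-Python sketch of what the other windowing functions in the query compute within a single order partition (hypothetical subtotals, sorted descending as in the query; no Spark needed):

```python
# One order partition, sorted descending by subtotal (hypothetical values).
subtotals = [299.98, 199.99, 150.00, 129.99, 59.99]
n = len(subtotals)

order_revenue = round(sum(subtotals), 2)            # sum() over (partition by order_id)
avg_revenue = round(sum(subtotals) / n, 2)          # avg() over (partition by order_id)
per_revenue = [round(s / sum(subtotals), 2) for s in subtotals]  # share of order revenue

# lead() looks at the next row, lag() at the previous row; None where none exists.
lead_revenue = [subtotals[i + 1] if i + 1 < n else None for i in range(n)]
lag_revenue = [subtotals[i - 1] if i > 0 else None for i in range(n)]

# first_value() works even with the default frame, because the first row is
# always inside RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.
fv_revenue = [subtotals[0] for _ in subtotals]

print(order_revenue, avg_revenue)
```

Note that FIRST_VALUE does not need the explicit frame clause that LAST_VALUE does: the partition's first row is always inside the default frame, while its last row usually is not.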

I am following the Udemy course cca-175-spark-and-hadoop-developer-certification-scala. It's engaging and I have learned a lot. Thank you, Durga sir!
