RDD sortBy not returning sorted result?


I am following the CCA175 lectures and here is my script:

100,2013-07-25 00:00:00.0,12131,PROCESSING
101,2013-07-25 00:00:00.0,5116,CLOSED
102,2013-07-25 00:00:00.0,8027,COMPLETE
103,2013-07-25 00:00:00.0,12256,PROCESSING
104,2013-07-25 00:00:00.0,7790,PENDING_PAYMENT
105,2013-07-26 00:00:00.0,8220,COMPLETE
106,2013-07-26 00:00:00.0,395,PROCESSING
107,2013-07-26 00:00:00.0,1845,COMPLETE
108,2013-07-26 00:00:00.0,12149,PROCESSING
109,2013-07-26 00:00:00.0,9345,PENDING_PAYMENT

_(3) refers to the last column, orderStatus, whose values include PROCESSING, COMPLETE, and PENDING_PAYMENT.

As you can see the result is NOT sorted.

Am I missing anything here?

Thank you very much.


I'm not sure why this one doesn't work. My guess is that sortBy expects a function that extracts the key unless the RDD is already keyed, i.e. it didn't understand which field to sort on here (I may be wrong).

Try this; it should work. It tells sortBy to use the field after the third comma as the key:

val orders_sorted = orders.sortBy(rec => rec.split(",")(3), ascending = true).take(10).foreach(println)
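
A small sketch of what is probably going on, assuming orders is an RDD[String] of raw comma-separated lines (the original script is not shown, so this is a guess). On a String, (3) returns the character at index 3, not the fourth field; for the sample rows that character is the comma after the three-digit order id, so every record gets the same key and sorting has nothing to order by. The example uses plain Scala collections as a stand-in for the RDD so it runs without Spark; SortKeyDemo and its member names are made up for illustration.

```scala
// Plain Scala collections stand in for the RDD here, so no Spark setup is needed.
// Assumption: each record is a raw comma-separated line, as in the sample data.
object SortKeyDemo {
  val lines: Seq[String] = Seq(
    "100,2013-07-25 00:00:00.0,12131,PROCESSING",
    "101,2013-07-25 00:00:00.0,5116,CLOSED",
    "102,2013-07-25 00:00:00.0,8027,COMPLETE",
    "104,2013-07-25 00:00:00.0,7790,PENDING_PAYMENT"
  )

  // On a String, (3) is the character at index 3, not the fourth field.
  val charKeys: Seq[Char] = lines.map(_(3))

  // Splitting first makes (3) index into the array of fields.
  val statusKeys: Seq[String] = lines.map(_.split(",")(3))

  // Sorting on the split field orders the records by orderStatus.
  val sortedByStatus: Seq[String] = lines.sortBy(_.split(",")(3))

  def main(args: Array[String]): Unit = {
    println(charKeys.mkString(" "))   // one key per line, all identical here
    println(statusKeys.mkString(" ")) // the orderStatus values
    sortedByStatus.foreach(println)
  }
}
```

Also note that foreach(println) returns Unit, so orders_sorted in the line above does not hold the sorted records; to keep them, assign the result of take(10) and print it in a separate step.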


Thanks, it works. One minor note: the arrow has to be written as => with no space between = and >.