CCA 175 question about sortBy


#1

Hello,

I am practicing the following and don’t understand the error I get. Can anyone help me sort it out? Thank you very much.

val products = sc.textFile("/public/retail_db/products")
val productsMap2 = products.
  filter(product => product.split(",")(4) != "").
  map(product => (product, product.split(",")(4).toFloat))

Now, if I run:
productsMap2.take(10).foreach(println)
I am able to see the data as below:

(1,2,Quest Q64 10 FT. x 10 FT. Slant Leg Instant U,59.98,http://images.acmesports.sports/Quest+Q64+10+FT.+x+10+FT.+Slant+Leg+Instant+Up+Canopy,59.98)
(2,2,Under Armour Men’s Highlight MC Football Clea,129.99,http://images.acmesports.sports/Under+Armour+Men’s+Highlight+MC+Football+Cleat,129.99)
(3,2,Under Armour Men’s Renegade D Mid Football Cl,89.99,http://images.acmesports.sports/Under+Armour+Men’s+Renegade+D+Mid+Football+Cleat,89.99)
(4,2,Under Armour Men’s Renegade D Mid Football Cl,89.99,http://images.acmesports.sports/Under+Armour+Men’s+Renegade+D+Mid+Football+Cleat,89.99)
(5,2,Riddell Youth Revolution Speed Custom Footbal,199.99,http://images.acmesports.sports/Riddell+Youth+Revolution+Speed+Custom+Football+Helmet,199.99)
(6,2,Jordan Men’s VI Retro TD Football Cleat,134.99,http://images.acmesports.sports/Jordan+Men’s+VI+Retro+TD+Football+Cleat,134.99)
(7,2,Schutt Youth Recruit Hybrid Custom Football H,99.99,http://images.acmesports.sports/Schutt+Youth+Recruit+Hybrid+Custom+Football+Helmet+2014,99.99)
(8,2,Nike Men’s Vapor Carbon Elite TD Football Cle,129.99,http://images.acmesports.sports/Nike+Men’s+Vapor+Carbon+Elite+TD+Football+Cleat,129.99)
(9,2,Nike Adult Vapor Jet 3.0 Receiver Gloves,50.0,http://images.acmesports.sports/Nike+Adult+Vapor+Jet+3.0+Receiver+Gloves,50.0)
(10,2,Under Armour Men’s Highlight MC Football Clea,129.99,http://images.acmesports.sports/Under+Armour+Men’s+Highlight+MC+Football+Cleat,129.99)

I want to sort it by the 5th column (which is the price) in the following way, but it fails:

scala> productsMap2.sortBy(_._4, false).take(10).foreach(println)
<console>:32: error: value _4 is not a member of (String, Float)
productsMap2.sortBy(_._4, false).take(10).foreach(println)

Can anyone help to fix it? Thank you very much.


#2

productsMap2 is an RDD[(String, Float)], so each record is a tuple with only two elements. You can’t access _4 while sorting because it doesn’t exist. You can sort either by the String, which is the first element (_1), or by the Float, which is the second element (_2). Change your statement to the following:

productsMap2.sortBy(_._2, false).take(10).foreach(println)
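
To make the tuple shape concrete, here is a minimal sketch that uses a plain Scala Seq as a local stand-in for the RDD. The three product lines are made up and shortened for illustration (they are not the real retail_db rows), and sortWith is used because a local Seq has no sortBy(f, ascending) overload like the RDD does.

```scala
// Hypothetical, shortened product lines (not the real data set).
// Field layout assumed: id, category, name, description, price, image.
val lines = Seq(
  "1,2,Canopy,,59.98,http://example.com/1",
  "2,2,Cleat,,129.99,http://example.com/2",
  "3,2,Gloves,,50.0,http://example.com/3"
)

// Same shape as productsMap2: each element is (whole line, price),
// so only _1 and _2 exist on it.
val pairs: Seq[(String, Float)] = lines.
  filter(line => line.split(",")(4) != "").
  map(line => (line, line.split(",")(4).toFloat))

// Sort by the second tuple element (the price), highest first.
val byPriceDesc = pairs.sortWith(_._2 > _._2)

byPriceDesc.foreach(println)
```

On the actual RDD, the equivalent step is the sortBy(_._2, false) call shown above.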


#3

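productsMap2.sortBy(_._1.split(",")(4).toFloat, false).take(10).foreach(println)

Alternatively, extract the price from the original record (the first tuple element) and convert it to Float using toFloat, like above. Note that _4 itself does not exist on the tuple.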

Thanks


#4

Thank you, now I understand that each record in the RDD is a tuple with only two elements: the first is the product line and the second is the price. _._2 works as expected.
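
If addressing the price by position feels error-prone, one possible variant is to parse each line into named fields first. The sketch below is only an illustration on a local Seq: the Product case class, its field names, and the sample lines are all assumptions and do not come from the thread.

```scala
// Hypothetical case class; the field names are illustrative only.
case class Product(id: Int, name: String, price: Float)

// Hypothetical, shortened product lines (not the real data set).
val rawLines = Seq(
  "1,2,Canopy,,59.98,http://example.com/1",
  "2,2,Cleat,,129.99,http://example.com/2",
  "3,2,Gloves,,50.0,http://example.com/3"
)

// Split once, drop rows with an empty price, then build named records.
val parsed = rawLines.
  map(line => line.split(",")).
  filter(fields => fields(4) != "").
  map(fields => Product(fields(0).toInt, fields(2), fields(4).toFloat))

// Sort by the named price field, highest first.
val sorted = parsed.sortWith(_.price > _.price)

sorted.foreach(println)
```

With named fields there is no _1 or _2 to mix up, at the cost of no longer carrying the whole original line in each record.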


#5