RDD and Collection


In the code below, it looks as if we are passing the RDD productsGroupByCategoryId to the function getTopNPricedProductsPerCategoryId as the parameter productsPerCategoryId.

So how is the function able to index that RDD with [1] and sort it? When I run the same statements without defining the function getTopNPricedProductsPerCategoryId, I get an error saying that an RDD does NOT allow indexing.

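import itertools as it

def getTopNPricedProductsPerCategoryId(productsPerCategoryId, topN):
    # Sort the products (element [1]) by price, highest first.
    productsSorted = sorted(productsPerCategoryId[1],
                            key=lambda k: float(k.split(",")[4]),
                            reverse=True)
    productPrices = map(lambda p: float(p.split(",")[4]), productsSorted)
    # Keep the topN distinct prices.
    topNPrices = sorted(set(productPrices), reverse=True)[:topN]
    # Take products from the sorted list while their price is a top-N price.
    return it.takewhile(lambda p:
                        float(p.split(",")[4]) in topNPrices,
                        productsSorted)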

list(getTopNPricedProductsPerCategoryId(t, 3))

topNPricedProducts = productsGroupByCategoryId.flatMap(
    lambda p: getTopNPricedProductsPerCategoryId(p, 3))
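
One possible way to look at it: flatMap calls the function once for each element of the RDD, so the parameter is a single element rather than the RDD itself. If productsGroupByCategoryId came from groupByKey (an assumption, since that step is not shown here), each element would be a (categoryId, iterable of records) pair, and [1] would be ordinary tuple indexing. The sketch below imitates that without Spark. The list `grouped`, its records, and the helper `prices_of` are made up for illustration only.

```python
from itertools import chain

# Hypothetical stand-in for the elements of productsGroupByCategoryId:
# each element is a (categoryId, records) pair. In Spark the second item
# would be an iterable of values rather than a plain list.
grouped = [
    (2, ["1,2,Ball,,59.98,img", "2,2,Bat,,129.99,img"]),
    (3, ["3,3,Glove,,89.99,img"]),
]

def prices_of(pair):
    # pair is one tuple, so pair[1] is plain tuple indexing, not RDD indexing.
    return [float(rec.split(",")[4]) for rec in pair[1]]

# Roughly what flatMap does locally: call the function once per element,
# then flatten the returned sequences into a single one.
flattened = list(chain.from_iterable(prices_of(p) for p in grouped))
print(flattened)
```

Under the same assumption, calling the function directly on the RDD fails because the RDD object itself has no [1], while an element fetched with first() can be indexed.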