Filter not working


#1

While I wait for approval to join the CCA 175 certification forum, I am posting this Scala question here:

Following the instructor’s lecture, I have the following script:

import scala.io.Source
val orderItemsList = Source.fromFile("C://RXIE//Learning//Scala//data//retail_db//orders//part-00000").getLines.toList

This works: orderItemsList.size gives 68883, which is correct.

Each line in the list looks like this:
2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT

I want to apply a filter like:
val order123 = orderItemsList.map(orderItem => orderItem.split(",")(2).toInt < 123).size
and
val order12345 = orderItemsList.map(orderItem => orderItem.split(",")(2).toInt > 12345).size

Both return 68883, which means the filters are not working.

Can someone enlighten me why the filters are not working?

Thank you very much.

Update: sorry, I was confused at first. It should be filter, not map.
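
For anyone who lands here with the same symptom, the difference can be reproduced on a small list. This is only a sketch: sampleOrders and its four lines are made-up data in the same comma-separated layout as the orders file, not the real file.

```scala
// Hypothetical sample lines (assumption: same layout as the orders file)
val sampleOrders = List(
  "1,2013-07-25 00:00:00.0,11599,CLOSED",
  "2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT",
  "3,2013-07-25 00:00:00.0,12111,COMPLETE",
  "4,2013-07-25 00:00:00.0,100,CLOSED"
)

// map transforms every element, so the result holds one Boolean per input line
val flags = sampleOrders.map(order => order.split(",")(2).toInt < 123)

// filter keeps only the elements for which the predicate is true
val matches = sampleOrders.filter(order => order.split(",")(2).toInt < 123)

println(flags.size)   // always equal to sampleOrders.size
println(matches.size) // number of lines whose third field is below 123
```

Because map never drops elements, its size always equals the size of the input, which is why both expressions in the question returned 68883.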


#2

@paslechoix:

I am not sure size works for you in Scala, but count works on an RDD.
Here are four results on the RDD. What is your question after seeing them?

scala> val order123 = orderItems.map(orderItem => orderItem.split(",")(2).toInt < 123).count()
order123: Long = 172198

scala> val order12345 = orderItems.map(orderItem => orderItem.split(",")(2).toInt > 12345).count()
order12345: Long = 172198

scala> val order123 = orderItems.filter(orderItem => orderItem.split(",")(2).toInt < 123).count()
order123: Long = 1508

scala> val order12345 = orderItems.filter(orderItem => orderItem.split(",")(2).toInt > 12345).count()
order12345: Long = 0

Thanks
Venkat


#3

Thank you. Right after I posted my question, I realized that I had mistakenly used .map instead of .filter.

I wanted to remove the post but don’t know how.

Sorry for the confusion.


#4

To answer your question:

scala> val order123 = orderItems.filter(orderItem => orderItem.split(",")(2).toInt < 123).size
order123: Int = 670

Note: on a plain Scala List, count() with no arguments errors out, because List.count requires a predicate. Use size instead.
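
The size versus count difference above can be sketched on a plain List. The data here is hypothetical (sampleLines is not the real file). On a List, size takes no arguments, while count takes the predicate directly. The no-argument count() in post #2 is the RDD method, which returns a Long.

```scala
// Hypothetical sample lines (assumption: same layout as the orders file)
val sampleLines = List(
  "2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT",
  "5,2013-07-25 00:00:00.0,99,COMPLETE",
  "6,2013-07-25 00:00:00.0,120,CLOSED"
)

// filter, then size: builds the filtered list and returns its length as an Int
val viaSize: Int = sampleLines.filter(line => line.split(",")(2).toInt < 123).size

// count with a predicate: same result without building the intermediate list
val viaCount: Int = sampleLines.count(line => line.split(",")(2).toInt < 123)

// sampleLines.count() would not compile here: List.count needs a predicate
println(s"$viaSize $viaCount")
```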