Python - filter condition

Did I miss anything in the following query? I am not getting only the orders greater than 10000.

ordersmap = ordersRDD.filter(lambda x: int(x.split(",")[2] > '10000') and x.split(",")[3]=='COMPLETE')
Output:
1,2016-07-29 23:40:47.0,11599,COMPLETE
3,2013-07-25 00:00:00.0,12111,COMPLETE
5,2013-07-25 00:00:00.0,11318,COMPLETE
6,2013-07-25 00:00:00.0,7130,COMPLETE
7,2013-07-25 00:00:00.0,4530,COMPLETE

Try this:

ordersmap = ordersRDD.filter(lambda x: int(x.split(",")[2]) > 10000 and x.split(",")[3]=='COMPLETE')
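If the lambda gets hard to read, one option is a named predicate that splits each record only once. This is just a sketch: the function name `is_large_complete_order` and its `threshold` parameter are made up for illustration, and it runs on a plain Python list rather than a real RDD.

```python
def is_large_complete_order(line, threshold=10000):
    """Return True when the third field exceeds threshold and the status is COMPLETE."""
    fields = line.split(",")  # split once and reuse the result
    return int(fields[2]) > threshold and fields[3] == "COMPLETE"

# Two records copied from the output posted in the question.
sample = [
    "1,2016-07-29 23:40:47.0,11599,COMPLETE",
    "6,2013-07-25 00:00:00.0,7130,COMPLETE",
]

# The built-in filter stands in for the RDD here.
kept = list(filter(is_large_complete_order, sample))
print(kept)
```

On an RDD the same function should be usable as ordersRDD.filter(is_large_complete_order), since filter takes any function that returns a truthy or falsy value.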


@kumarb @RaghavendraKumars
In the orders table, order_id is the first field, so you have to use [0] to refer to it:

int(x.split(",")[0]) > 10000

ordersmap = ordersRDD.filter(lambda x: int(x.split(",")[2] == '11599') and x.split(",")[3]=='COMPLETE')

1,2016-07-29 23:40:47.0,11599,COMPLETE
11397,2013-10-03 00:00:00.0,11599,COMPLETE
23908,2013-12-20 00:00:00.0,11599,COMPLETE

But greater-than is not working. I tried the following:
ordersmap = ordersRDD.filter(lambda x: int(x.split(",")[2] > '11598') and x.split(",")[3]=='COMPLETE')

Try this:
ordersmap = ordersRDD.filter(lambda x: int(x.split(",")[2]) > 11598 and x.split(",")[3]=='COMPLETE')

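You are putting the number in single quotes, which makes it a string, and the parenthesis of the int() cast is not closed in the right place. As a result, the comparison runs on two strings and int() is applied to the True/False result instead of to the field.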
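To see why the quoted version lets 7130 and 4530 through, here is a small sketch in plain Python (no Spark needed). The helper names `buggy` and `fixed` are only for illustration, and the rows are copied from the output posted in the question.

```python
rows = [
    "1,2016-07-29 23:40:47.0,11599,COMPLETE",
    "6,2013-07-25 00:00:00.0,7130,COMPLETE",
    "7,2013-07-25 00:00:00.0,4530,COMPLETE",
]

# Strings compare character by character, so "7" sorts after "1".
string_compare = "7130" > "10000"       # True
number_compare = int("7130") > 10000    # False

def buggy(x):
    # int() wraps the boolean result of a string comparison, giving 1 or 0.
    return int(x.split(",")[2] > "10000") and x.split(",")[3] == "COMPLETE"

def fixed(x):
    # int() wraps only the field, so the comparison is numeric.
    return int(x.split(",")[2]) > 10000 and x.split(",")[3] == "COMPLETE"

buggy_ids = [r.split(",")[0] for r in rows if buggy(r)]
fixed_ids = [r.split(",")[0] for r in rows if fixed(r)]
print(buggy_ids, fixed_ids)
```

The equality check with '11599' only looked right because comparing two strings for equality gives the same answer as comparing the numbers; ordering comparisons do not.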


Thank you. It's working now.