aggregateByKey error



Hi Team,

I am executing the code below:
data = sc.parallelize([(2, (2, '1')), (1, (4, '2')), (2, (5, '3')), (1, (10, '2')), (1, (20, '7'))])
data = sc.parallelize([(0, 2), (0, 4), (1, 5), (1, 10), (1, 20)])
dataagg = data.aggregateByKey((0, 0), (lambda x, y: (x[0] + y, x[1] if (x[1] > x) else x)), (lambda x, y: (x[0] + y[0], x[1] if (x[1] > y[1]) else y[1])))

I am expecting a result like:
(0,6)
(1,35)

Instead, I am getting the result below:

(0, (6, (2, (0, 0))))
(1, (35, (15, (5, (0, 0)))))

Please let me know where the issue is.
Thanks in advance.
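For context, here is a minimal pure-Python sketch of what aggregateByKey does (the helper `aggregate_by_key` and its round-robin partitioning are illustrative assumptions, not Spark's actual implementation). With a plain numeric zero value and sum functions it produces the sums shown above; it also shows a corrected tuple accumulator in case the `(0, 0)` zero value was meant to track (sum, max):

```python
from collections import defaultdict

def aggregate_by_key(pairs, zero, seq_op, comb_op, num_partitions=2):
    # Simulate Spark's aggregateByKey: seq_op folds each value into the
    # per-partition accumulator (starting from zero), then comb_op merges
    # the accumulators for the same key across partitions.
    partitions = [pairs[i::num_partitions] for i in range(num_partitions)]
    merged = {}
    for part in partitions:
        local = defaultdict(lambda: zero)
        for k, v in part:
            local[k] = seq_op(local[k], v)
        for k, acc in local.items():
            merged[k] = comb_op(merged[k], acc) if k in merged else acc
    return sorted(merged.items())

data = [(0, 2), (0, 4), (1, 5), (1, 10), (1, 20)]

# For a plain per-key sum, the accumulator is a single number, not a tuple:
result = aggregate_by_key(data, 0, lambda acc, v: acc + v, lambda a, b: a + b)
print(result)  # [(0, 6), (1, 35)]

# If (0, 0) was meant to track (sum, max), the seqOp must compare the max
# slot acc[1] against the incoming value v (a number), and put a number --
# not the whole accumulator tuple -- in the second slot:
result2 = aggregate_by_key(
    data, (0, 0),
    lambda acc, v: (acc[0] + v, acc[1] if acc[1] > v else v),
    lambda a, b: (a[0] + b[0], a[1] if a[1] > b[1] else b[1]),
)
print(result2)  # [(0, (6, 4)), (1, (35, 20))]
```

Comparing a number against a tuple (or returning the whole tuple as one slot) is what produces nested tuples like `(6, (2, (0, 0)))` in the output shown above.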
