reduceByKey - what if there is only one row in the RDD pertaining to a key?


#1

Can you please explain how reduceByKey works when there is only one row in the RDD for a particular key? For example, in the dataset below, key=1 has only one record.
The lambda function passed to reduceByKey takes two arguments, x and y, that is, two values for the same key.
The single-record key still shows up in the output, though. So where is this case handled implicitly?

(1, u'1,1,957,1,299.98,299.98')
(2, u'2,2,1073,1,199.99,199.99')
(2, u'3,2,502,5,250.0,50.0')
(2, u'4,2,403,1,129.99,129.99')
(4, u'5,4,897,2,49.98,24.99')
(4, u'6,4,365,5,299.95,59.99')
(4, u'7,4,502,3,150.0,50.0')
(4, u'8,4,1014,4,199.92,49.98')
(5, u'9,5,957,1,299.98,299.98')
(5, u'10,5,365,5,299.95,59.99')


#2

If a key has only one record, reduceByKey returns that record unchanged. The lambda is only called to merge two values that share a key, so for a single-record key it is never invoked and no processing happens.
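
To see this without a cluster, here is a minimal plain-Python sketch of the same per-key folding. It is only a local stand-in, not Spark's actual implementation: `reduce_by_key`, `keep_larger` and the `calls` list are hypothetical names made up for illustration, and comparing on the fifth comma-separated field is just an assumed example of a reduce function. The first value seen for a key is stored as-is, and the function is called only when a second value for that key arrives.

```python
def reduce_by_key(pairs, func):
    """Local stand-in for reduceByKey (hypothetical helper, not Spark code)."""
    merged = {}
    for key, value in pairs:
        if key in merged:
            # A value already exists for this key: merge the two with func.
            merged[key] = func(merged[key], value)
        else:
            # First value for this key: keep it as-is, func is not called.
            merged[key] = value
    return merged


calls = []  # records every (x, y) pair the reduce function receives


def keep_larger(x, y):
    """Assumed example: keep the row whose fifth field is larger."""
    calls.append((x, y))
    return x if float(x.split(",")[4]) >= float(y.split(",")[4]) else y


rows = [
    (1, u'1,1,957,1,299.98,299.98'),
    (2, u'2,2,1073,1,199.99,199.99'),
    (2, u'3,2,502,5,250.0,50.0'),
    (2, u'4,2,403,1,129.99,129.99'),
]

result = reduce_by_key(rows, keep_larger)

print(result[1])   # the only row for key 1, passed through untouched
print(result[2])   # the row for key 2 chosen by keep_larger
print(len(calls))  # how many times keep_larger was actually called
```

Key 1 never reaches `keep_larger`, since there is nothing to merge it with; key 2 has three rows, so the function runs twice. With a real RDD the equivalent call would be `rdd.reduceByKey(keep_larger)`.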