Value reduceByKey is not a member of org.apache.spark.rdd.RDD[String]

apache-spark

#1

Dataset
MU,1
MU,6
BH,6
MU,5
KR,6
BH,1
KR,1
DK,6
MK,5
CK,7
DK,1
MK,2
CK,1

Code used:
var data = sc.textFile("dataset/cityData.txt")
var data2 = data.reduceByKey((x, y) => x + y)

Output
value reduceByKey is not a member of org.apache.spark.rdd.RDD[String]

What is the solution for this issue? Are any JARs missing when executing the code?


#2

reduceByKey only works on an RDD of key-value pairs, i.e. RDD[(K, V)]. Currently yours is an RDD[String], because sc.textFile returns one String per line.
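
So this is a type issue rather than a missing JAR. As far as I know, reduceByKey is defined on PairRDDFunctions, and Spark only makes it available through an implicit conversion when the element type is a tuple (on Spark versions before 1.3 you would also need `import org.apache.spark.SparkContext._` for that conversion). Here is a minimal sketch of the difference. The object name, the `toPair` helper, the local master setting, and the three sample lines are my own assumptions for illustration, not something from your setup:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object PairRddSketch {
  // Hypothetical helper: split a line such as "MU,1" into the tuple ("MU", "1").
  def toPair(line: String): (String, String) = {
    val fields = line.split(",")
    (fields(0), fields(1))
  }

  def main(args: Array[String]): Unit = {
    // Local master is an assumption so the sketch runs without a cluster.
    val conf = new SparkConf().setAppName("PairRddSketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Same element type as sc.textFile(...): one String per line.
      val lines: RDD[String] = sc.parallelize(Seq("MU,1", "MU,6", "BH,6"))
      // lines.reduceByKey(_ + _) // would not compile: not a member of RDD[String]

      // After the map the element type is a tuple, so the implicit
      // conversion to PairRDDFunctions applies and reduceByKey is available.
      val pairs: RDD[(String, String)] = lines.map(toPair)
      val reduced: RDD[(String, String)] = pairs.reduceByKey(_ + _)
      reduced.collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Note that the values here are still Strings, so `_ + _` joins them as text instead of adding numbers.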


#3

Yes, true. I converted the data into (K, V) pairs using the following code:

var newData = data.map(x=>{var d=x.split(",");(d(0),d(1))})

After this, reduceByKey worked fine :)
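
One caveat for anyone copying the map above: d(1) is a String, so the pairs are (String, String) and reducing with x + y concatenates the values as text. If the goal is a numeric total per city, the value probably needs a toInt first. A rough sketch under that assumption follows. The object name `CityTotals`, the `parseLine` helper, the local master, and the blank-line filter are illustrative choices of mine; only the file path comes from the question:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object CityTotals {
  // Hypothetical helper: parse "MU,6" into ("MU", 6) so the values are numeric.
  def parseLine(line: String): (String, Int) = {
    val fields = line.split(",")
    (fields(0).trim, fields(1).trim.toInt)
  }

  def main(args: Array[String]): Unit = {
    // Local master is an assumption so the sketch runs without a cluster.
    val conf = new SparkConf().setAppName("CityTotals").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val totals = sc.textFile("dataset/cityData.txt") // path from the question
        .filter(_.trim.nonEmpty) // skip blank lines, if any
        .map(parseLine)          // RDD[(String, Int)]
        .reduceByKey(_ + _)      // Int addition, not String concatenation
        .sortByKey()             // stable, alphabetical output order
      totals.collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

For the dataset posted in #1, this should print (BH,7), (CK,8), (DK,7), (KR,7), (MK,7), (MU,12), one pair per line.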


#4