Could someone confirm whether the code below is correct for the following scenario?

apache-spark

#1

Problem Scenario: You have been given two files.

spark16/file1.txt
1,9,5
2,7,4
3,8,3

spark16/file2.txt
1,g,h
2,i,j
3,k,l
Load these two files as Spark RDDs and join them to produce the results below:

(1, ((9,5), (g,h)))
(2, ((7,4), (i,j)))
(3, ((8,3), (k,l)))

Then write a code snippet that sums the second numeric column of the joined results (5+4+3 = 12).

Solution (code):

// Parse file1: key each row by the first column, keep (col2, col3) as an Int pair
val file1 = sc.textFile("/user/vanimai86/sparkexercise/file1.txt").
  map(r => (r.split(",")(0).toInt, r.split(",")(1).toInt, r.split(",")(2).toInt)).
  map(r => (r._1, (r._2, r._3)))

// Parse file2: key each row by the first column, keep (col2, col3) as a String pair
val file2 = sc.textFile("/user/vanimai86/sparkexercise/file2.txt").
  map(r => (r.split(",")(0).toInt, r.split(",")(1), r.split(",")(2))).
  map(r => (r._1, (r._2, r._3)))

// Join on the key: RDD[(Int, ((Int, Int), (String, String)))]
val outputjoin = file1.join(file2)
// Extract the second numeric column from the nested tuples and sum it
val outputjoinmap = outputjoin.map(r => r._2._1._2).reduce(_ + _)
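
If the nested tuple accessors are hard to read, here is a minimal equivalent sketch, reusing the same outputjoin RDD as above, that destructures the join result with pattern matching instead:

// outputjoin has type RDD[(Int, ((Int, Int), (String, String)))]
// Destructure each record as (key, ((a, b), (x, y))) and keep b,
// the second numeric column of file1
val secondCols = outputjoin.map { case (_, ((_, b), _)) => b }
val total = secondCols.reduce(_ + _)   // 5 + 4 + 3 = 12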


#2

There was a typo in the reduce; note the underscore placeholders, which are easy to drop. The line should read:

val outputjoinmap = outputjoin.map(r => r._2._1._2).reduce(_ + _)
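
Equivalently, since the mapped values are numeric, sum() (available on numeric RDDs via DoubleRDDFunctions) can replace the reduce. A quick sanity check, assuming the same outputjoin as above:

// Print the joined records to verify the shape
outputjoin.sortByKey().collect().foreach(println)
// (1,((9,5),(g,h)))
// (2,((7,4),(i,j)))
// (3,((8,3),(k,l)))

// sum() returns a Double, so this yields 12.0
val total = outputjoin.map(_._2._1._2).sum()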


#3