Value join is not a member of Iterator error while doing a join


#1

Hi Team,
I am trying to execute the code below.

val orderTuple = orderList.map(l => (l.split(",")(0).toInt, l.split(",")(2)))
orderTuple.take(5).foreach(println)
val orderItemsTuple = orderItemList.map(l => (l.split(",")(1).toInt, l.split(",")(4)))
orderItemsTuple.take(5).foreach(println)
val orderJoinTuple = {
  orderTuple.join(orderItemsTuple)
}

It gives me this error:
Error:(20, 18) value join is not a member of Iterator[(Int, String)]
orderTuple.join(orderItemsTuple)

Thanks for the help in advance!



#2

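The join() operation is only available on pair RDDs, that is, RDDs whose elements are key-value tuples (RDD[(K, V)]). The error message shows that orderTuple is an Iterator[(Int, String)], a plain Scala collection rather than an RDD, so join cannot be resolved on it. Both sides of the join must be pair RDDs. Try the sample code below, which reads the data with sc.textFile so that each map call returns an RDD: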

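// Load both data sets as RDDs of comma-separated lines
val orders = sc.textFile("/public/retail_db/orders")
val orderItems = sc.textFile("/public/retail_db/order_items")
// Key each order by field 0; keep the first 10 characters of field 1
val ordersMap = orders.map(order => {
  val o = order.split(",")
  (o(0).toInt, o(1).substring(0, 10))
})
// Key each order item by field 1; keep field 4 as a Float
val orderItemsMap = orderItems.map(orderItem => {
  val oi = orderItem.split(",")
  (oi(1).toInt, oi(4).toFloat)
})
// Both sides are now pair RDDs keyed by Int, so join is available
val ordersJoin = ordersMap.join(orderItemsMap)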
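
If the data is small and is meant to stay in local Scala collections rather than in Spark (an assumption about the original setup), a rough equivalent of an inner join can be sketched without RDDs. The object name LocalJoinSketch, the helper innerJoin, and the sample rows are all hypothetical and not part of Spark or of the retail_db data. An Iterator can only be traversed once, so it would need to be materialized first (for example with .toList) before being passed to a helper like this.

```scala
object LocalJoinSketch {
  // Inner join of two local key-value collections. The result has the same
  // shape that a pair RDD join returns: (key, (leftValue, rightValue)).
  def innerJoin[K, A, B](left: Seq[(K, A)], right: Seq[(K, B)]): Seq[(K, (A, B))] = {
    // Index the right side by key so each left record needs only a map lookup.
    val rightByKey: Map[K, Seq[B]] =
      right.groupBy(_._1).map { case (k, pairs) => (k, pairs.map(_._2)) }
    for {
      (k, a) <- left
      b <- rightByKey.getOrElse(k, Seq.empty[B])
    } yield (k, (a, b))
  }

  def main(args: Array[String]): Unit = {
    // Made-up sample rows for illustration only.
    val orders = Seq((1, "CLOSED"), (2, "PENDING_PAYMENT"), (3, "COMPLETE"))
    val orderItems = Seq((1, "299.98"), (2, "199.99"), (2, "250.0"))
    // Keys without a match on the other side (here key 3) are dropped.
    innerJoin(orders, orderItems).foreach(println)
  }
}
```

This only illustrates the join semantics on one machine. For data that does not fit in memory, the pair RDD approach above is the one to use.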