saveAsSequenceFile failing

Saving the final output after the join is failing. Please help me figure out how to save it as a sequence file.

Below is the final aggregation command.

val revenueOrderPerDay = rev.aggregateByKey((0.0,0))((acc,value) => (acc._1+value, acc._2+1), (acc,value) => (acc._1+value._1, acc._2+value._2))
revenueOrderPerDay: org.apache.spark.rdd.RDD[(String, (Double, Int))] = ShuffledRDD[12] at aggregateByKey at <console>:43

The following commands are failing:
revenueOrderPerDay.map(rec => (rec._1, rec._2)).saveAsSequenceFile("/user/nikkhiel123/scala/sequence")
<console>:46: error: value saveAsSequenceFile is not a member of org.apache.spark.rdd.RDD[(String, (Double, Int))]

revenueOrderPerDay.saveAsSequenceFile("/user/nikkhiel123/scala/sequence")
<console>:46: error: value saveAsSequenceFile is not a member of org.apache.spark.rdd.RDD[(String, (Double, Int))]
revenueOrderPerDay.saveAsSequenceFile("/user/nikkhiel123/scala/sequence")

revenueOrderPerDay.map(rec => (NullWritable.get(), rec)).saveAsSequenceFile("/user/nikkhiel123/scala/sequence")
<console>:49: error: value saveAsSequenceFile is not a member of org.apache.spark.rdd.RDD[(org.apache.hadoop.io.NullWritable, (String, (Double, Int)))]

Instead of doing the above transformation and action in the same step, try the below:

revenueOrderPerDay.saveAsSequenceFile("/user/nikkhiel123/scala/sequence")

It should work.

Thanks, Ravi.

I tried that as well, but it failed too. Please see the second failed command in my original post.

@N_Chakote - Not sure why you are not able to saveAsSequenceFile; below is the full code, which works fine including the save at the end.

# Read orders and key each record by order_id (field 0)
orderRDD = sc.textFile("/user/gnanaprakasam/sqoop_import/orders")
orderParsed = orderRDD.map(lambda rec: (rec.split(",")[0], rec))

# Read order_items and key each record by order_item_order_id (field 1)
orderItemRDD = sc.textFile("/user/gnanaprakasam/sqoop_import/order_items")
orderItemParsed = orderItemRDD.map(lambda rec: (rec.split(",")[1], rec))

# Join order_items with orders on order_id
orderItemjoinorder = orderItemParsed.join(orderParsed)

# Re-key by (order_date, order_id), with the item subtotal as the value
orderdtidamt = orderItemjoinorder.map(lambda rec: ((rec[1][1].split(",")[1], rec[1][1].split(",")[0]), float(rec[1][0].split(",")[4])))

# Total revenue per (order_date, order_id)
revenuereduceByKey = orderdtidamt.reduceByKey(lambda acc, value: acc + value)

# Drop the order_id, keeping (order_date, order_revenue)
revenuedateamt = revenuereduceByKey.map(lambda rec: (rec[0][0], rec[1]))

# Per date: (total revenue, order count)
revenueaggregateByKey = revenuedateamt.aggregateByKey((0, 0), lambda acc, value: (acc[0] + value, acc[1] + 1), lambda total1, total2: (round((total1[0] + total2[0]), 2), total1[1] + total2[1]))

# Average revenue per day, rounded to two decimals
revenuePerDay = revenueaggregateByKey.map(lambda rec: (rec[0], round((rec[1][0] / rec[1][1]), 2)))

revenuePerDay.saveAsSequenceFile("/user/gnanaprakasam/pyspark/revenuePerDaySeq")
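
If you want to confirm which Writable types actually landed on disk, one way is to read the file back from spark-shell. A minimal sketch, assuming PySpark converted the str keys to Text and the float values to DoubleWritable (its default conversion for those types):

// Hedged check from spark-shell: read the sequence file back as (String, Double)
sc.sequenceFile[String, Double]("/user/gnanaprakasam/pyspark/revenuePerDaySeq").take(5)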

@gnanaprakasam
Thank you. The above code works for me too when saving the average revenue. But when I try to save the RDD revenueaggregateByKey (it returns org.apache.spark.rdd.RDD[(String, (Double, Int))]) using
revenueaggregateByKey.saveAsSequenceFile, it is failing. Can you please check?

Please help.

@N_Chakote - I am able to save with revenueaggregateByKey.saveAsSequenceFile.

Do you have the full code? Do you see any difference in yours?

Thanks, @gnanaprakasam.
I tried my own code and also copy-pasted Durga sir's, and the same thing happens. I will try again today and get back. In your case, did you launch Spark with yarn or local? And did it save as [Text, ArrayWritable]?

@N_Chakote - For Scala I am able to save as a text file, but not able to save as a sequence file.

But for Python I am able to save both.
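
For the record, the Scala failure is expected: saveAsSequenceFile is added to an RDD[(K, V)] by an implicit conversion that requires both the key and the value type to be convertible to Hadoop Writables. String, Double and Int each have such a conversion, but a tuple value like (Double, Int) does not, so the implicit never applies and the compiler reports that saveAsSequenceFile "is not a member" of RDD[(String, (Double, Int))]. PySpark converts Python objects to Writables itself at runtime, which is why the same save succeeds there. A minimal workaround sketch against the revenueOrderPerDay RDD from the original post (the comma-separated value encoding is just an illustrative choice, not the only option):

// Turn the (Double, Int) value into a String, which Spark knows how to write as Text
val seqReady = revenueOrderPerDay.map { case (day, (revenue, count)) => (day, s"$revenue,$count") }
seqReady.saveAsSequenceFile("/user/nikkhiel123/scala/sequence")

Alternatively, wrap the pair in a custom Writable; mapping the value to a String is just the smallest change that satisfies the implicit.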