Paired RDD from wholeTextFiles

val wholefiles = sc.wholeTextFiles("sqoop_import/orders")
wholefiles: org.apache.spark.rdd.RDD[(String, String)] = sqoop_import/orders MapPartitionsRDD[72] at wholeTextFiles at <console>:27

wholefiles.take(5).foreach(println)
(hdfs://filename, 1, 2013-07-25 00:00:00.0,11599,CLOSED
2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT
3,2013-07-25 00:00:00.0,12111,COMPLETE
4,2013-07-25 00:00:00.0,8827,CLOSED)

How can I get a paired RDD of (column 4, column 1) from the above data?

With wholeTextFiles, each record has the file name as the key and the contents of the entire file as the value. The value is one very large string, so you need to split it into lines and fields yourself using string-based APIs.
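
A minimal sketch of that string processing, assuming every line is comma-separated with at least four fields and that both columns can stay as strings. The helper name pairsFromContent is hypothetical, not part of Spark; it is written as a plain function so it can be tried on a sample string before being applied to the RDD.

```scala
// Hypothetical helper (not a Spark API): turns the full contents of one file
// into (column 4, column 1) pairs, one pair per well-formed line.
def pairsFromContent(content: String): List[(String, String)] =
  content
    .split("\\r?\\n")                    // one element per line
    .toList
    .map(_.split(",").map(_.trim))       // split each line into trimmed fields
    .collect {
      // keep only lines with at least four fields; blank or short lines are dropped
      case fields if fields.length >= 4 => (fields(3), fields(0))
    }

// In spark-shell, with wholefiles defined as in the question, the file name
// (the key) is ignored and each file's contents are expanded into pairs:
// val paired = wholefiles.flatMap { case (_, content) => pairsFromContent(content) }
```

If the file name is not needed at all, sc.textFile on the same path yields one record per line, which avoids holding each file as a single string.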