Getting java.lang.NullPointerException when calling saveAsTextFile

Hi Team,

import com.typesafe.config.{Config, ConfigFactory}
import org.apache.spark.{SparkConf, SparkContext}

object Factorial {
  def main(args: Array[String]): Unit = {
    val executionEnvironment = args(0)
    val props: Config = ConfigFactory.load()
    val conf = new SparkConf()
      .setAppName("Hello")
      .setMaster(props.getConfig(executionEnvironment).getString("executionMode"))
    val sc = new SparkContext(conf)
    val rdd = sc.textFile(args(1))
    println(rdd.count())
    val divideRdd = rdd.flatMap(x => x.split(" "))
    val xx = divideRdd.map(x => (x.replace(",", ""), 1))
    val yy = xx.reduceByKey((a, b) => a + b)
    println("Hello")
    yy.take(10).foreach(println) // Getting output up to this point
    yy.saveAsTextFile(args(2))
  }
}

(dynamic,1)
( (for,1)
17/04/29 22:16:09 INFO SparkContext: Starting job: saveAsTextFile at Sample.scala:26
17/04/29 22:16:09 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 0 is 143 bytes
17/04/29 22:16:09 INFO DAGScheduler: Got job 2 (saveAsTextFile at Sample.scala:26) with 1 output partitions
17/04/29 22:16:09 INFO DAGScheduler: Final stage: ResultStage 4 (saveAsTextFile at Sample.scala:26)
17/04/29 22:16:09 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 3)
17/04/29 22:16:09 INFO DAGScheduler: Missing parents: List()
17/04/29 22:16:09 INFO DAGScheduler: Submitting ResultStage 4 (MapPartitionsRDD[5] at saveAsTextFile at Sample.scala:26), which has no missing parents
17/04/29 22:16:09 INFO MemoryStore: Block broadcast_4 stored as values in memory (estimated size 48.7 KB, free 176.6 KB)
17/04/29 22:16:09 INFO MemoryStore: Block broadcast_4_piece0 stored as bytes in memory (estimated size 16.9 KB, free 193.6 KB)
17/04/29 22:16:09 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on localhost:64623 (size: 16.9 KB, free: 1809.7 MB)
17/04/29 22:16:09 INFO SparkContext: Created broadcast 4 from broadcast at DAGScheduler.scala:1006
17/04/29 22:16:09 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 4 (MapPartitionsRDD[5] at saveAsTextFile at Sample.scala:26)
17/04/29 22:16:09 INFO TaskSchedulerImpl: Adding task set 4.0 with 1 tasks
17/04/29 22:16:09 INFO TaskSetManager: Starting task 0.0 in stage 4.0 (TID 3, localhost, partition 0,NODE_LOCAL, 1894 bytes)
17/04/29 22:16:09 INFO Executor: Running task 0.0 in stage 4.0 (TID 3)
17/04/29 22:16:09 INFO deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
17/04/29 22:16:09 INFO deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
17/04/29 22:16:09 INFO deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
17/04/29 22:16:09 INFO deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
17/04/29 22:16:09 INFO ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 1 blocks
17/04/29 22:16:09 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
17/04/29 22:16:09 ERROR Executor: Exception in task 0.0 in stage 4.0 (TID 3)
java.lang.NullPointerException
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:404)

Please revisit this line. Are you trying to count the words that were already tokenized by the preceding flatMap? If so, the map only needs to pair each word with 1:

val xx = divideRdd.map(word => (word, 1))
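For reference, the same tokenize-and-count logic from the pipeline (flatMap on spaces, strip commas, pair with 1, reduce by key) can be sketched on a plain Scala collection without Spark. The input string here is a made-up example, not data from the original job:

```scala
object WordCountSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical input line standing in for one line of the text file
    val line = "for dynamic, for"

    val counts = line
      .split(" ")                                // tokenize on spaces (the flatMap step)
      .map(w => (w.replace(",", ""), 1))         // strip commas, pair each word with 1 (the map step)
      .groupBy(_._1)                             // group pairs by word
      .map { case (w, ps) => (w, ps.map(_._2).sum) } // sum the 1s, like reduceByKey((a, b) => a + b)

    println(counts) // Map(for -> 2, dynamic -> 1)
  }
}
```

This reproduces locally what reduceByKey does across partitions; if the local logic produces the pairs you expect, the map/flatMap chain itself is not the source of the exception.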