Executes DAG but doesn't display results



When I run the code below, it executes and builds a DAG, but when I perform an action it doesn't print any results.

scala> val path="/public/retail_db"
path: String = /public/retail_db

scala> val products = sc.textFile(path + "/products")
17/11/21 12:12:51 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 336.4 KB, free 336.4 KB)
17/11/21 12:12:51 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 28.3 KB, free 364.8 KB)
17/11/21 12:12:51 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 172.16.1.109:56376 (size: 28.3 KB, free: 511.1 MB)
17/11/21 12:12:51 INFO SparkContext: Created broadcast 0 from textFile at :29
products: org.apache.spark.rdd.RDD[String] = /public/retail_db/products MapPartitionsRDD[1] at textFile at :29

scala> val productsGroupByCategory = products.
| filter(rec=>rec.split(",")(4)==" ").
| map(products=>(products.split(",")(1).toInt,products)).groupByKey
17/11/21 12:13:14 INFO FileInputFormat: Total input paths to process : 1
productsGroupByCategory: org.apache.spark.rdd.RDD[(Int, Iterable[String])] = ShuffledRDD[4] at groupByKey at :33

scala> productsGroupByCategory.take(1).foreach(println)
17/11/21 12:13:36 INFO SparkContext: Starting job: take at :34
17/11/21 12:13:36 INFO DAGScheduler: Registering RDD 3 (map at :33)
17/11/21 12:13:36 INFO DAGScheduler: Got job 0 (take at :34) with 1 output partitions
17/11/21 12:13:36 INFO DAGScheduler: Final stage: ResultStage 1 (take at :34)
17/11/21 12:13:36 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 0)
17/11/21 12:13:36 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 0)
17/11/21 12:13:36 INFO DAGScheduler: Submitting ShuffleMapStage 0 (MapPartitionsRDD[3] at map at :33), which has no missing parents
17/11/21 12:13:36 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 5.1 KB, free 369.9 KB)
17/11/21 12:13:36 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.7 KB, free 372.6 KB)
17/11/21 12:13:36 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 172.16.1.109:56376 (size: 2.7 KB, free: 511.1 MB)
17/11/21 12:13:36 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1008
17/11/21 12:13:36 INFO DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[3] at map at :33)
17/11/21 12:13:36 INFO YarnScheduler: Adding task set 0.0 with 2 tasks
17/11/21 12:13:36 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, wn03.itversity.com, partition 0,NODE_LOCAL, 2158 bytes)
17/11/21 12:13:36 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, wn02.itversity.com, partition 1,NODE_LOCAL, 2158 bytes)
17/11/21 12:13:36 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on wn03.itversity.com:53734 (size: 2.7 KB, free: 511.1 MB)
17/11/21 12:13:36 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on wn02.itversity.com:50451 (size: 2.7 KB, free: 511.1 MB)
17/11/21 12:13:36 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on wn03.itversity.com:53734 (size: 28.3 KB, free: 511.1 MB)
17/11/21 12:13:36 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on wn02.itversity.com:50451 (size: 28.3 KB, free: 511.1 MB)
17/11/21 12:13:37 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 1100 ms on wn03.itversity.com (1/2)
17/11/21 12:13:37 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 1152 ms on wn02.itversity.com (2/2)
17/11/21 12:13:37 INFO DAGScheduler: ShuffleMapStage 0 (map at :33) finished in 1.173 s
17/11/21 12:13:37 INFO YarnScheduler: Removed TaskSet 0.0, whose tasks have all completed, from pool
17/11/21 12:13:37 INFO DAGScheduler: looking for newly runnable stages
17/11/21 12:13:37 INFO DAGScheduler: running: Set()
17/11/21 12:13:37 INFO DAGScheduler: waiting: Set(ResultStage 1)
17/11/21 12:13:37 INFO DAGScheduler: failed: Set()
17/11/21 12:13:37 INFO DAGScheduler: Submitting ResultStage 1 (ShuffledRDD[4] at groupByKey at :33), which has no missing parents
17/11/21 12:13:37 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 5.9 KB, free 378.4 KB)
17/11/21 12:13:37 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 3.0 KB, free 381.4 KB)
17/11/21 12:13:37 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on 172.16.1.109:56376 (size: 3.0 KB, free: 511.1 MB)
17/11/21 12:13:37 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1008
17/11/21 12:13:37 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 1 (ShuffledRDD[4] at groupByKey at :33)
17/11/21 12:13:37 INFO YarnScheduler: Adding task set 1.0 with 1 tasks
17/11/21 12:13:37 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 2, wn03.itversity.com, partition 0,PROCESS_LOCAL, 1894 bytes)
17/11/21 12:13:37 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on wn03.itversity.com:53734 (size: 3.0 KB, free: 511.1 MB)
17/11/21 12:13:37 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to wn03.itversity.com:37336
17/11/21 12:13:37 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 0 is 165 bytes
17/11/21 12:13:37 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 2) in 69 ms on wn03.itversity.com (1/1)
17/11/21 12:13:37 INFO YarnScheduler: Removed TaskSet 1.0, whose tasks have all completed, from pool
17/11/21 12:13:37 INFO DAGScheduler: ResultStage 1 (take at :34) finished in 0.070 s
17/11/21 12:13:37 INFO DAGScheduler: Job 0 finished: take at :34, took 1.338600 s
17/11/21 12:13:37 INFO SparkContext: Starting job: take at :34
17/11/21 12:13:37 INFO DAGScheduler: Got job 1 (take at :34) with 1 output partitions
17/11/21 12:13:37 INFO DAGScheduler: Final stage: ResultStage 3 (take at :34)
17/11/21 12:13:37 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 2)
17/11/21 12:13:37 INFO DAGScheduler: Missing parents: List()
17/11/21 12:13:37 INFO DAGScheduler: Submitting ResultStage 3 (ShuffledRDD[4] at groupByKey at :33), which has no missing parents
17/11/21 12:13:37 INFO MemoryStore: Block broadcast_3 stored as values in memory (estimated size 5.9 KB, free 387.3 KB)
17/11/21 12:13:37 INFO MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 3.0 KB, free 390.3 KB)
17/11/21 12:13:37 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on 172.16.1.109:56376 (size: 3.0 KB, free: 511.1 MB)
17/11/21 12:13:37 INFO SparkContext: Created broadcast 3 from broadcast at DAGScheduler.scala:1008
17/11/21 12:13:37 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 3 (ShuffledRDD[4] at groupByKey at :33)
17/11/21 12:13:37 INFO YarnScheduler: Adding task set 3.0 with 1 tasks
17/11/21 12:13:37 INFO TaskSetManager: Starting task 0.0 in stage 3.0 (TID 3, wn02.itversity.com, partition 1,PROCESS_LOCAL, 1894 bytes)
17/11/21 12:13:37 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on wn02.itversity.com:50451 (size: 3.0 KB, free: 511.1 MB)
17/11/21 12:13:37 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to wn02.itversity.com:49732
17/11/21 12:13:37 INFO TaskSetManager: Finished task 0.0 in stage 3.0 (TID 3) in 53 ms on wn02.itversity.com (1/1)
17/11/21 12:13:37 INFO YarnScheduler: Removed TaskSet 3.0, whose tasks have all completed, from pool
17/11/21 12:13:37 INFO DAGScheduler: ResultStage 3 (take at :34) finished in 0.054 s
17/11/21 12:13:37 INFO DAGScheduler: Job 1 finished: take at :34, took 0.059624 s
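
Both jobs above finish without printing anything, which is what take(1) would look like if no record survives the filter. A minimal sketch of how the predicate could be checked outside Spark, using plain Scala collections so it runs without a cluster. It assumes the comparison string in the filter is a single space. The sample rows, the object name FilterCheck and the helper names are hypothetical and are not taken from /public/retail_db/products.

```scala
// Hypothetical rows in a products-like layout
// (id, category, name, description, price, image).
// These values are made up for illustration only.
object FilterCheck {
  val sample: Seq[String] = Seq(
    "1,2,Sample Product A,,59.98,http://example.com/a",
    "2,2,Sample Product B,,,http://example.com/b"
  )

  // Same predicate shape as in the session: field 4 compared with a single space.
  def matchesSpace(rec: String): Boolean = rec.split(",")(4) == " "

  // Alternative predicate: field 4 compared with the empty string.
  def matchesEmpty(rec: String): Boolean = rec.split(",")(4) == ""

  def main(args: Array[String]): Unit = {
    // Number of sample rows each predicate keeps.
    println(s"kept by == \" \": ${sample.count(matchesSpace)}")
    println(s"kept by == \"\":  ${sample.count(matchesEmpty)}")
  }
}
```

In the shell, the equivalent check would be to call count on the filtered RDD before the map and groupByKey; if it returns 0, there is nothing for take(1) to print.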