This is the 3rd bi-weekly plan to prepare for the HDPCD Spark certification
- Getting started with Spark
- Develop word count program
- Execute it on the cluster
- Understanding HDFS briefly
- Develop Spark program to compute daily revenue
- The following transformations and actions are covered
- map, flatMap, filter
- groupByKey, reduceByKey, aggregateByKey
- actions - collect, take
- Programs are developed using IntelliJ and built using sbt
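
A word count program along these lines is a common first Spark exercise. This is a minimal sketch, assuming input and output paths are passed as arguments (the object name and paths are placeholders, not from the plan):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object WordCount {
  def main(args: Array[String]): Unit = {
    // local[*] is for testing on a laptop; on the cluster the master
    // would typically come from spark-submit instead
    val conf = new SparkConf().setAppName("WordCount").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // args(0): input path (an HDFS path when run on the cluster)
    val lines = sc.textFile(args(0))

    val counts = lines.
      flatMap(line => line.split(" ")).  // one record per word
      map(word => (word, 1)).            // pair each word with a count of 1
      reduceByKey(_ + _)                 // sum counts per word

    // args(1): output directory
    counts.saveAsTextFile(args(1))
    sc.stop()
  }
}
```

The same flatMap/map/reduceByKey pipeline is the pattern reused in the revenue program below, just with different keys and values.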
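
The daily revenue program can be sketched with the transformations listed above. This assumes CSV inputs shaped like the common retail example (orders as `order_id,order_date,customer_id,status`; order items as `item_id,order_id,product_id,quantity,subtotal,price`) — the schemas and paths are assumptions, not from the plan:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object DailyRevenue {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("DailyRevenue").setMaster("local[*]"))

    // (order_id, order_date) for completed orders only
    val orders = sc.textFile(args(0)).
      map(_.split(",")).
      filter(o => o(3) == "COMPLETE").
      map(o => (o(0).toInt, o(1)))

    // (order_id, subtotal) from order items
    val orderItems = sc.textFile(args(1)).
      map(_.split(",")).
      map(oi => (oi(1).toInt, oi(4).toFloat))

    // join on order_id, re-key by date, then sum subtotals per day
    val dailyRevenue = orders.join(orderItems).
      map { case (_, (date, subtotal)) => (date, subtotal) }.
      reduceByKey(_ + _)

    dailyRevenue.saveAsTextFile(args(2))
    sc.stop()
  }
}
```

reduceByKey is used here because the aggregation is a simple sum; aggregateByKey would fit if the per-day result needed a different type than the input values (e.g. computing both revenue and order count in one pass).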
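
For the sbt build, a minimal `build.sbt` might look like the following sketch (the project name and the Spark/Scala versions are assumptions — they should match the versions on the target cluster):

```scala
// build.sbt -- minimal build definition for a Spark application
name := "spark-demo"

version := "0.1"

// Scala version must match the one the Spark distribution was built with
scalaVersion := "2.11.12"

// "provided" keeps spark-core out of the assembled jar,
// since the cluster supplies Spark at runtime
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.0" % "provided"
```

Running `sbt package` then produces a jar that can be submitted to the cluster with `spark-submit`.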
Here is the playlist for reference (videos 4 through 9)