Step 03 - Getting Started with Spark

This is the 3rd bi-weekly plan to prepare for the HDPCD Spark certification.

  • Getting started with Spark
  • Develop a word count program
  • Execute it on the cluster
  • Understand HDFS briefly
  • Develop a Spark program to compute daily revenue
  • Transformations and actions covered:
    • Transformations - map, flatMap, filter
    • Transformations - groupByKey, reduceByKey, aggregateByKey
    • Actions - collect, take
  • Programs are developed using IntelliJ and built using sbt
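The word count program mentioned above can be sketched roughly as follows in Scala. This is a minimal illustration, not the course's exact code: the object name, input path, and `local[*]` master are placeholders, and it assumes `spark-core` is on the sbt classpath.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object WordCount {
  def main(args: Array[String]): Unit = {
    // Placeholder master; on a cluster this is supplied via spark-submit
    val conf = new SparkConf().setAppName("WordCount").setMaster("local[*]")
    val sc = new SparkContext(conf)

    val counts = sc.textFile("input.txt")       // placeholder input path
      .flatMap(_.split("\\s+"))                 // split each line into words
      .filter(_.nonEmpty)                       // drop empty tokens
      .map(word => (word, 1))                   // pair each word with a count of 1
      .reduceByKey(_ + _)                       // sum counts per word

    counts.take(10).foreach(println)            // action: bring a sample to the driver
    sc.stop()
  }
}
```

Note the mix of transformations (flatMap, filter, map, reduceByKey) and an action (take): transformations are lazy and nothing executes until the action runs.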
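The byKey transformations listed above differ mainly in how values are combined per key. A hedged sketch below contrasts them on order-item-style lines, as might be used for the daily revenue exercise; the file name and column positions (order_id in field 1, subtotal in field 4) are assumptions for illustration only.

```scala
// Assumes sc is an existing SparkContext and order_items.csv is a placeholder path
val orderItemRevenue = sc.textFile("order_items.csv")
  .map(_.split(","))
  .map(r => (r(1).toInt, r(4).toFloat))         // (order_id, subtotal)

// reduceByKey: one associative function, combines map-side before the shuffle
val rev1 = orderItemRevenue.reduceByKey(_ + _)

// aggregateByKey: separate seq/comb functions allow a different result type,
// here accumulating (sum, count) per order
val rev2 = orderItemRevenue.aggregateByKey((0.0f, 0))(
  (acc, v) => (acc._1 + v, acc._2 + 1),
  (a, b)   => (a._1 + b._1, a._2 + b._2)
)

// groupByKey: shuffles every value; prefer reduceByKey for plain aggregations
val rev3 = orderItemRevenue.groupByKey().mapValues(_.sum)
```

The usual guidance is to reach for reduceByKey or aggregateByKey over groupByKey when aggregating, since they combine values on each executor before shuffling.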

Here is the playlist for reference (videos 4 through 9).