Transform, Stage and Store - Spark with Python


Originally published at:

As part of this lesson, we will see how to transform, stage and store data using Spark’s core transformations and actions:

- Creating Resilient Distributed Datasets (RDDs)
- Row level transformations – map, filter, flatMap
- Joining Data Sets
- Performing Aggregations
- Sorting data
- Assigning ranks
- Set operations
- Writing RDDs back to HDFS
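
Several of the topics listed above can be sketched as one small pipeline: create RDDs, apply row level transformations, join, aggregate, sort, and write the result. This is a minimal sketch and not the lesson's own code. The CSV layouts for orders and order items, the status values, the helper names (`parse_order`, `parse_item`, `is_completed`, `format_record`, `daily_revenue`, `main`) and the output path are all assumptions made for illustration.

```python
def parse_order(line):
    """Turn an order line into (order_id, (order_date, order_status)).

    Assumed layout: order_id,order_date,customer_id,order_status
    """
    fields = line.split(",")
    return int(fields[0]), (fields[1], fields[3])


def parse_item(line):
    """Turn an order item line into (order_id, subtotal).

    Assumed layout: item_id,order_id,product_id,quantity,subtotal,price
    """
    fields = line.split(",")
    return int(fields[1]), float(fields[4])


def is_completed(order):
    """Keep only orders whose status is COMPLETE or CLOSED (assumed values)."""
    _, (_, status) = order
    return status in ("COMPLETE", "CLOSED")


def format_record(record):
    """Render a (date, revenue) pair as one CSV line for output."""
    date, revenue = record
    return "{},{:.2f}".format(date, revenue)


def daily_revenue(sc, order_lines, item_lines):
    """Build an RDD of (order_date, revenue) pairs sorted by date."""
    # Creating RDDs from in-memory collections
    orders = sc.parallelize(order_lines).map(parse_order).filter(is_completed)
    items = sc.parallelize(item_lines).map(parse_item)

    # Joining on order_id gives (order_id, ((date, status), subtotal))
    joined = orders.join(items)

    # Aggregating subtotals per date, then sorting by the date key
    by_date = joined.map(lambda kv: (kv[1][0][0], kv[1][1]))
    return by_date.reduceByKey(lambda a, b: a + b).sortByKey()


def main():
    # Imported here so the helpers above can be used without Spark installed
    from pyspark import SparkContext

    sc = SparkContext("local[*]", "transform-stage-store")
    order_lines = [
        "1,2013-07-25,11599,CLOSED",
        "2,2013-07-25,256,PENDING_PAYMENT",
        "3,2013-07-26,12111,COMPLETE",
    ]
    item_lines = [
        "1,1,957,1,299.98,299.98",
        "2,2,1073,1,199.99,199.99",
        "3,3,502,5,250.00,50.00",
    ]
    result = daily_revenue(sc, order_lines, item_lines)
    # Hypothetical output path; saveAsTextFile fails if it already exists
    result.map(format_record).saveAsTextFile("/tmp/daily_revenue")
    sc.stop()
```

Calling `main()` from a script run with `spark-submit` or a local PySpark installation executes the pipeline. On a cluster, `sc.textFile` with an HDFS path would typically replace `sc.parallelize`, and the output path would point at HDFS as well.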