Transform, Stage and Store - Spark with Python



Originally published at: https://kaizen.itversity.com/lessons/transform-stage-and-store-spark-with-python/

As part of this lesson, we will see how to transform, stage, and store data using Spark's core transformations and actions. The lesson covers:

- Creating Resilient Distributed Datasets (RDDs)
- Row-level transformations: map, filter, flatMap
- Joining data sets
- Performing aggregations
- Sorting data
- Assigning ranks
- Set operations
- Writing RDDs back to HDFS
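To give a feel for how several of these operations chain together, here is a minimal sketch using the PySpark RDD API. It is not taken from the lesson itself: the record layouts, the "COMPLETE" status value, the function names, and the commented-out output path are all assumptions made for illustration. It touches RDD creation, map, filter, join, aggregation, sorting, and writing out; flatMap, ranking, and set operations are left out.

```python
from operator import add


def parse_order(line):
    # Assumed (hypothetical) layout: order_id,order_date,customer_id,status
    order_id, order_date, _customer_id, status = line.split(",")
    return int(order_id), (order_date, status)


def parse_order_item(line):
    # Assumed (hypothetical) layout: order_id,product_id,subtotal
    order_id, _product_id, subtotal = line.split(",")
    return int(order_id), float(subtotal)


def to_csv(record):
    # Stage each (date, revenue) pair as a delimited line before storing it.
    order_date, revenue = record
    return "{},{:.2f}".format(order_date, revenue)


def daily_revenue(sc, order_lines, item_lines):
    """Return an RDD of 'date,revenue' lines for completed orders."""
    # Create RDDs from in-memory lists and parse each line (map).
    orders = sc.parallelize(order_lines).map(parse_order)
    items = sc.parallelize(item_lines).map(parse_order_item)
    # Keep only completed orders (filter).
    completed = orders.filter(lambda kv: kv[1][1] == "COMPLETE")
    # Join on order_id: (order_id, ((order_date, status), subtotal)).
    joined = completed.join(items)
    # Re-key by date, then sum the subtotals per date (aggregation).
    by_date = joined.map(lambda kv: (kv[1][0][0], kv[1][1]))
    totals = by_date.reduceByKey(add)
    # Sort by date and format the records as text.
    return totals.sortByKey().map(to_csv)


def main():
    # Imported here so the helpers above can be used without Spark installed.
    from pyspark import SparkContext

    sc = SparkContext("local[*]", "transform-stage-store")
    orders = [
        "1,2013-07-25,11599,CLOSED",
        "2,2013-07-25,256,COMPLETE",
        "3,2013-07-26,12111,COMPLETE",
    ]
    items = ["1,957,299.98", "2,1073,199.99", "2,502,250.00", "3,403,129.99"]
    result = daily_revenue(sc, orders, items)
    for line in result.collect():
        print(line)
    # To store the result instead, something like the following would work;
    # the path is hypothetical and must not already exist:
    # result.saveAsTextFile("/user/training/daily_revenue")
    sc.stop()
```

Calling `main()` in an environment with PySpark installed runs the pipeline locally. Keeping the parsing and formatting logic in plain functions, separate from the Spark calls, makes those pieces easy to check without starting a Spark context.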