Sorting and Ranking using PySpark - by key

Originally published at: http://www.itversity.com/topic/sorting-and-ranking-using-pyspark-by-key/

Introduction to sorting and ranking by key

Sorting can be broadly categorized into global sorting and sorting by key. As part of this topic we will cover sorting by key.

Load data from HDFS and store results back to HDFS using Spark
Join disparate datasets together using Spark
Calculate aggregate statistics (e.g., average or sum) using…
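
To make "sorting by key" concrete, here is a minimal sketch in plain Python of the per-key logic: group the values for each key, sort each group in descending order, and attach a rank within the key. The function names `group_by_key` and `top_n_for_key` and the sample data are made up for illustration and are not taken from the original topic. The local grouping step only stands in for what Spark's `groupByKey` would do on an RDD of (key, value) pairs.

```python
from collections import defaultdict


def group_by_key(pairs):
    """Local stand-in for RDD.groupByKey: collect the values for each key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def top_n_for_key(key, values, n):
    """Sort one key's values in descending order and keep the top n with ranks."""
    ranked = sorted(values, reverse=True)[:n]
    # Rank is 1-based and restarts for every key.
    return [(key, value, rank) for rank, value in enumerate(ranked, start=1)]


# Hypothetical (category, price) pairs, invented for this example.
sample = [
    ("Fitness", 59.99),
    ("Golf", 129.99),
    ("Fitness", 99.99),
    ("Golf", 24.99),
    ("Fitness", 29.99),
]

result = []
for key, values in sorted(group_by_key(sample).items()):
    result.extend(top_n_for_key(key, values, 2))

for row in result:
    print(row)

# With PySpark, assuming a hypothetical RDD of (key, value) pairs named
# pairs_rdd, the same function could be applied per key roughly like this:
#   pairs_rdd.groupByKey().flatMap(lambda kv: top_n_for_key(kv[0], kv[1], 2))
```

The ranking here is positional, so tied values receive different ranks. A dense rank, where ties share a rank, would need a small change to `top_n_for_key`.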