Sorting and Ranking using pyspark - global

Originally published at: http://www.itversity.com/topic/sorting-and-ranking-using-pyspark-global/

Introduction to sorting and ranking

Sorting can be broadly categorized into global and by key. As part of this topic we will cover global sorting.

- Load data from HDFS and store results back to HDFS using Spark
- Join disparate datasets together using Spark
- Calculate aggregate statistics (e.g., average or sum) using Spark
- Filter data…
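
To make the distinction between global and by-key sorting concrete, here is a minimal sketch in plain Python rather than Spark, so it runs without a cluster. The `products` records and all variable names are hypothetical and are not taken from the original topic. In PySpark, the global case is typically handled with RDD methods such as `sortByKey`, `top`, or `takeOrdered`; the sketch only illustrates the idea behind them.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample records: (category, product, price)
products = [
    ("books", "atlas", 30.0),
    ("games", "chess", 15.0),
    ("books", "novel", 12.5),
    ("games", "darts", 22.0),
]

# Global sorting: a single ordering across the whole dataset,
# here by price in descending order, regardless of category.
global_sorted = sorted(products, key=itemgetter(2), reverse=True)

# Global ranking: the top N records of that single ordering.
top_2 = global_sorted[:2]

# By-key sorting, for contrast: group the records by category,
# then order each group independently by price (descending).
# groupby needs its input sorted by the grouping key first.
by_category = {
    category: sorted(group, key=itemgetter(2), reverse=True)
    for category, group in groupby(
        sorted(products, key=itemgetter(0)), key=itemgetter(0)
    )
}

print(global_sorted)
print(top_2)
print(by_category)
```

Global sorting yields one ordered sequence for the entire dataset, while by-key sorting yields a separate ordered sequence per key. On a distributed dataset the global case is the more expensive of the two in general, because records from all partitions have to be compared against each other.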