Resilient Distributed Datasets - from collections

Originally published at: http://www.itversity.com/topic/resilient-distributed-datasets-from-collections/

Let us convert a Scala collection to an RDD using parallelize.

We need a SparkContext object to invoke parallelize. When we launch spark-shell, a SparkContext object named sc is created automatically. If we use sbt console or build an application, we need to create a SparkConf object as well as a SparkContext object.

Create Array val data…
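
The steps above can be sketched roughly as follows for the sbt console or standalone application case, where sc is not created for us. This is a minimal sketch, not the original post's code: the object name ParallelizeSketch, the helper sumWithSpark, the application name, the local[*] master, and the contents of data are all assumptions chosen for illustration. It assumes spark-core is on the classpath.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical object and helper names, used only for this illustration.
object ParallelizeSketch {

  // Converts a local Scala collection to an RDD with parallelize,
  // then adds up the elements on the cluster (or local threads).
  def sumWithSpark(sc: SparkContext, data: Array[Int]): Int = {
    val rdd = sc.parallelize(data)
    rdd.reduce(_ + _)
  }

  def main(args: Array[String]): Unit = {
    // Outside spark-shell we build the configuration and context ourselves.
    // App name and master are assumptions; adjust them for a real cluster.
    val conf = new SparkConf()
      .setAppName("parallelize-sketch")
      .setMaster("local[*]")
    val sc = new SparkContext(conf)

    try {
      // Sample data (assumed): the integers 1 to 100.
      val data = (1 to 100).toArray
      println(sumWithSpark(sc, data))
    } finally {
      // Release the resources held by the context.
      sc.stop()
    }
  }
}
```

In spark-shell the SparkConf and SparkContext lines are unnecessary, since sc already exists; calling sc.parallelize(data) directly should be enough there.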