Getting started with Spark

Originally published at: http://www.itversity.com/topic/getting-started-with-spark/

As part of this topic, we will see a brief overview of Spark. While HDFS is a distributed file system, Spark is an in-memory distributed computing engine. Spark provides a set of APIs that can be used from programming languages such as Scala, Python, and Java. We can use the REPL or CLI to validate code interactively…