Architecture of Spark

Originally published at: http://www.itversity.com/topic/architecture-of-spark-scala/

Let us look at the architecture of Spark.

https://youtu.be/ZI-FQ0ORItw

- Spark is a distributed computing engine.
- It works on many file systems, typically distributed ones.
- It uses the HDFS APIs to read files from the file system.
- It works seamlessly on HDFS, AWS S3, Azure Blob, etc.

Run a sample job

- Validate the files in HDFS:
  hadoop fs -ls /public/randomtextwriter/part-m-00000
- Launch spark…
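To make the point about file systems concrete, here is a minimal Scala sketch (not taken from the original post) of how one read call can serve different storage back ends: the URI scheme of the path decides which file system the HDFS APIs talk to. The object name `FileSystemSchemes`, the helper `countLines`, the `local[*]` master, and the temporary sample file are all assumptions made so the sketch runs without a cluster. The bucket and container names in the comments are placeholders.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

object FileSystemSchemes {
  // Count the lines of a text file. The URI scheme of `path` selects the
  // file system, for example (placeholder locations, not from the post):
  //   hdfs:///public/randomtextwriter/part-m-00000
  //   s3a://some-bucket/some/key
  //   wasbs://some-container@some-account.blob.core.windows.net/some/path
  def countLines(spark: SparkSession, path: String): Long =
    spark.sparkContext.textFile(path).count()

  def main(args: Array[String]): Unit = {
    // Assumption: local mode, so no Hadoop cluster is required for the sketch.
    val spark = SparkSession.builder()
      .appName("FileSystemSchemes")
      .master("local[*]")
      .getOrCreate()

    // A small local file stands in for data on a distributed file system.
    val sample = Files.createTempFile("sample", ".txt")
    Files.write(sample, "spark\nhdfs\ns3\n".getBytes("UTF-8"))

    // The same call works with a file:// URI as with hdfs://, s3a:// or wasbs://.
    println(countLines(spark, sample.toUri.toString))

    spark.stop()
  }
}
```

On a cluster, passing the HDFS path from the post to `countLines` should work the same way, assuming the file can be read as line-oriented text, which the post does not confirm. Reading from S3 or Azure Blob additionally requires the matching Hadoop connector and credentials on the classpath.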
