Data Ingest - real time, near real time and streaming analytics - Spark Streaming – Getting Started

Let us get started with Spark Streaming.

  • Spark Streaming is a Spark module that provides APIs for processing streaming data
  • It can be integrated with streaming ingestion technologies such as Flume and Kafka

Getting Started

Spark Streaming is used to process data continuously, as it arrives, rather than as a one-off batch job.

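  • It requires an entry point object called StreamingContext (it is not a web service), which is created with a batch interval
  • Unlike a SparkContext, a started StreamingContext runs perpetually, processing incoming data at regular intervals (the batch interval) until it is stopped
  • By default only one SparkContext can be active in a JVM at a time, so if a SparkContext is already running (for example in spark-shell) we need to stop it before launching a StreamingContext that creates its own; alternatively, the StreamingContext can be built on top of the existing SparkContext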
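
To make "processing data at regular intervals" concrete, here is a minimal plain-Python sketch of the micro-batch idea. It does not use the Spark API: the function names (micro_batches, word_counts), the sample records and the 10 second interval are all assumptions made up for illustration. It only shows how records that arrive within the same interval end up being processed together.

```python
from collections import Counter, defaultdict


def micro_batches(records, batch_seconds):
    """Group (timestamp_seconds, line) records into fixed-width batches.

    Returns a list of (batch_start, lines) pairs ordered by batch_start.
    This is an illustrative model only, not how Spark is implemented.
    """
    grouped = defaultdict(list)
    for ts, line in records:
        # Every record falls into the interval that contains its timestamp.
        batch_start = int(ts // batch_seconds) * batch_seconds
        grouped[batch_start].append(line)
    return sorted(grouped.items())


def word_counts(lines):
    """Count words across all lines of a single batch."""
    return Counter(word for line in lines for word in line.split())


# Hypothetical sample input: (arrival time in seconds, text line).
records = [
    (1, "spark streaming"),
    (4, "spark kafka"),
    (12, "flume kafka"),
    (27, "spark"),
]

# Assumed batch interval of 10 seconds: each batch is processed on its own.
results = [
    (start, dict(word_counts(lines)))
    for start, lines in micro_batches(records, 10)
]

for start, counts in results:
    print(start, counts)
```

The first two records share the interval starting at 0 seconds, so their words are counted together, while the later records land in separate batches. The sketch works on a finite list and then ends; a StreamingContext, by contrast, keeps receiving and processing data until it is stopped.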
