Data Ingest – real time, near real time and streaming analytics – Flume and Spark Streaming – Department Wise Traffic – Develop application

Integration – Flume and Spark Streaming

  • Read data from /opt/gen_logs/logs/access.log using Flume
  • Write both the raw (unprocessed) data and the streaming
    department-wise counts to HDFS
  • Development
    • Create a new Spark Streaming program
  • Run and validate
    • Ship the program to the cluster
    • Run the Flume agent
    • Run spark-submit with the Python program
    • Validate that output files are being generated in HDFS
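The department-count step above can be sketched in plain Python. This is a minimal, hypothetical sketch of the per-batch logic only: it assumes the gen_logs access.log lines follow the common Apache log format and that department pages sit under paths like `/department/<name>/...` (both assumptions, not confirmed by the source). In the actual streaming job this logic would run inside the DStream pipeline (map/filter/reduceByKey on the Flume stream); it is shown standalone here so it can be run and tested without a cluster.

```python
import re
from collections import Counter

# Assumed log format: ... "GET /department/<name>/... HTTP/1.1" ...
LOG_PATTERN = re.compile(r'"GET (\S+) HTTP')

def extract_department(line):
    """Return the department name from an access-log line, or None."""
    match = LOG_PATTERN.search(line)
    if not match:
        return None
    # e.g. /department/fitness/products -> ["", "department", "fitness", ...]
    parts = match.group(1).split("/")
    if len(parts) > 2 and parts[1] == "department":
        return parts[2]
    return None

def department_counts(lines):
    """Count hits per department for one micro-batch of log lines."""
    return Counter(d for d in map(extract_department, lines) if d)

# Sample lines in the assumed format:
sample = [
    '10.0.0.1 - - [01/Jan/2024] "GET /department/fitness/products HTTP/1.1" 200 1234',
    '10.0.0.2 - - [01/Jan/2024] "GET /department/footwear/products HTTP/1.1" 200 999',
    '10.0.0.3 - - [01/Jan/2024] "GET /department/fitness/categories HTTP/1.1" 200 456',
    '10.0.0.4 - - [01/Jan/2024] "GET /checkout HTTP/1.1" 200 100',
]
counts = department_counts(sample)
print(dict(counts))  # {'fitness': 2, 'footwear': 1}
```

In the cluster run, the same transformation would be applied to each RDD of the Flume-backed DStream before writing the counts (and the raw lines) to HDFS, while the Flume agent and spark-submit are started as separate processes as listed above.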
