Data Ingest - real-time, near-real-time, and streaming analytics - Flume and Kafka in streaming analytics

Flume and Kafka

  • The life cycle of streaming analytics

    • Get data from the source (Flume and/or Kafka)
    • Process data
    • Store it in the target
  • Kafka can be used as the messaging layer for most applications

  • But existing source applications need to be refactored to publish
    messages to Kafka

  • Source applications are often mission-critical and highly sensitive
    to any change

  • In that case, if the messages are already captured in web server logs,
    one can use Flume to read the messages from the logs and publish them
    to Kafka
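The logs-to-Kafka path described above is typically wired up with a Flume agent whose source tails the web server logs and whose sink is Kafka. A minimal sketch of such an agent configuration follows; the agent name, file paths, broker address, and topic name are assumptions you would replace with your own.

```properties
# Hypothetical Flume agent "a1": tail web server logs, publish to Kafka.
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# TAILDIR source follows the web server access logs.
a1.sources.r1.type = TAILDIR
a1.sources.r1.filegroups = fg1
a1.sources.r1.filegroups.fg1 = /var/log/httpd/access_log.*
a1.sources.r1.channels = c1

# In-memory channel buffers events between source and sink.
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000

# Kafka sink publishes each log line as a message to the given topic.
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.bootstrap.servers = localhost:9092
a1.sinks.k1.kafka.topic = web_logs
a1.sinks.k1.channel = c1
```

With this in place the source application is untouched: Flume reads what the web server already writes, and downstream consumers (for example a Spark streaming job) read from the Kafka topic.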
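The three-step life cycle above (get data from the source, process it, store it in the target) can be sketched in plain Python. This is only an illustration: the in-memory log lines stand in for a Kafka or Flume source, and the list stands in for a real target store such as HDFS or HBase; all names here are hypothetical.

```python
from typing import Iterator

def get_messages() -> Iterator[str]:
    # Stand-in for the source: in production this would be a Kafka
    # consumer subscribed to a topic fed by Flume.
    web_log = [
        '10.0.0.1 GET /index.html 200',
        '10.0.0.2 GET /missing 404',
        '10.0.0.1 GET /about.html 200',
    ]
    yield from web_log

def process(line: str) -> dict:
    # Parse one web server log line into a structured record.
    ip, method, path, status = line.split()
    return {'ip': ip, 'method': method, 'path': path, 'status': int(status)}

def store(record: dict, target: list) -> None:
    # Stand-in for the target store (e.g. HDFS, HBase, or a database).
    target.append(record)

target_store: list = []
for message in get_messages():    # 1. get data from the source
    record = process(message)     # 2. process the data
    store(record, target_store)   # 3. store it in the target
```

Each stage is isolated behind its own function, so swapping the stand-ins for a real consumer and a real sink would not change the processing step in the middle.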
