I am trying to stream Flume data to a Spark sink and use it to compute a per-department count for gen_logs.
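For context, the per-department counting works on gen_logs-style web request lines. A minimal, Spark-free sketch of the extraction and counting step (the sample lines and log format here are my assumption of what gen_logs emits, not actual output):

```python
import re
from collections import Counter

# Hypothetical sample records in a gen_logs-style access-log format (assumed).
lines = [
    '10.0.0.1 - - [10/Oct/2017:10:00:00] "GET /department/fitness/products HTTP/1.1" 200 1234',
    '10.0.0.2 - - [10/Oct/2017:10:00:01] "GET /department/footwear/products HTTP/1.1" 200 2345',
    '10.0.0.3 - - [10/Oct/2017:10:00:02] "GET /department/fitness/categories HTTP/1.1" 200 456',
]

def department(line):
    """Extract the department name from a request line, or None if absent."""
    m = re.search(r'/department/([^/\s"]+)', line)
    return m.group(1) if m else None

# Equivalent of the map/reduceByKey step done per batch in the streaming job.
counts = Counter(d for d in map(department, lines) if d)
```

In the actual streaming job, this same extract-then-count logic would run on each DStream batch instead of a static list.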
As mentioned in the tutorial, we need to pass the required jars to the spark-submit command.
I am using the command below:
spark-submit --master yarn \
  --conf spark.ui.port=12890 \
  --jars "/usr/hdp/188.8.131.52-1245/spark/lib/spark-streaming-flume_2.10-1.6.2.jar,/usr/hdp/184.108.40.206-1245/spark/lib/spark-streaming-flume-sink_2.10-1.6.2.jar,/usr/hdp/220.127.116.11-292/flume/lib/flume-ng-sdk-18.104.22.168.6.5.0-292.jar" \
  /home/anujidgupta/python_demo/streamingFlumeDeptCount.py \
  gw02.itversity.com 8123 \
  /user/anujidgupta/streamingFlumeDeptCnt1/cnt
Please note that the flume-ng-sdk jar is under a different version directory (/usr/hdp/22.214.171.124-292) than the other required jars, which are under /usr/hdp/126.96.36.199-1245/.
While running, I am getting empty output files.
Please suggest whether the issue is due to the jar file locations or something else.