Lec#147 - Error running KafkaStreamingDepartmentCount

apache-spark
spark-submit

#1

I am getting the following error while running KafkaStreamingDepartmentCount in the labs shell.

[jayshawusa@gw03 libs]$ hostname
gw03

Error:
18/02/27 21:02:00 INFO BlockManagerMasterEndpoint: Registering block manager wn06.itversity.com:45557 with 511.1 MB RAM, BlockManagerId(2, wn06.itversity.com, 45557)
18/02/27 21:02:01 INFO YarnClientSchedulerBackend: Registered executor NettyRpcEndpointRef(null) (wn05.itversity.com:35561) with ID 1
18/02/27 21:02:01 INFO BlockManagerMasterEndpoint: Registering block manager wn05.itversity.com:56462 with 511.1 MB RAM, BlockManagerId(1, wn05.itversity.com, 56462)
18/02/27 21:02:01 INFO YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
Exception in thread "main" java.lang.NoSuchMethodError: scala.Predef$.ArrowAssoc(Ljava/lang/Object;)Ljava/lang/Object;
at KafkaStreamingDepartmentCount$.main(KafkaStreamingDepartmentCount.scala:18)
at KafkaStreamingDepartmentCount.main(KafkaStreamingDepartmentCount.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
18/02/27 21:02:01 INFO SparkContext: Invoking stop() from shutdown hook
18/02/27 21:02:01 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null}

My build.sbt:
name := "retail"
version := "1.0"
scalaVersion := "2.11.0"

libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "1.6.3"
libraryDependencies += "org.apache.spark" % "spark-streaming_2.10" % "1.6.3"
libraryDependencies += "org.apache.spark" % "spark-streaming-flume_2.10" % "1.6.3"
libraryDependencies += "org.apache.spark" % "spark-streaming-flume-sink_2.10" % "1.6.3"
libraryDependencies += "org.scala-lang" % "scala-library" % "2.10.6"
libraryDependencies += "org.apache.commons" % "commons-lang3" % "3.3.2"
libraryDependencies += "org.apache.spark" % "spark-streaming-kafka_2.10" % "1.6.3"

Note: I tried both 1.6.2 and 1.6.3 for spark-streaming-kafka and get the same error either way.
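From what I can tell, a NoSuchMethodError on scala.Predef$.ArrowAssoc (the method behind the -> syntax) is the classic symptom of a Scala binary-version mismatch: scalaVersion above is 2.11.0, while every Spark dependency is a _2.10 artifact, and Spark 1.6.3 itself is built against Scala 2.10. A version-consistent build.sbt, as a sketch assuming the lab's Spark 1.6.3 runs on Scala 2.10, might look like this (%% makes sbt append the _2.10 suffix automatically, and the explicit scala-library dependency can be dropped since sbt adds it itself):

name := "retail"

version := "1.0"

scalaVersion := "2.10.6"

// %% appends the Scala binary suffix (_2.10 here) taken from scalaVersion,
// so the artifacts can never drift out of sync with the compiler version
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.3"
libraryDependencies += "org.apache.spark" %% "spark-streaming" % "1.6.3"
libraryDependencies += "org.apache.spark" %% "spark-streaming-flume" % "1.6.3"
libraryDependencies += "org.apache.spark" %% "spark-streaming-flume-sink" % "1.6.3"
libraryDependencies += "org.apache.spark" %% "spark-streaming-kafka" % "1.6.3"
libraryDependencies += "org.apache.commons" % "commons-lang3" % "3.3.2"

With this, sbt package should produce target/scala-2.10/retail_2.10-1.0.jar rather than retail_2.11-1.0.jar.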

My command:

spark-submit --class KafkaStreamingDepartmentCount \
  --master yarn --conf spark.ui.port=12569 \
  --jars "/usr/hdp/2.5.0.0-1245/kafka/libs/kafka_2.10-0.8.2.1.jar,/usr/hdp/2.5.0.0-1245/kafka/libs/spark-streaming-kafka_2.10-1.6.2.jar,/usr/hdp/2.5.0.0-1245/kafka/libs/metrics-core-2.2.0.jar" \
  /home/jayshawusa/retail_2.11-1.0.jar yarn-client nn01.itversity.com:2181,nn02.itversity.com:2181,rm01.itversity.com:2181
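Note the artifact suffix here: the jar being submitted is retail_2.11-1.0.jar, i.e. a Scala 2.11 build, while all the bundled Kafka/Spark jars in --jars are _2.10 builds. If the project is rebuilt for Scala 2.10 as sketched above, the command should need only the new jar path (retail_2.10-1.0.jar is the name sbt would generate under target/scala-2.10, not a file that exists yet):

spark-submit --class KafkaStreamingDepartmentCount \
  --master yarn --conf spark.ui.port=12569 \
  --jars "/usr/hdp/2.5.0.0-1245/kafka/libs/kafka_2.10-0.8.2.1.jar,/usr/hdp/2.5.0.0-1245/kafka/libs/spark-streaming-kafka_2.10-1.6.2.jar,/usr/hdp/2.5.0.0-1245/kafka/libs/metrics-core-2.2.0.jar" \
  target/scala-2.10/retail_2.10-1.0.jar yarn-client nn01.itversity.com:2181,nn02.itversity.com:2181,rm01.itversity.com:2181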

My Flume conf file:
[jayshawusa@gw03 wslogstokafka]$ cat wskafka.conf

# example.conf: A single-node Flume configuration

# Name the components on this agent

wk.sources = ws
wk.sinks = kafka
wk.channels = mem

# Describe/configure the source

wk.sources.ws.type = exec
wk.sources.ws.command = tail -F /opt/gen_logs/logs/access.log

# Describe the sink

wk.sinks.kafka.type = org.apache.flume.sink.kafka.KafkaSink
wk.sinks.kafka.brokerList = nn01.itversity.com:6667,nn02.itversity.com:6667,rm01.itversity.com:6667
wk.sinks.kafka.topic = fkdemojayshaw

# Use a channel which buffers events in memory

wk.channels.mem.type = memory
wk.channels.mem.capacity = 1000
wk.channels.mem.transactionCapacity = 100

# Bind the source and sink to the channel

wk.sources.ws.channels = mem
wk.sinks.kafka.channel = mem
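For completeness, this is roughly how the agent is started and the topic checked from a second session (a sketch; the --conf directory and the Kafka bin path are assumptions based on the HDP 2.5 layout used above):

flume-ng agent --name wk --conf /etc/flume/conf --conf-file wskafka.conf

# in another terminal, confirm events are actually landing on the topic
/usr/hdp/2.5.0.0-1245/kafka/bin/kafka-console-consumer.sh \
  --zookeeper nn01.itversity.com:2181 --topic fkdemojayshaw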