On (Spark-submit) submitting this we are getting an error




On submitting this we are getting an error as mentioned below.

Error - Exception in thread "main" java.lang.ClassNotFoundException: com.mysql.jdbc.Driver
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)

Spark submit command -

spark-submit \
--master yarn \
--executor-memory 512m \
--total-executor-cores 1 \
--class com.company.HelloSrini.TestColumn \

We have hardcoded the driver class as com.mysql.jdbc.Driver.
Please suggest a fix for the same.


You need to use --jars and pass the MySQL JDBC jar as part of the spark-submit command.
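For example, the original command with the connector jar added would look like the following. This is a sketch: the connector path (/usr/share/java/mysql-connector-java.jar, the same one used later in this answer) and the application jar path (/path/to/your-app.jar) are assumptions — substitute the actual locations in your environment.

```shell
spark-submit \
  --master yarn \
  --executor-memory 512m \
  --class com.company.HelloSrini.TestColumn \
  --jars /usr/share/java/mysql-connector-java.jar \
  --driver-class-path /usr/share/java/mysql-connector-java.jar \
  /path/to/your-app.jar   # hypothetical path to your application jar
```

--jars ships the connector to the executors, while --driver-class-path puts it on the driver's own classpath; the ClassNotFoundException in the question typically means the driver side could not find the class.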

By default, MySQL jars are not available in Spark. For example, I launched spark-shell without passing the mysql jar as part of --jars:

spark-shell --master yarn \
  --conf spark.ui.port=12890

Here is the error when I try to import com.mysql.jdbc.Driver:

scala> import com.mysql.jdbc.Driver
<console>:25: error: object mysql is not a member of package com
       import com.mysql.jdbc.Driver

Pass the mysql JDBC jar while launching spark-shell.
Here is an example with spark-shell using --jars to load the mysql jar:

spark-shell --master yarn \
  --conf spark.ui.port=12890 \
  --jars /usr/hdp/ \
  --driver-class-path /usr/share/java/mysql-connector-java.jar

With the jar on the classpath, the import succeeds — see the message below; this time it did not throw the exception.

scala> import com.mysql.jdbc.Driver
import com.mysql.jdbc.Driver

Here is the code which works after launching spark-shell this way:

val jdbcDF = sqlContext.read.
  format("jdbc").
  option("driver", "com.mysql.jdbc.Driver").
  option("url", "jdbc:mysql://ms.itversity.com").
  option("dbtable", "retail_db.departments").
  option("user", "retail_user").
  option("password", "itversity").
  load()
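For anyone on Spark 2.x, the same read can be expressed against the SparkSession instead of sqlContext. A sketch, assuming the spark-shell-provided `spark` session and the same connection details as above:

```scala
// Sketch for Spark 2.x: `spark` is the SparkSession that spark-shell provides.
// URL, table, and credentials are the same values used in the answer above.
val jdbcDF = spark.read.
  format("jdbc").
  option("driver", "com.mysql.jdbc.Driver").  // the driver class hardcoded in the question
  option("url", "jdbc:mysql://ms.itversity.com").
  option("dbtable", "retail_db.departments").
  option("user", "retail_user").
  option("password", "itversity").
  load()

jdbcDF.show()  // preview the departments table
```

The --jars and --driver-class-path options are passed to spark-shell exactly the same way in Spark 2.x.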

This solution is tested on our state-of-the-art big data cluster, where the mysql jar is available under the sqoop libraries at the path specified above.