Not able to run Spark application jar in the lab

I am not able to run my Spark application jar in the lab.

For more info, please find the attached snapshot.

@jayantm1988 - Are you passing all the required parameters as arguments? It looks like you have only the input path in args.

Did I miss something?

spark-submit
--class WordCountSimple
--conf spark.ui.port=22222
wlabs_2.10-1.0.jar /public/randomtextwriter/part-m-00000

I am still getting the same issue. Can anybody help?

@jayantm1988 - Could you please provide the Scala code, application.conf, and build.sbt so we can reproduce the issue?

scala code:

/**
 * Created by jaymishr on 4/2/2017.
 */
import org.apache.spark.{SparkContext, SparkConf}

object getRevenue {

  def main(args: Array[String]): Unit = {
    val ordersFilePath = args(0)
    val ordersItemFilePath = args(1)
    val conf = new SparkConf().setAppName("getRevenue").setMaster("local")
    val sc = new SparkContext(conf)

    // Orders: keep only CLOSED or COMPLETE records, keyed by order id -> order date
    val ordersObj = sc.textFile(ordersFilePath)
    val ordersFilterRec = ordersObj.filter(rec => rec.split(",").last == "CLOSED" || rec.split(",").last == "COMPLETE")
    val ordersTupledObj = ordersFilterRec.map(rec => (rec.split(",")(0).toInt, rec.split(",")(1)))

    // Order items, keyed by order id -> item subtotal
    val ordersItemObj = sc.textFile(ordersItemFilePath)
    val ordersItemTupled = ordersItemObj.map(rec => (rec.split(",")(1).toInt, rec.split(",")(4).toFloat))

    // Join on order id, then sum the subtotals per order date
    val joinDataObj = ordersTupledObj.join(ordersItemTupled)
    val filterData = joinDataObj.map(rec => rec._2)
    val finalData = filterData.reduceByKey((acc, n) => acc + n)
  }
}
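
Note that nothing in getRevenue triggers execution: Spark transformations are lazy, so finalData is never actually computed. A minimal sketch of an action to add at the end of main (assuming you want the per-date revenue printed):

// collect() is an action; it forces Spark to run the job and
// brings the (date, revenue) pairs back to the driver
finalData.collect().foreach(println)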

build.sbt:

name := "retail"

version := "1.0"

scalaVersion := "2.10.6"

libraryDependencies += "mysql" % "mysql-connector-java" % "5.1.37"
libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "1.6.2"
libraryDependencies += "com.typesafe" % "config" % "1.3.1"
-----------------------------------------------------------------------------------
application.properties
dev.executionMode = local

prod.executionMode = yarn-client

spark-submit
--class WordCountSimple
--conf spark.ui.port=22222
wlabs_2.10-1.0.jar /public/randomtextwriter/part-m-00000

I changed the port number as well, but I am still getting the same issue.
Any solution?

@jayantm1988 -
Your code is for revenue calculation, but your spark-submit is for word count?

Use the line below in application.properties:
prod.executionMode = yarn-client

Pass prod as an argument when executing spark-submit.
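
For example, something like this (jar name and input path taken from your earlier command):

spark-submit
--class WordCountSimple
--conf spark.ui.port=22222
wlabs_2.10-1.0.jar /public/randomtextwriter/part-m-00000 prod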

Sorry, I pasted the wrong code. The correct code is:
scala code:
/**
 * Created by jaymishr on 3/30/2017.
 */
import com.typesafe.config.ConfigFactory
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object WordCountSimple {
  def main(args: Array[String]): Unit = {
    val inputPath = args(0)
    // Load application.properties from the classpath
    val props = ConfigFactory.load()

    // args(1) selects the environment (dev or prod); its executionMode becomes the Spark master
    val conf = new SparkConf().setAppName("Word Count Simple").setMaster(props.getConfig(args(1)).getString("executionMode"))
    val sc = new SparkContext(conf)
    val textFileObj = sc.textFile(inputPath)
    val tupleObj = textFileObj.map(rec => (rec.split(",").last, 1))
    val finalTuple = tupleObj.reduceByKey((acc, n) => acc + n)
    val resultWithTab = finalTuple.map(rec => rec._1 + "\t" + rec._2)
    val scalaSeq = resultWithTab.collect()
    scalaSeq.foreach(println)
  }
}

application.properties:
dev.executionMode = local

prod.executionMode = yarn-client
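
For reference, here is a minimal sketch of how these keys resolve through Typesafe Config (ConfigCheck is a hypothetical name for illustration; it assumes application.properties is on the classpath):

import com.typesafe.config.ConfigFactory

object ConfigCheck {
  def main(args: Array[String]): Unit = {
    // ConfigFactory.load() reads application.conf / application.properties from the classpath
    val props = ConfigFactory.load()
    println(props.getConfig("dev").getString("executionMode"))  // local
    println(props.getConfig("prod").getString("executionMode")) // yarn-client
  }
}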

build.sbt:
name := "wlabs"

version := "1.0"

scalaVersion := "2.10.6"

libraryDependencies += "mysql" % "mysql-connector-java" % "5.1.37"
libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "1.6.2"
libraryDependencies += "com.typesafe" % "config" % "1.3.1"

spark-submit
--class WordCountSimple
--conf spark.ui.port=22123
wlabs_2.10-1.0.jar /public/randomtextwriter/part-m-00000 prod

For more info, please find the attached snapshot.

If you are trying to process medium to large datasets, you need to run in yarn mode.

  • Make sure you set the master to yarn-client and then use spark-submit with --master yarn, as in the sketch below.
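
A sketch of the submit command in yarn mode (class, jar name, and paths assumed from the earlier posts):

spark-submit
--class WordCountSimple
--master yarn
--conf spark.ui.port=22123
wlabs_2.10-1.0.jar /public/randomtextwriter/part-m-00000 prod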

Even in yarn mode I am getting the same issue.

Can you paste your code here?

/**
 * Created by jaymishr on 3/30/2017.
 */
import com.typesafe.config.ConfigFactory
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object WordCountSimple {
  def main(args: Array[String]): Unit = {
    val inputPath = args(0)
    val props = ConfigFactory.load()

    val conf = new SparkConf().setAppName("Word Count Simple").setMaster(props.getConfig(args(1)).getString("executionMode"))
    val sc = new SparkContext(conf)
    val textFileObj = sc.textFile(inputPath)
    val tupleObj = textFileObj.map(rec => (rec.split(",").last, 1))
    val finalTuple = tupleObj.reduceByKey((acc, n) => acc + n)
    val resultWithTab = finalTuple.map(rec => rec._1 + "\t" + rec._2)
    val scalaSeq = resultWithTab.collect()
    scalaSeq.foreach(println)
  }
}

I am still not able to run the jar :frowning:

@jayantm1988 - Please find below a piece of code that is working fine.

application.conf

dev.executionMode = local
prod.executionMode = yarn-client

build.sbt

name := "wc"

version := "1.0"

scalaVersion := "2.10.6"

libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "1.6.2"

libraryDependencies += "com.typesafe" % "config" % "1.3.1"

scala code

import com.typesafe.config.ConfigFactory
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object wc {

  def main(args: Array[String]): Unit = {
    // args(2) selects the environment (dev or prod) from application.conf
    val props = ConfigFactory.load()
    val conf = new SparkConf().setMaster(props.getConfig(args(2)).getString("executionMode")).
      setAppName("word count")
    val sc = new SparkContext(conf)
    // args(0) is the input path, args(1) the output directory
    sc.textFile(args(0)).
      flatMap(rec => rec.split(" ")).
      map(rec => (rec, 1)).
      reduceByKey((agg, value) => agg + value).
      map(_.productIterator.mkString("\t")).
      saveAsTextFile(args(1))
  }
}

Sample spark-submit command:

spark-submit
--class wc
--master yarn
--conf spark.ui.port=22222
wc_2.10-1.0.jar /user/gnanaprakasam/wordcount.txt /user/gnanaprakasam/wordcountoutput prod
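
Note that args(1) here is an output directory, and saveAsTextFile will fail if that directory already exists, so pass a path that does not exist yet.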