Error in Cygwin while running Spark on Windows 8.1

@itversity
I am following the Scala and Spark workshop playlist. As shown in the videos, I am running Spark jobs in Cygwin from the sbt console, but Spark actions fail with this error:

Could not locate executable null\bin\winutils.exe

It also gives this error:

java.lang.NullPointerException
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)

Kindly suggest a solution for this.
Thank you.

Hi Chakote,

Hope this solves your issue.

Download winutils.exe from the link below:

https://github.com/steveloughran/winutils/blob/master/hadoop-2.7.1/bin/winutils.exe

Step 1: Create a directory in C: (from Cygwin):

mkdir -p /cygdrive/c/winutils/bin

Step 2: Copy winutils.exe from Downloads to C:\winutils\bin

Step 3: Update the environment variable and set the path.

Go to Start --> Edit environment variables for your account --> User variables

Click New --> variable name: HADOOP_HOME, value: C:\winutils. Point it at the folder containing bin, not at bin itself: Hadoop appends \bin\winutils.exe to this value, which is why your first error mentions null\bin\winutils.exe.

Then click OK.
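
If the environment variable does not take effect in an already-open Cygwin/sbt session, a minimal alternative sketch is to set Hadoop's hadoop.home.dir system property before creating the SparkContext; like HADOOP_HOME, it must point at the directory that contains bin:

// Hadoop's Shell utility checks the hadoop.home.dir system property
// before falling back to the HADOOP_HOME environment variable.
System.setProperty("hadoop.home.dir", "C:/winutils")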

Regards
Venkat

Thank you @venkateshm.
With the above procedure the first error is gone, but the second error still appears, i.e.

java.lang.NullPointerException
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)

Kindly help with this.
Thanks

Hi Chakote,

I'm not clear on your error; I want to know what job you are running.

Regards
venkat

Thanks Venkat for the reply.
Please see my code:

import org.apache.spark.{SparkConf, SparkContext}
val conf = new SparkConf().setAppName("daily_revenue").setMaster("local")
val sc = new SparkContext(conf)
val orders = sc.textFile("C:/Users/Chakote/Downloads/data-master/retail_db/orders")
orders.take(5).foreach(println)

The start of the error is as below:

java.lang.NullPointerException
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:404)
at org.apache.hadoop.util.Shell.run(Shell.java:379)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:678)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:661)
at org.apache.hadoop.fs.FileUtil.execCommand(FileUtil.java:1097)
at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:567)

Thank you.

Hi Chakote,

If you run in local mode, you have to prefix the input path with the file:// scheme; Spark then fetches it from the local filesystem. You missed that.

Try this code

import org.apache.spark.{SparkConf, SparkContext}
val conf = new SparkConf().setAppName("daily_revenue").setMaster("local")
val sc = new SparkContext(conf)
val orders = sc.textFile("file:///C:/Users/Chakote/Downloads/data-master/retail_db/orders")
orders.take(5).foreach(println)

Output:

1,2013-07-25 00:00:00.0,11599,CLOSED
2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT
3,2013-07-25 00:00:00.0,12111,COMPLETE
4,2013-07-25 00:00:00.0,8827,CLOSED
5,2013-07-25 00:00:00.0,11318,COMPLETE

Thanks Venkat.

I found the solution for this. My laptop had one version of Scala installed while the project was set up with a different one. After making both versions match, I am no longer getting the error. Thank you for your help.
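
For anyone hitting the same mismatch, a minimal build.sbt sketch. The Spark version here (1.6.3) is an assumption since the thread does not state it; the point is that scalaVersion must match the Scala version your Spark artifacts were built against (2.10.x for the default pre-built 1.6.x downloads):

// build.sbt -- keep scalaVersion in sync with Spark's Scala build;
// the %% operator appends the matching _2.10 suffix to the artifact name.
scalaVersion := "2.10.6"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.3"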

Best Regards.

I was also facing these issues.
Resolved by:
1- setting the winutils path
2- using Scala version 2.10.6