How to access the Spark shell (Scala/Python) in the CCA175 exam?

Hello All,

I am preparing for CCA175. I have installed the Quickstart VM on my laptop and started practicing. I interact with Spark in Scala via spark-shell (spark-shell --master local) and in Python via pyspark.

In the exam:

1. Can I access the Spark Scala and Python shells on the client machine? Can anyone who has attempted the exam tell me how to access the Spark shells on a multi-node cluster?

2. Will they provide access to Hue? If yes, will they give the host details and port number?

Hi @Raja_Shyam, in the CCA exam you will not need to access the Spark shells on the client machines. Even if you would like to access them, it works the same way as on your local Cloudera VM.

In the exam, you will be required to write Python and Scala code and run the .sh file that contains the command to run the program.

Thanks @pramodvspk. If there is a sample for running Spark programs via a .sh file, can you share it with me? Also, to run .sh files, do we need to explicitly set path variables in the exam?

@Raja_Shyam - Please go through this

Thanks @gnanaprakasam. I have seen this video, but it only covers Sqoop and Hive. How can I run a .sh file for Spark programs?

On my local machine I am practicing via spark-shell. If there is a way to do it via .sh, can you please suggest it?

@Raja_Shyam - I haven’t tried .sh for scala/python.

For Python, we can save the program as a “.py” file and submit the job using the spark-submit command. Below is the reference.
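As a rough illustration of that approach (a sketch, not taken from the linked reference): the snippet below writes a small word-count program to a hypothetical file named wordcount.py and shows, in a comment, how it could be submitted. The file name, the input/output paths and the `--master local` setting are assumptions for illustration only.

```shell
#!/bin/sh
# Write a small PySpark program to a hypothetical file, wordcount.py,
# inside a throwaway working directory.
workdir=$(mktemp -d)
cat > "$workdir/wordcount.py" <<'EOF'
import sys
from pyspark import SparkConf, SparkContext

# Input and output paths come from the command line.
input_path, output_path = sys.argv[1], sys.argv[2]

sc = SparkContext(conf=SparkConf().setAppName("WordCount"))
counts = (sc.textFile(input_path)
            .flatMap(lambda line: line.split())
            .map(lambda word: (word, 1))
            .reduceByKey(lambda a, b: a + b))
counts.saveAsTextFile(output_path)
sc.stop()
EOF

echo "wrote $workdir/wordcount.py"

# To submit it (needs a Spark installation; the paths are placeholders):
#   spark-submit --master local "$workdir/wordcount.py" /path/to/input /path/to/output
```

The program is a plain source file, so nothing has to be compiled before calling spark-submit.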

@pramodvspk - Pramod, any input on this?

@Raja_Shyam, a .sh file contains commands that can be executed on a command line. So it does not matter whether you are running commands for Sqoop, Hive, Spark with Python or Spark with Scala.

Please go through the video and try to run a .sh file; it is something you have to know before attempting the CCA exam.
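For anyone looking for a sample, here is a minimal sketch of what such a script might contain for Spark. The program path, the `--master local` setting and the fallback message are assumptions for illustration, not what the exam provides; the fallback only exists so the sketch can be tried on a machine without Spark.

```shell
#!/bin/sh
# Hypothetical wrapper script, e.g. saved as run_job.sh.
APP="${APP:-/home/user_id/wordcount.py}"   # hypothetical program path
MASTER="${MASTER:-local}"                  # same master as in local VM practice

run_job() {
  if command -v spark-submit >/dev/null 2>&1; then
    # Spark is installed: submit the program.
    spark-submit --master "$MASTER" "$APP"
  else
    # Spark is not installed here: only show the command.
    echo "spark-submit not found; would run: spark-submit --master $MASTER $APP"
  fi
}

# For a compiled Scala program the command takes a jar instead, e.g.
#   spark-submit --class <MainClass> --master local <app.jar>
run_job
```

The script is then invoked like any other shell script, whatever tool the command inside it belongs to.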

Hi @Raja_Shyam,

In the exam you can open the PySpark shell by typing ‘pyspark’ and the Spark shell with a Scala context by typing ‘spark-shell’ in the terminal. You can run the transformations and actions you want for the program.
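A small sketch of checking from the terminal that both launchers are on the PATH before starting one. The `--master local` flag in the comments is the one used with the Quickstart VM earlier in this thread; whether the exam environment needs a different master is not confirmed here.

```shell
#!/bin/sh
# Report whether a launcher is available on the PATH.
check_launcher() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: not found on PATH"
  fi
}

check_launcher pyspark       # Python shell
check_launcher spark-shell   # Scala shell

# To start a shell interactively, type one of:
#   pyspark --master local
#   spark-shell --master local
```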

Thanks @pramodvspk. I shall try as mentioned in the above link…

Thanks @ravi.tejarockon. Can I access it from the client machine?

But what about Scala source? I know that with Python you can run the .py source file directly via spark-submit, but with Scala you have to pass the jar of the compiled Scala program.

In the exam, do we have to compile the source and build the jar?

I know we have the sbt tool to help us, but will it be available in the exam?


@Raja_Shyam, your spark-submit command goes into a .sh file and then you can invoke the script. As far as I know, you do not need to write the shell script with the .sh extension yourself. They might give you a shell script that already has the spark-submit command, and you just have to invoke that script.

Executing a shell script is the same regardless of the technology. The content of the shell script should be valid commands that the shell can interpret at execution time.

@Raphael_L_Nascimento There is only one generic method to run Scala/Python programs, i.e. run the .sh program that is already prepared for you. No compilation or parameter passing is required; everything is in the script. Just run the .sh file.

Thank you so much, @itversity.

So we can run it via ./

Yes. ./ uses a relative path from the current directory. The file has to be present in the directory you run it from.

It is good practice to give the fully qualified path (e.g. /home/user_id/and_other_sub_dirs/
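To make the difference concrete, here is a small sketch that creates a throwaway script and runs it both ways. The script name demo.sh and the temporary directory are hypothetical; the point is only that the relative form depends on the current directory while the fully qualified form does not.

```shell
#!/bin/sh
# Create a throwaway script (hypothetical name demo.sh) in a temporary directory.
workdir=$(mktemp -d)
cat > "$workdir/demo.sh" <<'EOF'
#!/bin/sh
echo "script ran"
EOF
chmod +x "$workdir/demo.sh"   # the file needs execute permission to be run directly

# 1) Relative path: works only from the directory that holds the script.
rel_out=$(cd "$workdir" && ./demo.sh)

# 2) Fully qualified path: works from any directory.
abs_out=$("$workdir/demo.sh")

echo "relative: $rel_out"
echo "absolute: $abs_out"
```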

I understand that we have to choose either Scala or Python as the programming language.

Correct me if I’m wrong.

Ok… Thanks @itversity.

Hello, I am practicing Spark in the bigdata-labs of itversity. Will it make any difference when taking the CCA exam, since I am not using the Cloudera VM?

@Raja_Shyam, there is not much technical complexity to it: it is just a virtual machine with all the tools installed. Run them just as you do in the bigdata-labs, and for the Spark shells refer to the commands I mentioned.

No difference at all.