Cleared CCA175 on 12-Mar-2018



Thanks, Raman, for all your suggestions. Can you share the link to the practice exercises you mentioned?



Hi @Divyakot

They are not all in one place. Some are in the videos, and others are in posts on this forum. Check categories such as spark-exercise. You can also go through Arun's blog for more insight into the certification problems.


Thank you very much for sharing the link, @Raman_Kumar_Jha.


Hi @Raman_Kumar_Jha,
I took the exam yesterday but could not clear it because I made some silly mistakes. Can you tell me what port number we should use for the Sqoop problems during the exam?


Yes, Sublime Text is available in the exam environment. It is better to use it, because if there is a mistake in your queries you can edit them easily.

They will mention all the MySQL connection details in the problem itself. Just use them.

They will describe everything in the problem statement. If they want something specific they will definitely mention it.


It's okay @karthick_raja

You can try again. I also made many silly mistakes during the exam, but I corrected them within the allotted time.

Use a port number only if one is mentioned in the problem. You only need to use the MySQL connection parameters exactly as they are specified in the problem statement.
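To make the advice about connection parameters concrete, here is a minimal sketch in Python that assembles a `sqoop import` command from whatever the problem statement provides. The host, database, user, password, table, and target directory below are made-up placeholders, not exam values, and the helper name `sqoop_import_args` is hypothetical. The port is added to the JDBC URL only when one is given; when it is left out, the MySQL JDBC driver falls back to its default port, 3306.

```python
# Hypothetical helper: builds the argument list for a `sqoop import` call.
# All values passed in below are placeholders, not real exam details.
def sqoop_import_args(host, database, user, password, table, target_dir, port=None):
    # Append ":port" only when the problem statement names a port.
    authority = host if port is None else f"{host}:{port}"
    return [
        "sqoop", "import",
        "--connect", f"jdbc:mysql://{authority}/{database}",
        "--username", user,
        "--password", password,
        "--table", table,
        "--target-dir", target_dir,
    ]

# Example with no port mentioned in the (imaginary) problem statement.
args = sqoop_import_args(
    host="db.example.com",
    database="retail_db",
    user="exam_user",
    password="exam_password",
    table="orders",
    target_dir="/user/exam/orders",
)
print(" ".join(args))
```

Copying the host, database, and paths straight from the problem statement into such a command avoids most connection mistakes.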


Hi @Raman_Kumar_Jha,

Congratulations on clearing the certification.

I have two questions.

  1. What are the 96 scenario problems you practiced? Where can I get those problems?
  2. Will there be questions that must be answered in PySpark, using Python only? I don't know Python; I use Scala for programming.


@Raman_Kumar_Jha Can you tell me what host name we should use in the Sqoop command during the exam?


Sorry to hear that, @karthick_raja. Based on your experience, could you list the mistakes that someone is likely to make? That would be very helpful to all other aspirants.

@Raman_Kumar_Jha, if possible, could you also suggest some mistakes to avoid during the exam?



Thanks @naveenraj

You can attempt the problems in any language you want. No specific language is required. They just want the result in the form they ask for.

The 96 scenarios can be helpful. If you are already satisfied with your preparation, you can go through them as well. They will certainly help. Please find the link below.


@karthick_raja Go through the demo video on the official certification page once, and you will definitely get your answer. I think you made a mistake with the connection parameters in Sqoop. Just watch that video; it will help you :)


@Divyakot Some of the mistakes I made, and my suggestions:

  1. I got nervous at the start, but I kept telling myself that I could do it. It took me 30 minutes to finish 2 Sqoop questions, and at that point I felt I would not be able to clear the certification.

  2. Always copy the input and output paths from the problem statement.

  3. If you get stuck, take a minute to think about the solution. Stay calm.

  4. Try a different approach if one is not working.

  5. Skip a question immediately if you have already spent 10 minutes on it. Move to the next question; many of them are easy.

  6. Practice the questions and the different exercises well. Even if you already know the solution, practice it. It will boost your confidence.


@Raman_Kumar_Jha I was getting the “list index out of range” error in many questions during the CCA175 certification exam. Can you tell me why this error occurs?


Could someone post examples of reading and saving text files, compressed files, Avro files, and sequence files here?

Arun’s blog didn’t provide an example.

Avro has been mentioned many times, but how can I practice it in the lab? I got this error:
scala> import com.databricks.spark.avro._
<console>:25: error: object databricks is not a member of package com
import com.databricks.spark.avro._


@karthick_raja I am not sure, but if a record has only four fields and you use 4 as the index for the last field, this error can occur, because indexes start from 0 when we split a line on any delimiter. The last of four fields is at index 3.

I am not sure whether this is the mistake you made.
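As a small illustration of that off-by-one case, here is a sketch in plain Python (no Spark needed). The sample record and the helper `last_field` are made up for this example. It also guards against lines with fewer fields than expected, such as blank or malformed lines, which is another possible cause of the same error, though nobody here has confirmed that is what happened in the exam.

```python
# Hypothetical record with four comma-separated fields.
line = "1,2013-07-25,11599,CLOSED"
fields = line.split(",")

print(len(fields))   # number of fields; valid indexes run from 0 to len(fields) - 1
last = fields[3]     # last of four fields; fields[-1] works as well

# Using 4 as the index asks for a fifth field that does not exist.
try:
    fields[4]
except IndexError as err:
    message = str(err)
    print(message)

# Hypothetical guard: return None for lines that have too few fields.
def last_field(record, delimiter=",", expected=4):
    parts = record.split(delimiter)
    if len(parts) < expected:
        return None
    return parts[expected - 1]

print(last_field(line))
print(last_field("a,b"))
```

Checking the field count before indexing is one way to find out which input lines are shorter than expected.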



I am also getting the same error message.

@Vani_Cs, as you can see, the user above is getting the same error that I got.

Can you please try to fix this?



@Raman_Kumar_Jha Thank you very much for your reply.


@paslechoix @ethabhi2795

As the labs run on the Hortonworks distribution, for Avro files you have to launch spark-shell with the Avro package, as below:

spark-shell --packages com.databricks:spark-avro_2.10:2.0.1 --master yarn --conf spark.ui.port=22222

After this:

import com.databricks.spark.avro._

Now run DFname.write.avro("output_path")

But in the exam, as it runs on the Cloudera distribution, you only have to use:

import com.databricks.spark.avro._


Thank you @Raman_Kumar_Jha for this.

One more question:
Is the command below enough to launch multiple spark-shell windows?
spark-shell --master yarn

If not, what command should I use on the CCA 175 exam VM?



@Divyakot I would suggest launching only one spark-shell in the exam; otherwise it will lower the performance of the environment. Your command is perfectly fine for that.

I have never tried launching more than one spark-shell window, but I think you can try giving each shell its own unused UI port (any free port number, for example a 5-digit one up to 65535):
spark-shell --master yarn --conf spark.ui.port=22222