Cleared CCA 175 exam today

Thanks to Durga Gadiraju sir; without his support this would not have happened.

For preparation, I attended his Big Data course in June
and also followed his CCA 175 playlist.

My exam had 2 Sqoop questions, 3 Hive, 2 Spark Python, 2 Spark Scala, and 1 on Avro files.

Pointers for future exam takers

  1. Please learn how to run a .sh file. I could not run it, so I had to run each command in the .sh files individually.
  2. The resolution on the VM is very low, so check how to increase the terminal's font size: go to Preferences and increase the font size.
  3. Time management is important: attempt all the simple questions first, then the ones you find difficult.
  4. Set aside 15 minutes to cross-check all your answers; it is the deciding factor.
  5. The playlist material covers all the exam material adequately.
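
On pointer 1, here is a minimal sketch of two common ways to run a shell script. The file name `solution.sh` and its contents are hypothetical and are created here only so the example is self-contained; they are not taken from the exam.

```shell
# Hypothetical script, written to a temporary directory for illustration only.
workdir="$(mktemp -d)"
cat > "$workdir/solution.sh" <<'EOF'
#!/bin/bash
echo "step 1"
echo "step 2"
EOF

# Option 1: pass the file to the interpreter. No execute permission is needed.
out_bash="$(bash "$workdir/solution.sh")"

# Option 2: mark the file executable, then run it by its path.
chmod +x "$workdir/solution.sh"
out_exec="$("$workdir/solution.sh")"

echo "$out_bash"
echo "$out_exec"
```

If running a script by path reports "Permission denied", the execute bit is usually missing; `bash script.sh` sidesteps that.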

Here is the course link:

Please let me know if you have any other questions regarding the CCA examination.


Thank you Pramod, I have updated some of the details.

I don’t have any knowledge of Python. Is it necessary to know Python for the CCA 175 certification?

Yes, there will be 2 questions in Python. However, you do not have to write programs in Python; they give a code skeleton and you fill in the blanks with Python or Scala code to invoke Spark APIs.

Hi Pramod,

Congrats on clearing the certification!

Can you please tell me what kind of datasets are provided in the certification? Are they the same datasets present in MySQL, or do they provide other datasets?

Can you please share the questions that you got for the two objectives below in the certification?

  1. Calculate aggregate statistics (e.g., average or sum) using Spark

  2. Write a query that produces ranked or sorted data using Spark

Mostly they give a different dataset. Exam takers are not supposed to get into details about questions, datasets, etc. :)

Hi, the datasets are very similar but very large.

Hi Pramod,

I am planning to appear for the CCA175 exam in January and have been going through your CCA175 playlist on YouTube.
I just wanted to know: for the questions in the exam, are we supposed to write code, or execute the programs on the cluster?

And what will be checked for marks: the code, or the output of the statements/programs that we have executed?


Hi, yes, you have to write code for every question. The output and its format are checked to decide whether an answer is correct.


Are there any questions related to JSON?

Whether there are questions or not, you should prepare for JSON. It is important for your career.

I am referring to CCA.

Will there be a combination of Hive + JSON tasks in the exam?

We cannot say. I expect you are more likely to get questions on the Avro, Parquet, and sequence file formats than on JSON.

Suppose I answer a question in CCA175 but only 50% of the answer is correct. In that case, do we get partial marks, or zero for that answer?

What type of questions can we expect on Hive? Everybody who posted reviews said 3-4 questions are from Hive. But from the syllabus, I see there are not many topics on Hive. In my view, the 3rd section focuses on Avro, not on Hive.
What do they actually ask on Hive?

There is no partial credit; only a full answer counts. So the score will be zero.


Please follow the Hive topics taught by Durga sir in the playlist. The questions should be within that scope.

I’m very confident on those. I can answer if the questions are from what sir discussed in the playlist. I haven’t worked extensively on Hive before, so I have a little fear about clearing the certification. If something comes up that sir hasn’t covered, I will have to retake the exam :’(

Yes, you’re right. If a question comes from outside that scope, the certification becomes a toss-up. So take enough time until you have gained enough confidence. If you have already scheduled your exam, you can cancel and reschedule it up to 24 hours before.

Thanks, Ravi, for the confirmation.