Exam tips for CCA 175 takers


Hi All,

I am Pramod Sripada, a Master's student in Computer Science at Indiana University Bloomington. I am also a Cloudera Certified Apache Spark and Hadoop Developer. I took my CCA 175 exam recently, on 7th November 2016. I have been getting many queries about the examination from exam takers, so I decided to share some tips that should be useful for future CCA 175 exam takers.


  1. Durga sir’s CCA playlist covers the syllabus adequately. If you have worked through it, you will find the exam easy.
  2. For revision, if you feel that the playlist is too long, go through the private training playlist here: https://www.youtube.com/playlist?list=PLf0swTFhTI8rPoYMMZGs44FZX4qGbfjTu. It covers almost all the topics of the original CCA playlist, but in less depth.
  3. I strongly recommend a final revision using the CCA course on the itversity website. It demonstrates lots of examples, and reading the text will help you revalidate your knowledge.
  4. Be strong with Hive and Sqoop basics. Practice as many examples as possible in different scenarios.
  5. Practice evolving Avro schemas by imagining hypothetical scenarios until you are confident. Make sure you know all the Avro data types and can evolve a schema.
  6. Practice partitioning as mentioned in the syllabus. Try partitioning on different columns and understand the nuances behind it.
  7. Spark consists of two sections, one with Scala and one with Python. The questions are basic fill-in-the-blank questions, but do practice all the questions in the playlist. It will give you the confidence to tackle the simpler questions.
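
For the Avro tip above, here is a minimal sketch of what "evolving a schema" means in practice. It is plain Python that imitates, in a simplified way, how an Avro reader fills in a newly added field from its default. It does not use the Avro library, and the `Order` record, its field names, and the `resolve` helper are all hypothetical names made up for illustration.

```python
import json

# Version 1 of a hypothetical "Order" schema (all names are illustrative).
SCHEMA_V1 = json.loads("""
{
  "type": "record",
  "name": "Order",
  "fields": [
    {"name": "order_id", "type": "int"},
    {"name": "status", "type": "string"}
  ]
}
""")

# Version 2 adds an optional field. The union lists "null" first so that
# the default value null is valid for the field.
SCHEMA_V2 = json.loads("""
{
  "type": "record",
  "name": "Order",
  "fields": [
    {"name": "order_id", "type": "int"},
    {"name": "status", "type": "string"},
    {"name": "coupon_code", "type": ["null", "string"], "default": null}
  ]
}
""")


def resolve(record, reader_schema):
    """Simplified imitation of reader-side schema resolution (not the Avro library)."""
    resolved = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in record:
            # Field was present when the record was written: keep its value.
            resolved[name] = record[name]
        elif "default" in field:
            # Field was added later: fall back to the declared default.
            resolved[name] = field["default"]
        else:
            # A new field without a default makes old data unreadable.
            raise ValueError("field %r has no default" % name)
    # Fields in the record that the reader schema does not list are dropped.
    return resolved


# A record written with version 1, read with version 2.
old_record = {"order_id": 1, "status": "CLOSED"}
new_view = resolve(old_record, SCHEMA_V2)
```

The point to practice is the `default` entry: a field added without one cannot be filled in when older files are read, which is the kind of scenario worth trying out by hand before the exam.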

Exam Delivery:
During the exam, I faced some issues, wasted almost 30 minutes, and panicked. Some tips for a smooth exam:

  1. Please go through the videos towards the end of the CCA playlist that cover the common issues faced by exam takers. This is very important, so please don’t neglect it.
  2. Try logging in to an Ubuntu machine and become familiar with how to use it.
  3. Learn how to increase the font size of the terminal (go to Preferences in the top-left corner).
  4. Learn how to run a .sh file. If you are not able to run it, you can open the file in the vi editor and run each of the commands individually.
  5. Make sure you read the question completely before you start coding, as the expected output might differ from what you assumed.
  6. Validate the output before moving to the next question. This helps if you are not able to verify your answers in the last few minutes.
  7. Set aside 15 minutes at the end to verify all your answers.
  8. When the time is up, don’t quit the examination with unsaved files. Save all files and then end the examination.

I hope these tips help you in your certification journey. Thanks @itversity for doing such an amazing job.

Thank you very much @pramodvspk for putting all the information together so nicely.


Thank you sir @itversity


As for the Spark-based questions on Python and Scala, a code snippet will be given and we have to fill it in with the proper Spark API calls. But will there be any need to use an IDE like Eclipse or IntelliJ? Configuring one with the Spark libraries will itself take time. Did you ever feel like using an IDE during the exam?


No @aijazmallick, you do not need to use IDEs. It is not mandatory.
Keep it simple and try using Sublime Text or the vi editor, whichever is comfortable for you.


@aijazmallick, it is not required at all; you can do it in the vi editor itself. My suggestion would be to use Sublime for Hive DDL, as it is easier and less error-prone.


Sir, is 2 months enough for preparation?


Yes, it is more than enough for the certification.


Hey Pramod,
Will this playlist be good enough for the CCA developer program?


@RaghavendraKumars yes, it is sufficient.


Go through this one - http://www.itversity.com/courses/cca-spark-and-hadoop-developer-certification/

Same videos but with text content.


Yes, going through it. Thanks Durga.


Hello All Certified Members,

As mentioned in the videos, there are no restrictions in the exam on solving questions using SQLContext or HiveContext, i.e. using SQL. Has anyone tried that (using Spark SQL rather than the default PySpark core API)?


@aijazmallick, firstly, the Spark questions are so straightforward that you don’t have to use Spark SQL. Even if you want to use Spark SQL, I don’t think you will have much time left to play with it.


Will they ask questions on MapReduce? I don’t have any knowledge of Python, and I know Spark, but not in depth. Can I clear the exam with this knowledge?



As Durga sir mentioned, in the exam we can try running the code in the shell before running the script. Do they provide the expected result or output? Suppose we have a script in which we have to fill in the correct Spark API call. To check whether it runs properly, we execute it at the prompt and get some output, but the expected output might be a little different. How do we check that the script we are running gives the expected results?


@aijazmallick, you have to cat the files at the HDFS output location and check the output. There is normally no other way to validate it.
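
As a rough illustration of the check described above, the sketch below wraps `hdfs dfs -cat` in a small Python helper that returns the first few lines of a job's output. The function names and the `part-*` pattern are assumptions made for this example: it assumes the job wrote plain-text part files into the output directory, and that the HDFS shell matches the pattern itself, since no local shell is involved.

```python
import subprocess


def build_cat_command(output_dir):
    """Build an `hdfs dfs -cat` command for the part files in output_dir."""
    # Assumption: the job wrote a directory of text part files, so the
    # pattern is passed through unexpanded for the HDFS shell to match.
    return ["hdfs", "dfs", "-cat", output_dir.rstrip("/") + "/part-*"]


def preview_output(output_dir, max_lines=10, run=subprocess.run):
    """Return the first max_lines lines of the job output stored in HDFS."""
    # check=True raises CalledProcessError if the hdfs command fails,
    # for example when the output directory does not exist.
    result = run(
        build_cat_command(output_dir),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()[:max_lines]
```

On a machine with the `hdfs` client installed, a call such as `preview_output("/user/cloudera/result")` (a made-up path) would return the first ten lines to compare by eye against what the question asks for. Typing the same `hdfs dfs -cat` command directly in the terminal does the same job and is probably quicker under exam conditions.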


No, MapReduce is not in the scope of CCA Spark and Hadoop Developer.


So instead of using a take action, can we run the script and store the result in HDFS? Then, if we see while checking that it is not as expected, can we remove that output directory from HDFS and rerun our shell file after changing the code?


Where can I find the Sublime editor in the Cloudera distribution? Is it on the desktop, or somewhere else?