CCA175 Cleared on Jan 21st 2018


Thank you, Durga and Arun Kumar, for your videos and exercises. All the learning materials you provided are great and more than enough to pass the exam.

I passed 8 out of 9 questions (2 Sqoop, 7 spark-shell); I am not sure why one of the Spark questions failed. All the questions are easy. Do not worry: if you know all the concepts in Durga Sir’s videos and exercises, and you can finish the mock questions in Arun’s blog accurately and in time, you are good to go.

My tips are: (1) It is better to have a big monitor for the exam. (2) Pay attention to the output requirements. The questions are easy, but if you miss any output requirement, you are finished, so validate your output carefully (location, file format, file content, compression, record count…). If you have enough time, you can use both HDFS commands and spark-shell to check and cross-check your output.
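To illustrate the kind of cross-check I mean, here is a rough sketch for spark-shell (Scala). It assumes the output was written as delimited text and that `sc` is the SparkContext that spark-shell creates for you; the path is made up, so replace it with the one from the question. It is only one possible way to check, not the required one.

```scala
// Paste into spark-shell, where `sc` is already defined.
import org.apache.hadoop.fs.{FileSystem, Path}

// Hypothetical output location; use the path given in the question.
val outputDir = "/user/cloudera/problem1/output"

// List the files in the output directory to confirm the location,
// and look at the file names for a compression suffix such as .gz.
val fs = FileSystem.get(sc.hadoopConfiguration)
fs.listStatus(new Path(outputDir)).foreach { status =>
  println(s"${status.getPath.getName}\t${status.getLen} bytes")
}

// Read the output back as text to check the record count,
// the delimiter, and the column order.
val written = sc.textFile(outputDir)
println(s"record count: ${written.count()}")
written.take(5).foreach(println)
```

If the question asks for another format (Parquet, Avro, sequence file, and so on), read it back with the matching reader instead of `sc.textFile`, and compare the count with the count of the data you intended to write.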



Hi @xuyoumi,

Congrats!!!
Did you use Scala or Python?
I’m not able to find a write function for sequenceFile in Python.



During the exam, no template was provided. I used spark-shell with Scala only; you can use whatever you like.

It is better to know both Scala and Python in case they provide a template.


During the certification exam, should we use spark-shell directly, or should we set driver-memory, executor-memory, and master as yarn, along with num-executors and executor-cores?

Does the Udemy course need to be followed, or can I go ahead with the YouTube videos “Hadoop and spark developer as per revised syllabus”? I have access to both. Please suggest.


I just started spark-shell using: spark-shell --master yarn.

I followed the videos “Hadoop and spark developer as per revised syllabus”. If you have access to the Udemy course, you should follow it as well. Practice as much as you can; practice is very important.



Did you feel any need to check Cloudera Manager during your exam?


The exam environment has already been set up, so you do not need to worry about environment configuration.


@xuyoumi congrats!!!

I need to ask you one question regarding the exam, and I’m afraid it will disclose the content, so would you be so kind as to ping me at my email below so we can exchange knowledge, please? This is urgent, as I’m taking the exam in 1 day.


Hi @xuyoumi, congrats!
How much RDD transformation work was required to solve the problems? I am more comfortable with DataFrames and Spark SQL. Can we solve all the Spark problems without using RDDs, relying only on DataFrames or SQL?
Thanks in advance.