Cleared CCA 175 on Sep 21st


Cleared my CCA exam today on my second attempt. Thanks to Durga Sir’s YouTube videos and Arun’s blog, which helped me prepare for the certification.
There were 2 Sqoop, 1 Hive metastore, and 6 Spark questions. All were easy and straightforward. I’ve listed a few tips below; hope they help.

  1. Practice problems 4 & 5 in Arun’s blog at least twice to get familiar with the sqoop commands and file formats.

  2. Practice all the questions in and “If you can solve these problems.. you may be ready for CCA-175. Give it a shot!”

  3. If you prefer to use DataFrames (sqlContext.sql), get familiar with the common SQL built-in functions, such as sum(), avg(), substring(), concat(), and the date functions.

  4. Time management is important. Use copy and paste wisely; it will save a lot of time. I reused the template below in 2-3 Spark questions.
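    // Load the raw text file
    val r = sc.textFile(" ")
    // Split each line on the delimiter and name the columns
    val rf = r.map(x => { val y = x.split(" "); (y(0), y(1), y(2), …) }).toDF(" ")
    // Register the DataFrame as a temp table, then query it with SQL
    rf.registerTempTable(" ")
    val result = sqlContext.sql(" ")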

  5. The input file’s field delimiter won’t be provided; you need to check it yourself, using:
    hdfs dfs -cat /aaa/bbb/ccc/eee|head
    hdfs dfs -tail /aaa/bbb/ccc/eee
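
If eyeballing the output is hard (for example with tabs), a small helper can count candidate delimiters on the first line. This is only a rough sketch: the function name guess_delim and the candidate list (comma, pipe, tab, colon) are my own assumptions, and the sample line is made up for illustration.

```shell
# guess_delim: read lines on stdin and print the name of the candidate
# delimiter that occurs most often on the first line.
# The candidate list (comma, pipe, tab, colon) is an assumption for this sketch.
guess_delim() {
  first_line=$(head -n 1)
  best_name="unknown"
  best_count=0
  for pair in "comma:," "pipe:|" "tab:$(printf '\t')" "colon::"; do
    name=${pair%%:*}   # text before the first colon
    ch=${pair#*:}      # text after the first colon
    # Delete everything except the candidate character, then count what is left.
    count=$(( $(printf '%s' "$first_line" | tr -cd "$ch" | wc -c) ))
    if [ "$count" -gt "$best_count" ]; then
      best_name=$name
      best_count=$count
    fi
  done
  printf '%s\n' "$best_name"
}

# Made-up sample line, only for illustration.
printf '1,2013-07-25 00:00:00.0,11599,CLOSED\n' | guess_delim
```

In the exam it could be fed from the same command as above, e.g. `hdfs dfs -cat /aaa/bbb/ccc/eee | head | guess_delim`. It only looks at the first line, so still glance at the data yourself before trusting it.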


FYI, I have written down why I failed the first attempt.
My first try was on Sep 15th, when I passed only 5 questions. Here are the reasons I failed.
a. For some reason, the sqoop import kept failing, though it was a very simple import. After the exam, I reran the same commands in the itversity lab, and they succeeded without any issue. I still don’t know what I did wrong in the exam.
b. I tried to join 2 temp tables using sqlContext.sql(" select …"); it kept failing and threw some weird error message. I wasted 20 mins on this question.
c. The input data directory in one question didn’t exist. It gave */A as input data, but I found only */A1 and */A2, both of which had files inside. I struggled for a long time over which folder to use, and wasted another 20 mins.
I contacted Cloudera about the wrong input data directory; they verified the issue and gave me a free retake. I still feel something was wrong with their system that day, which caused the sqoop import and Spark SQL join to fail. Did anyone else have similar issues in the Sep 15th exam?


Cleared CCA175 on 29th September

What do you mean by a Hive metastore question?


I know this is a frequently asked question, and I have searched the forum for it, but there are too many different responses. I learned from Durga Sir’s Udemy CCA175 Python course; should I also prepare Scala for the CCA175 exam? Someone said the exam will have templates written in either Scala or Python, so it is better to know how to write both. Is that right, or can I concentrate fully on preparing for the exam in Python only?


Check problem 6 in Arun’s blog.


“Cause someone say it will have template written either scala or python”: I think that is based on the old syllabus. There are no “fill in the blanks in a template” questions in the current exam. You can use either Scala or Python; only the output files matter.


Thank you sincerely, really


Can you share your WhatsApp number or email ID?


Hello Tigerwyue

How did you contact Cloudera about the issue in your first attempt? Can you please share the details? I faced similar problems in my test today.



You can email Cloudera. They will help to identify the issue.


Hi tigerwyue,
I recently took the CCA175 exam and failed :frowning_face:. I spent a lot of time on the first two questions, on Sqoop import/export. Unfortunately, the Sqoop job failed every time I tried :frowning: The questions were easy, but I still don’t know why it failed. I am clueless about this. I prepared for the exam by following ITVersity and Arun’s videos. I did well on the Spark questions, though, but due to the time constraint I was not able to solve many of them.
It would be very helpful if you have any suggestions or guidance for me. I want to take the exam again, but I am afraid of the Sqoop questions because I still don’t know why the failure happened.



Don’t worry juleemishra, I believe you will pass the exam next time. Try to complete all the easy Spark questions first, then work on the Sqoop part. Even if you cannot complete all 9 questions, if you finish 7 or 8 there is still a high chance of earning the certification. Before I took the exam, I used a stopwatch to track how much time I spent on each practice question: keep the easy Sqoop and Spark questions to 5-10 mins each, and finish the more complex ones in 15 mins. Practice, practice, and practice; you will pass for sure.