Cleared CCA 175 on 10/13


Hello Everyone,

I cleared CCA 175 on 10/13, thanks to Durga sir’s lectures and labs and to Arun’s blog, which helped me a lot during the exam. I got 2 questions on Sqoop and 7 questions on Spark.

Here are a few pointers for everyone who is appearing for the certification.

  1. Before exam day, watch Cloudera’s sample question video. It shows how to adjust the screen size and how to copy and paste using keyboard shortcuts, which will help you immensely during the exam. The video is at the bottom of the page.

  2. Do not worry about the processing time of the code. Because the virtual machine runs on a cluster, processing is fast; the longest run will take around 2 or 3 minutes, during which you can think about the next question.

  3. sqoop export is as important as sqoop import. Try to remember all the options for both the import and export commands.

  4. Stick to a single API when writing the queries: DataFrames, sqlContext, or RDDs. If you have strong SQL skills, I would recommend using sqlContext for every Spark question.

  5. The virtual machine will be slow (depending on your internet connection), so I would advise copying and pasting as much as possible and typing the rest. How long the exam takes you will depend on how fast you write the queries.

  6. Keep at least 15 to 20 minutes for verifying the output; it gives you a second chance to catch anything you missed.

  7. Once you finish Durga sir’s lectures, practice the questions from Arun’s blog. He has covered all the topics. File conversion and compression techniques are extremely important.

  8. Spark questions are relatively easy if you practice Durga sir’s questions and Arun’s blog. But writing the code takes time, and your performance mostly depends on how fast you write and execute your queries.

  9. Finally, open separate tabs for HDFS commands, spark-shell, the Hive shell, and the MySQL databases, which will help you navigate easily and verify your outputs.
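To practice the import and export options side by side (pointer 3), here is a minimal Python sketch that only assembles the two command lines; it does not run Sqoop. The host `gateway.example.com`, the database `retail_db`, the credentials, the table names, and the HDFS paths are all hypothetical placeholders; the real values come from the exam question. The flags used are standard Sqoop options.

```python
import shlex

# Hypothetical connection details; in the exam these come from the question.
JDBC_URL = "jdbc:mysql://gateway.example.com:3306/retail_db"
COMMON = ["--connect", JDBC_URL,
          "--username", "exam_user",
          "--password", "exam_pass"]


def sqoop_import(table, target_dir, mappers=1):
    """Build a `sqoop import` command: MySQL table -> HDFS directory."""
    return ["sqoop", "import", *COMMON,
            "--table", table,
            "--target-dir", target_dir,
            "--fields-terminated-by", ",",   # delimiter written to HDFS
            "--num-mappers", str(mappers)]


def sqoop_export(table, export_dir):
    """Build a `sqoop export` command: HDFS directory -> existing MySQL table."""
    return ["sqoop", "export", *COMMON,
            "--table", table,
            "--export-dir", export_dir,
            "--input-fields-terminated-by", ","]  # delimiter read from HDFS


import_cmd = sqoop_import("orders", "/user/exam/orders")
export_cmd = sqoop_export("orders_copy", "/user/exam/orders")

# Print shell-ready command lines to paste into a terminal.
print(shlex.join(import_cmd))
print(shlex.join(export_cmd))
```

Note the mirrored option names: import uses `--target-dir` and `--fields-terminated-by`, while export uses `--export-dir` and `--input-fields-terminated-by`.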

Please let me know if you need any help or have any questions regarding the examination.

A big thanks to Durga sir again, and good luck to everyone who is appearing for the exam. You will definitely crack it.

Thank you,


Congrats Arun.
Can you please let me know whether we need to use the Databricks package anywhere in the exam?



Yes, I used the Databricks package while dealing with Avro files.


Thanks Arun. Can you please confirm if we need to write any commands to import databricks packages like



1) For Spark questions, do we need to write a full Scala program and execute it in one go, or can we execute each line in the spark-shell on the VM?
2) If my code is wrong, would I know immediately while executing? Will we get a chance to correct it and re-submit?


I executed the commands one after another in spark-shell; I have not tried writing a full program and executing it.

If there’s a syntax error, you will know immediately, but if it’s a logical error, you will know only after verifying the output. Your score depends only on the output for the given questions, not on how you approach the solution.


Congrats Arun. Did you use the Sublime editor? If it is available in CDH, how did you launch it?


Yes, I used the Sublime editor for writing the queries. Its shortcut is on the desktop; you can launch it from there.


Hi Arun,

Is the gateway host for connecting to the MySQL database mentioned clearly in the question? I heard a few members faced issues connecting to the database.
Did you get any questions where you had to use Hive as a source or sink?


Yes, the gateway node will be clearly mentioned in the question. I don’t think there will be an issue connecting to MySQL unless there’s a syntax error in our command.

I didn’t get any questions on Spark Streaming topics.
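As an illustration of how the host named in a question ends up in the connection string, here is a small Python sketch that builds a MySQL JDBC URL and a `sqoop list-tables` command, which can serve as a quick connectivity check before a long import. The host `gateway.example.com`, the database `retail_db`, and the credentials are hypothetical; 3306 is MySQL's default port and is only assumed here when nothing else is given.

```python
import shlex


def mysql_jdbc_url(host, database, port=3306):
    """Return a MySQL JDBC URL; falls back to MySQL's default port 3306."""
    return f"jdbc:mysql://{host}:{port}/{database}"


def list_tables_cmd(url, user, password):
    """Build a `sqoop list-tables` command to confirm the connection works."""
    return ["sqoop", "list-tables",
            "--connect", url,
            "--username", user,
            "--password", password]


# Hypothetical values standing in for what the question would provide.
url = mysql_jdbc_url("gateway.example.com", "retail_db")
check_cmd = list_tables_cmd(url, "exam_user", "exam_pass")

print(url)
print(shlex.join(check_cmd))
```

If the check command lists the expected tables, a later failure is more likely a mistake in the import or export options than in the connection itself.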


Hi Arun,

Congratulations and best wishes :)

I have a few silly questions; could you please help?

  1. How do we connect to the Cloudera cluster using the gateway node in the exam environment? Do we need any software on our local machine to connect, or will we be provided a direct URL along with credentials?
  2. Will the MySQL port number be mentioned in the question as well? If it is not mentioned, how can I find it out (the default being 3306)?

Thanks & Regards