Passed CCA 175 on 10 Sep 2020

Hi All,

Just wanted to share that I received an email from Cloudera today (after approximately 13 hours) saying that I have passed the CCA 175 certification. I got 7 passes, and 2 were marked as "number of records not matching", most probably because I used distinct in the query.
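To illustrate why an unrequested distinct can change the record count, here is a minimal sketch. It uses Python's built-in sqlite3 rather than Spark so that it runs anywhere, and the tables and values are made up for illustration; DISTINCT in Spark SQL collapses duplicate rows in the same way.

```python
import sqlite3

# In-memory database with hypothetical tables; customer 1 has two identical orders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE orders (customer_id INTEGER, status TEXT);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (1, 'CLOSED'), (1, 'CLOSED'), (2, 'OPEN');
""")

join_sql = """
    SELECT {distinct} c.name, o.status
    FROM customers c JOIN orders o ON c.id = o.customer_id
"""

# Same join, once as written and once with DISTINCT added.
rows_all = conn.execute(join_sql.format(distinct="")).fetchall()
rows_distinct = conn.execute(join_sql.format(distinct="DISTINCT")).fetchall()

print(len(rows_all), len(rows_distinct))
```

The plain join keeps both of Ann's rows, while the DISTINCT version drops one, so a grader that compares record counts would see a mismatch.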

Regarding the exam, the environment was working fine. Copy/paste worked via right click. I opened 2 terminal instances: I used pyspark in one and HDFS commands in the other to check whether the files were being saved. I tried to launch pyspark2 but it did not work, so I used pyspark, which was set to use version 2 by default.
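For anyone unsure which Spark version the pyspark shell is running, a small sketch: the shell predefines `sc` (the SparkContext), and `sc.version` returns the version as a string. The helper name `spark_major_version` is my own, not part of Spark.

```python
def spark_major_version(version_string):
    """Return the major version number from a string such as '2.4.0'."""
    return int(version_string.split(".")[0])

# Inside the pyspark shell, `sc` already exists, so you would call:
#   spark_major_version(sc.version)
# The strings used here are only examples so the snippet runs outside the shell.
print(spark_major_version("2.4.0"))
print(spark_major_version("1.6.0"))
```

Simply typing `sc.version` in the shell is enough in practice; the helper only shows how to read the result.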

Questions were easy, mostly converting one format to another with some basic transformations like using substring or concatenating.
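A rough sketch of that kind of task, assuming a Spark DataFrame with hypothetical string columns `first_name`, `last_name` and `order_date` (none of these names come from the exam). It builds new columns with the Spark SQL functions `concat` and `substring`, then writes the result as JSON.

```python
def convert_with_basic_transforms(df, out_path):
    """Derive columns with concat/substring and write the result as JSON.

    `df` is assumed to be a Spark DataFrame; the column names are hypothetical.
    """
    transformed = df.selectExpr(
        # Concatenate two columns with a space in between.
        "concat(first_name, ' ', last_name) AS full_name",
        # substring is 1-based in Spark SQL: take the first 7 characters.
        "substring(order_date, 1, 7) AS order_month",
    )
    # Swap .json for .parquet, .orc or .csv to target another format.
    transformed.write.json(out_path)
    return transformed
```

In the shell this would be used as `convert_with_basic_transforms(spark.read.parquet(in_path), out_path)`, where both paths are whatever the question specifies.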

I prepared using Raju Sir’s Udemy course and then practiced questions using another course on Udemy.

Now waiting for Cloudera to share the certificate and license number.


Congratulations Harmeet,

I have my exam on the 28th. Can you please give some pointers on:

  1. How was the environment: slow or fast?
  2. Do you need to import the avro package?
  3. Can we use an editor like Notepad++ on our own desktop to write code and then paste it into the shell?
  4. If the internet goes down, can we log back in again?

Thanks a lot,

Hi Anubhav,

  1. The environment was not slow. It worked fine for me.
  2. It was not written anywhere that the avro package had to be imported, but I still did it at the beginning to avoid last-minute panic.
  3. There was an editor on the desktop itself but I did not use it. You can use it with right-click copy and paste.
  4. I am not sure about this as luckily it did not happen to me. I had kept a phone with its hotspot on as a backup, but the monitor asked me to switch off my phone in front of the camera.

Hi Harmeet, congrats!

Did you face any issues with the Caps Lock or Shift key? I had to get my exam rescheduled twice due to issues with the Caps Lock and Shift keys.

Hi Harmeet,

Congratulations! I am preparing for the certification and will be taking it in the coming week. Could you please let me know which course you took for practice questions? As far as I can see, the available courses include Sqoop, which is not part of the updated syllabus.

I bought a course provided by Navdeep Kaur.

I don't remember facing any such issue.

Hi, congratulations! I am planning to take this exam next week. Could you please clarify my questions below?
1. When you saved data as Parquet/ORC, did you set the compression option to uncompressed? If the question does not mention compression, then we should not compress, right? I ask because if we don't set the compression option, Snappy is used by default for ORC/Parquet.
2. When saving data to a text file, did you include a header, or did you save the data without a header row?
3. When you were dealing with SQL joins, did you get any duplicate records? If yes, did you use distinct to remove them? Can we use distinct if the question does not say how to handle duplicates?
4. Do they mention the number of records expected in the output for all the questions?
Please clarify these; your answers would be helpful for me. Thanks in advance.

Hi Arun,


  1. If any kind of compression is required, they will mention it in the question. Don't use the compression option otherwise.
  2. I did not include headers while saving in text format.
  3. Don’t use distinct unless asked. I think I made the mistake of being over-smart by using distinct, and that’s why 2 of my answers were marked as number of records not matching.
  4. No, they won't mention the number of records expected.
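Point 1 could be sketched roughly as follows, assuming `df` is a Spark DataFrame. The helper name `save_parquet` is made up for illustration; the `compression` writer option itself is a real Spark option, and it is only set here when the question names a codec.

```python
def save_parquet(df, path, compression=None):
    """Write `df` as Parquet, setting a codec only when one is requested.

    With compression=None nothing is passed, so Spark applies its own
    default codec for Parquet.
    """
    writer = df.write
    if compression is not None:
        # e.g. "gzip", "snappy" or "uncompressed", as named in the question
        writer = writer.option("compression", compression)
    writer.parquet(path)
```

So `save_parquet(df, path)` leaves the default alone, while `save_parquet(df, path, "gzip")` is for a question that explicitly asks for gzip.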

Best of luck!


Thanks, Harmeet, for your response.

Hello Harmeet,

Quick check:

  1. Do we need to save the code in some directory as well?
  2. How does the level of the actual exam compare with the Navneet test series?
  3. When saving a file in text format, did you use the CSV or the text option?

Thanks again,

I appeared for the exam this week, but due to a VM issue on Cloudera's side it was rescheduled, and the exam is now scheduled for this week. In the 10 minutes I had, I tried to open the shell: pyspark2 didn’t work but pyspark did. I just wanted to confirm whether the pyspark shell runs on Spark 2+ or Spark 1.6.

Also, one last question: can we save the file as CSV instead of converting to an RDD and then saving as text?
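For reference, the two routes being compared look roughly like this in PySpark 2. This is only a sketch: the helper names are made up, `df` is assumed to be a Spark DataFrame, and nothing in this thread confirms which route the exam grader expects.

```python
def format_row(row, sep=","):
    """Join the fields of one row into a single delimited line."""
    return sep.join(str(field) for field in row)


def save_via_rdd(df, path, sep=","):
    # RDD route: turn every Row into a string, then write plain text files.
    df.rdd.map(lambda row: format_row(row, sep)).saveAsTextFile(path)


def save_via_csv_writer(df, path, sep=","):
    # DataFrame route: the CSV writer; header=False writes no header row.
    df.write.csv(path, sep=sep, header=False)
```

The two outputs are not always identical: `str()` writes a null as `None`, while the CSV writer leaves it empty by default and quotes fields that contain the delimiter.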