Passed CCA 175 on July 21, 2020

Hi all, I cleared the exam with 9/9. Big thanks to Durga and his labs.

I would like to add a few points:

  1. Very important! It seems the exam environment was changed again (Cloudera removed the video with their latest demo of the environment), and I worked on a very slow cluster. Ctrl+C/Ctrl+V didn't work, which is bad news. The good news is that I could use Notepad++ and a second terminal window for "hdfs dfs" commands, and it looked better than it did in their demo.
    I also observed bad behavior when using Ctrl+C!

  2. These labs are enough to succeed. All topics were covered completely. I spent 1 hour 20 minutes solving all the tasks because I had practiced a lot. The tasks were easier than the labs.

  3. I also used Arun's blog to prepare, but now it's a waste of time (as I've already said, these labs are enough).

  4. I got a lot of tasks on reading data in text format with a separator (you should pay more attention to this), and some tasks on converting data to different formats such as parquet/orc/avro/text with some compression options. I also had some tasks on reading and writing Hive tables. You should be familiar with simple operations
    like concat, substr, etc.

  5. I didn't use any args for spark-shell (I only used --packages … for avro), and I solved all the tasks using the Spark DataFrame API and Spark SQL.
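
For anyone practicing the task types from point 4, here is a minimal spark-shell sketch of that kind of workflow. It is only an illustration, not an actual exam task: it assumes Spark 2.x with Hive support, a pipe-delimited input with four columns, and an existing database. Every path, column name, and table name in it is hypothetical.

```scala
// Sketch for spark-shell (Spark 2.x assumed); `spark` is the SparkSession the shell predefines.
// All paths, column names, and the table name below are hypothetical.
import org.apache.spark.sql.functions.{col, concat, lit, substring}

// Read pipe-delimited text that has no header row, then name the columns.
// Assumes the input really has exactly four fields per line.
val customers = spark.read
  .option("sep", "|")
  .option("inferSchema", "true")
  .csv("/user/example/input/customers")
  .toDF("id", "first_name", "last_name", "city")

// Simple string operations: concat and substring (substring positions are 1-based).
val result = customers.select(
  col("id"),
  concat(col("first_name"), lit(" "), col("last_name")).as("full_name"),
  substring(col("city"), 1, 3).as("city_prefix")
)

// Write the same result in different formats, each with a compression codec.
result.write.mode("overwrite").option("compression", "snappy").parquet("/user/example/output/parquet")
result.write.mode("overwrite").option("compression", "zlib").orc("/user/example/output/orc")

// Save as a Hive table and read it back with Spark SQL.
// Assumes Hive support is enabled and the database example_db already exists.
result.write.mode("overwrite").saveAsTable("example_db.customer_names")
spark.sql("SELECT * FROM example_db.customer_names LIMIT 5").show()
```

The same `result` could also be registered with `createOrReplaceTempView` and transformed with plain SQL if you prefer Spark SQL over the DataFrame API.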

I am planning to take the exam as well. Would it be a good idea to connect over the phone (+44 7417428785)?

Hey, thanks for the information.

Can we use Notepad to write commands and copy-paste them into the terminal during the exam?

Yes, sure. You can use any installed software.

When you wrote a DataFrame to a text file, did you convert it to an RDD and then use the saveAsTextFile method?
Can we use the df.write.csv() method instead?


Hi Liangbin_Chen,
I am CCA 175 certified. There is no need to convert to an RDD; you can use df.write.format("csv").
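
A minimal sketch of that approach for spark-shell, assuming Spark 2.x. The sample data, the tab separator, and the output path are hypothetical and only there to make the snippet runnable.

```scala
// Sketch for spark-shell; `spark` is the predefined SparkSession.
import spark.implicits._

// Hypothetical sample data standing in for a real result DataFrame.
val df = Seq((1, "Alice", "London"), (2, "Bob", "Paris")).toDF("id", "name", "city")

// Write delimited text straight from the DataFrame, with no RDD conversion.
// "sep" sets the field delimiter; the output path is hypothetical.
df.write
  .mode("overwrite")
  .format("csv")
  .option("sep", "\t")
  .save("/user/example/output/customers_tsv")
```

`df.write.option("sep", "\t").csv(path)` is the shorthand for the same writer.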

Hi Ravikiran, do we always need to store the data with a header when writing data to a csv/text file?

Hello Ravi,

Quick check:

  1. Do we need to save the code in some directory as well?
  2. What is the actual difficulty level of the test: easy, medium, or hard?
  3. When saving a file in TEXT format, did you use the csv or the text option?

Thanks again,

Hi Arun, whether you include the header depends on the expected output.
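
As a rough illustration of the header choice, here is a spark-shell sketch assuming Spark 2.x. The sample data and paths are hypothetical.

```scala
// Sketch for spark-shell; `spark` is the predefined SparkSession.
import spark.implicits._

// Hypothetical sample data.
val orders = Seq((1, "COMPLETE"), (2, "PENDING")).toDF("order_id", "status")

// With a header: each part file starts with the line "order_id,status".
orders.write.mode("overwrite").option("header", "true").csv("/user/example/output/with_header")

// Without a header, which is the default for the csv writer.
orders.write.mode("overwrite").csv("/user/example/output/no_header")

// Reading back: header=true takes the column names from the first line.
val withHeader = spark.read.option("header", "true").csv("/user/example/output/with_header")
withHeader.printSchema()
```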

Hi Anubhav,

  1. No need to save the code; only the result matters.
  2. If you practice well, it will be medium level.
  3. I used csv for the text format, because the text option writes only a single column.
    Hope this helps in your preparation. Good day.
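
To show what the single-column limitation in point 3 means in practice, here is a small spark-shell sketch assuming Spark 2.x. The sample data, the comma delimiter, and the output path are hypothetical.

```scala
// Sketch for spark-shell; `spark` is the predefined SparkSession.
import org.apache.spark.sql.functions.{col, concat_ws}
import spark.implicits._

// Hypothetical sample data with three columns.
val df = Seq((1, "Alice", "London"), (2, "Bob", "Paris")).toDF("id", "name", "city")

// df.write.text(...) would fail here, because the text source accepts
// only a single string column and df has three columns.

// Workaround: join all columns into one string column before writing.
// Note that concat_ws skips null values, so rows with nulls lose fields;
// the csv writer avoids that problem.
val lines = df.select(
  concat_ws(",", df.columns.map(c => col(c).cast("string")): _*).as("value")
)
lines.write.mode("overwrite").text("/user/example/output/customers_text")
```

Using the csv writer with a chosen separator, as described in the answer above, is usually the simpler route.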

Thank you, buddy. This is helpful 🙂