Cleared CCA 175 With 100% Score! May 8th 2019

Hi Swetha - Thank you! Happy to help!

For practicals, you can do the Practice Problems/Solutions module (5 or 6 questions, I think) in the ITversity course, and also the problems in the videos. In addition, on Udemy there are more practice problems that ITversity has launched - I think there are 3 or 4 sets (called CCA 175 Practice Tests) - I did those multiple times. You can get the coupons for those from this site too.

@Sofia Thank you so much!

@Sofia Would you be able to share links? I tried searching but couldn't find them. Thanks!

@Sofia Thanks much Sofia

Hi Veena,
Here is the link for the ITversity Practice Tests on Udemy - link

Thanks for a great answer. Sorry it took me a while to acknowledge it.

Thanks Sofia… Sorry for the late response.

Hi Sofia !

If you remember, we communicated earlier when my exam got cancelled on 2nd May. I was supposed to take the exam again today (4th June), but again the exam did not launch, saying the exam was not ready.
I am feeling very frustrated and disappointed right now.
Not sure why Cloudera is not taking these issues seriously.

Hi Sofia,

One more question from me.
Let's say we import data from a table in Avro format using some compression, and there is a date field in the table; in the imported Avro file it comes out as BIGINT. We have seen this in Arun's blog. So while saving this Avro file in another format, for example text or Parquet, do we need to convert this BIGINT back to a date format like (yyyy-MM-dd) in the certification exam, or can we take the field as-is while saving to other file formats?

Please suggest.
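For context on what that BIGINT holds: Sqoop typically writes dates to Avro as epoch milliseconds, so converting back to yyyy-MM-dd is a one-liner. A minimal plain-Python sketch of the mechanics (in Spark you would do the equivalent with a timestamp conversion on the column; whether the exam requires it is exactly the question above):

```python
from datetime import datetime, timezone

def millis_to_date(epoch_millis):
    """Convert an epoch-milliseconds BIGINT (as Sqoop writes dates to Avro)
    back to a yyyy-MM-dd string, interpreting the value as UTC."""
    return datetime.fromtimestamp(epoch_millis / 1000, tz=timezone.utc).strftime("%Y-%m-%d")

# 1556668800000 ms is 2019-05-01 00:00:00 UTC
print(millis_to_date(1556668800000))  # → 2019-05-01
```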


  1. How do you log into spark2-shell in the Cloudera environment in the CCA 175 exam?
    Is it the same way as in the labs, or do we need to mention the full path, like spark2/bin/spark-shell?
    Please provide a sample command to log into the spark2 shell in Cloudera.

  2. How do I work with Avro files in the spark2 environment?
    Which packages do we need to import during spark2 initialization in the CCA 175 exam?

  3. Do we need to set some configuration in the spark2 shell before starting the exam?

I have practiced till now in spark2, not in Spark (1.6)… please help.
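For what it's worth, on a Cloudera gateway node the Spark 2 shells are usually on the PATH, so no full path is needed. A sketch of the launch commands, assuming the cluster runs on YARN; the spark-avro package coordinates below are an assumption and depend on the cluster's Spark/Scala build, so verify them in your environment:

```shell
# Scala shell with the (assumed) Databricks spark-avro package:
spark2-shell --master yarn --packages com.databricks:spark-avro_2.11:4.0.0

# Python shell, same package:
pyspark2 --master yarn --packages com.databricks:spark-avro_2.11:4.0.0
```

These are cluster commands and only run on a node with Spark 2 installed.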

Do we need to remove duplicates if they don't ask for it? In one of the dgadiraju videos… they are removing the duplicates. Is that mandatory?

nyse_data - find the stock tickers that do not exist in the NYSE metadata…
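That task is essentially an anti-join (in Spark you might use a left-anti join or `subtract`). A minimal pure-Python sketch of the logic, with made-up tickers standing in for the real nyse_data and metadata:

```python
# Hypothetical sample data standing in for nyse_data and the NYSE metadata.
nyse_data = ["AAPL", "IBM", "XYZ1", "GE"]
nyse_metadata = {"AAPL", "IBM", "GE"}

# Tickers present in the data but missing from the metadata (an anti-join).
missing = [t for t in nyse_data if t not in nyse_metadata]
print(missing)  # → ['XYZ1']
```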

@Sanchit_Kumar did you take the exam yet? Did you try launching pyspark2?

@akalita Nope, I haven't taken the test yet, but you can open it by simply using the command: pyspark2 --master yarn --packages <package_name>

Hi @Sanchit_Kumar, did you take the exam? I have planned for next week. I am confused about writing data in Spark in different formats with compression, for ORC, Parquet, Avro, and JSON. If you have the solution, please help.

Could you please confirm that they provide connection details for Sqoop? In CCA 159, I got two questions on Sqoop, and it was mentioned that the MySQL database was on the gateway node, with cloudera as the username and password. I was really stuck with that: I tried listing tables using the connection jdbc:mysql://localhost and it returned the expected results and I could see the tables. But as soon as I started using import, I faced a "Connection refused" error… really upset about that. Not sure whether the port number is mandatory for import but not for list-tables…

Hello Pattem,

Gateway itself is the server name; instead of localhost you should use "gateway" as the server name. You can refer to the link below from Cloudera for a Sqoop import example.
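Assuming the CCA 159 setup described above (MySQL on the gateway node, username and password cloudera), an import sketch would look like the following; the database name, table, and target directory here are placeholders, not values from the exam:

```shell
# Sqoop import using the gateway host name instead of localhost.
# retail_db, orders, and the target dir are illustrative placeholders.
sqoop import \
  --connect jdbc:mysql://gateway/retail_db \
  --username cloudera \
  --password cloudera \
  --table orders \
  --target-dir /user/cloudera/orders
```

This is a cluster command and only runs on a node with Sqoop and the MySQL JDBC driver installed.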

Hi Sofia,


I have two questions:

  1. You said: "Know the compression classes by heart (for both Sqoop import compressions and Spark saving of dataframes/RDDs). There is no time to go search for documentation."
    Which classes do we need to keep in mind?
    To save DataFrames with compression, can we just use something like: df.write.format("parquet").option("compression", "snappy").save(xxxx)
    Here "snappy" can be replaced by one of the following case-insensitive short names: (none, bzip2, gzip, lz4, snappy and deflate)

  2. If they ask us to save the result in a text file with some delimiter, can we use something like
    df.write.format("csv").option("sep", "\t").save(xxxx)
    to save the result as a CSV file? (CSV is a kind of plain text file format as well.) Or do we have to save it exactly as a txt file?

Thanks, and looking forward to your answer.

best regards
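On the "compression classes" point above: the classes usually meant are the fully qualified Hadoop codec classes used with Sqoop's --compression-codec flag, alongside the short names Spark 2 accepts. A memorization sketch as a plain-Python mapping (these are the standard Hadoop codec class names; not every short name is valid for every output format, e.g. Parquet accepts a narrower set than text/CSV):

```python
# Hadoop codec classes for Sqoop's --compression-codec, keyed by the
# short names Spark 2 accepts in .option("compression", ...).
codec_classes = {
    "gzip":    "org.apache.hadoop.io.compress.GzipCodec",
    "snappy":  "org.apache.hadoop.io.compress.SnappyCodec",
    "bzip2":   "org.apache.hadoop.io.compress.BZip2Codec",
    "deflate": "org.apache.hadoop.io.compress.DeflateCodec",
    "lz4":     "org.apache.hadoop.io.compress.Lz4Codec",
}

# e.g. a Snappy-compressed Sqoop import would pass:
print(codec_classes["snappy"])  # → org.apache.hadoop.io.compress.SnappyCodec
```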

Hi Sofia, Quick questions:

  1. Where did you take your exam - at home with normal internet speed, or at the office with higher bandwidth?

  2. Was there any lag or slowness in the VM during the exam? Was it the same as using the ITversity labs?

  3. Did you get a link to log into the exam terminal before the exam? Or should we log on to some website with user credentials?

  4. Was any laptop configuration check done before the start of the exam?

I am planning to take my exam next Tuesday; these points were still unclear to me from the stories I have read so far.


Hi Sofia,

I am planning to take the exam this weekend and I want to know what types of questions come up in the exam, especially on the Spark side.

Hi Sofia,

How optimized is the exam environment?

I have a question regarding --num-executors, --executor-cores and --executor-memory.

Are they already optimized for best results, or do we have to make any changes to them?

Did you make any changes during the exam?
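For reference, if you do want to set those resources explicitly when launching the shell, the flags look like this; the values below are arbitrary illustrations, not tuned recommendations for the exam cluster:

```shell
# Illustrative resource flags for the Spark 2 shell; values are examples only.
spark2-shell --master yarn \
  --num-executors 2 \
  --executor-cores 2 \
  --executor-memory 2G
```

This is a cluster command and only runs on a node with Spark 2 installed.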