CCA175 Exam updated with New syllabus


The CCA175 exam has been updated with a new syllabus.

1 Sqoop question (import and export)
8 Spark questions
No questions on Flume/Kafka
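For reference, here is a hedged sketch of what a Sqoop import/export pair might look like; the JDBC URL, credentials, table names, and HDFS paths are all placeholders (shown with common Cloudera QuickStart VM defaults), not values from the actual exam.

```shell
# Import an RDBMS table into HDFS as gzip-compressed text files.
# Connection details, table names, and paths are placeholders.
sqoop import \
  --connect jdbc:mysql://quickstart:3306/retail_db \
  --username retail_dba \
  --password cloudera \
  --table orders \
  --target-dir /user/cert/orders_text \
  --as-textfile \
  --compress \
  --compression-codec org.apache.hadoop.io.compress.GzipCodec

# Export tab-delimited results from HDFS back into an RDBMS table.
sqoop export \
  --connect jdbc:mysql://quickstart:3306/retail_db \
  --username retail_dba \
  --password cloudera \
  --table order_results \
  --export-dir /user/cert/order_results \
  --input-fields-terminated-by '\t'
```

Both commands need a live cluster and database to run, so treat this purely as a pattern to practice against.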

No template will be provided; you need to code from scratch. Learn to read and write all the file types (Parquet, JSON, CSV, Hive tables, Avro, etc.) with various compression codecs.

You can code in either language (Scala/Python).

Practice coding as much as possible so that you can attempt all 8 Spark questions in 2 hours.


@HadoopLearner Have you written the exam recently?

@HadoopLearner Thanks for the info. Have you taken the exam? Or where did you get to know the above info?

Hi @HadoopLearner,

Thank you for the update. It's very useful info for everyone.
I am planning to take the exam at the end of May, or by June at the latest.
It would be appreciated if you could share your view on the questions below:

1. Were all 8 Spark questions only about read/write operations?
2. Were there any questions on aggregation commands in Spark?

If possible, please elaborate on the types and difficulty level of the questions asked :slight_smile:

Best Regards,
Nitesh Chainani

Hi @HadoopLearner

Thanks for the info.

Did you take the exam?

Is there any way to read/write Avro or Parquet in Spark, as you mentioned in the first post?

I have taken CCA175 but was not able to clear it. I got 9 questions: 1 Sqoop import, 1 Sqoop export, and 7 on Spark.
No code template was given; we needed to write the code ourselves to get the desired output.
The questions involved reading data in Avro, JSON, or Parquet format from HDFS or the Hive metastore and storing the results in HDFS in a different format with a particular compression codec.
For me, there were no questions on real or near-real-time processing or on configuration changes.
The questions were a bit tough, but plenty of practice can get you through.


Can you share your email address ?

Hi @sammy12

So, you used Spark SQL for the questions on the Hive metastore. Do you remember the split between Core Spark and Spark SQL?



Do not discuss questions in this forum!!!


I am not particularly sure about the split. Two of the questions were tab-separated, which can be done with the core API. I am not sure whether data in other formats like JSON, Avro, and Parquet can be handled with core.
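For illustration, here is a hedged sketch of the kind of per-line parsing you would hand to the core RDD API for a tab-separated question (e.g. `sc.textFile(path).map(parse_order)`); the field layout is invented.

```python
# Hypothetical tab-separated layout: order_id \t customer \t amount.
def parse_order(line):
    order_id, customer, amount = line.split("\t")
    return (int(order_id), customer, float(amount))

# Plain-Python stand-in for rdd.map(parse_order) on two sample lines.
sample = ["1\talice\t100.0", "2\tbob\t200.0"]
records = [parse_order(line) for line in sample]
```

The same split-and-convert function works unchanged inside an RDD `map`, which is why tab-separated questions are doable with core Spark alone.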

Wow … that's crazy, writing all the code without a template in 2 hours? People have a hard time finishing even with templates on that kind of screen. Maybe the Hortonworks cert is better to go for. And with 8 questions on Spark, they should call it a Spark certification.


I have created a new mail account.
Anyone having questions can shoot me a mail; I hope I can help.


It is the same even in the Hortonworks certification. I was not able to clear the exam. It is hard to finish all the questions in the stipulated time of 2:15 hours. Even a user guide was not provided, and the tag popup was also not available. 5 out of 8 were required to pass.

Hi @itversity I was studying according to the old YouTube playlist of 92 videos, and now I have learned about the syllabus change. I saw that the itversity YouTube channel has a new playlist as per the revised syllabus. I have already studied HDFS commands, Sqoop, and Flume from the old playlist. Should I go through them again in the new playlist?


If you have practiced enough and are comfortable with those topics, then you don't need to.

@sammy12 Can you please tell us when you took the exam?
How many questions are required to pass?

Thanks a lot for your information, but it looks like your email is incorrect.
Could you please provide the correct email address?

@itversity and @sammy12 and others -

I have gone through almost 50 videos of the 92-video playlist on YouTube by itversity in preparation for CCA 175 (i.e. the old playlist). Could someone please tell me which videos from the new playlist I could skip, so that I do not have to rewatch or spend time going over things I have previously done? I have completed everything through ingestion and PySpark (if that description helps).

I really need help. I am in a bit of a time-crunch situation, and it would mean a lot to save some precious time here. Thanks

Hello everyone,

Does anyone have answers to any of these questions?

  1. Do we need to code all answers from scratch, without any template?
  2. What kind of docs are provided in the Cloudera cluster (and are they useful to consult during the certification, in your opinion)?
  3. What kind of questions are usually asked about Sqoop/Flume?
  4. Are all Spark questions based on reading/writing data from/to HDFS, with or without a compression codec (and no questions about aggregating/filtering/analyzing data)?
  5. What IDE can we use during the exam? (For example, where can we write Spark code?)

Any help would be kindly appreciated.
Thank you.