Cleared CCA175 on 02-Oct



I’ve cleared CCA175 with 8/9 today. A big thanks to Durga and the @itversity team for the wonderful course content and the timely support they provided for lab issues.
I would definitely say the latest playlist is enough to clear the exam. In fact, Durga has covered most of the concepts in depth.

  • Exam pattern was no different from what others have mentioned in this forum.

  • Focus was more on delimiters and file I/O formats. All transformations were at an ‘easy’ complexity level. I was able to complete the exam in 1 hour 40 minutes and used the last 20 minutes to verify the solutions.

  • I would strongly recommend that test takers practice sufficiently in the CDH VM environment before taking the exam.
    I found it a bit difficult to get used to the exam interface, as the fonts were too small and the remote machine was a bit slow.

  • Time management is key, and you will make it ONLY if you have had enough hands-on practice.

  • Do not depend on online documentation; I don’t think you will have time for it during the exam.

  • I also practiced problems from Arun’s blog. If you are able to solve them without referring to any documentation, in my opinion you can clear the exam easily.

  • Got the score card within 1 hr from the time of submission.


Congratulations @BalaGanesh on earning one of the most valuable certifications in the big data industry.


Heartiest congratulations @BalaGanesh


Hello, did you get the certificate yet?


Not yet. The email from Cloudera mentioned that it takes 2-3 working days, so I expect it today.


Congratulations @BalaGanesh. I just want to know if I can solve all the Spark questions with Spark SQL, or whether there may be situations where we have to use RDDs?


There might be situations where an RDD will be easier than Spark SQL, and vice versa. You need to assess each situation and decide within the limited time frame. No one can give you that answer but your own understanding and the time you have in the exam.


I agree with @dsu_barroha. It depends on the questions you get. But if you have decided to go only with Spark SQL, then I suggest you stick with that approach in the exam. If you start debating which approach to use during the exam, you might end up confusing yourself. I had a similar situation :slight_smile: If you plan to use Spark SQL and you are required to use a text file as the source, then you can create the DF without a case class, and that might save some time. I used this approach in my exam.


@BalaGanesh Thank you. But for Spark SQL we have to create a case class, right? For example, if there is a data frame with _1, _2, _3 created without a case class, and I register it as a temp table, then the query sqlContext.sql("select * from tablename where _1 < 100") does not work.


Hi. I also completed the certification on the 2nd. Have you got the certificate?


Hi @Vignesh_Raja, yes, I got the certificate on 5th October. I wrote to the certification support team at Cloudera and received the email within an hour.


@Subhashini_Balu, below is an example of how to form the DF for the orders data without a case class. (This code snippet can still be optimized, though.) Naming the columns via toDF is what lets you reference them in SQL instead of _1, _2, etc.

import sqlContext.implicits._  // needed for toDF on RDDs (Spark 1.x)

// Assumed field layout of the orders data: order_id,order_date,order_customer_id,order_status
val ordersDF = sc.textFile("/public/retail_db/orders/").
  map(_.split(',')).
  map(a => (a(0).toInt, a(1), a(2).toInt, a(3))).
  toDF("order_id", "order_date", "order_customer_id", "order_status")


Try this and let me know if you have any questions.
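To see what the map step above produces, here is a minimal pure-Scala sketch of the same parsing, without Spark. The sample record and the field layout (order_id, order_date, order_customer_id, order_status) are illustrative assumptions:

```scala
// Hypothetical sample record from the orders data
val rec = "1,2013-07-25 00:00:00.0,11599,CLOSED"

// Split once, then build the tuple, converting the numeric fields
val fields = rec.split(',')
val parsed = (fields(0).toInt, fields(1), fields(2).toInt, fields(3))

println(parsed._1)  // order id as Int
println(parsed._4)  // order status as String
```

Splitting once per record instead of calling split inside every tuple element avoids repeating the same work for each field.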



Congratulations @BalaGanesh and @Vignesh_Raja.

Did any of you need to use Hue for creating files in HDFS in CDH?


@Jyutika_Kathe I did not use Hue in the exam and I don’t think you will need it.


Thanks @BalaGanesh

One more question.

Are the following outputs the same or different? Will Cloudera mark both as correct?




There is just a difference of brackets around the records.


Congrats @BalaGanesh
I just want to know if there were any questions related to Spark Streaming…??


@Jyutika_Kathe They are different. If you write the tuple to the output file as-is, you get the value surrounded by brackets (I believe you have not formed a string in this case). The second is what you get when you retrieve the tuple elements and write a comma-separated string.

val orderSource01 = orderSource.
  filter(rec => rec.split(',')(3) == "CLOSED").
  map(rec => (rec.split(',')(0), rec.split(',')(3)))

When you write orderSource01 to an output text file, you will get each record wrapped in brackets, in the form (order_id,order_status):


val orderSource01 = orderSource.
  filter(rec => rec.split(',')(3) == "CLOSED").
  map(rec => (rec.split(',')(0), rec.split(',')(3))).
  map(rec => rec._1 + "," + rec._2)

When you write orderSource01 with the additional transformation above to an output text file, you will get plain comma-separated values, with no brackets:


In the exam, the output format will be clearly stated, so you don’t have to worry about that. But you might have to be careful with the field ordering requirements in the output, i.e. which field should be written first in the result.
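The bracket difference comes down to how a Scala tuple is turned into text: saveAsTextFile calls toString on each record, and Tuple2.toString includes the parentheses. A small Spark-free sketch with illustrative values:

```scala
// A record as a tuple, e.g. (order_id, order_status)
val pair = ("1", "CLOSED")

// Writing the tuple as-is: toString keeps the parentheses
val withBrackets = pair.toString

// Building a comma-separated string first: no parentheses
val asCsv = pair._1 + "," + pair._2

println(withBrackets)  // (1,CLOSED)
println(asCsv)         // 1,CLOSED
```

This is why the extra map that joins the tuple elements into a string changes the file contents.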

CCA175 - Clarification on the output format

@Yeswanth No, I didn’t get any questions on Spark Streaming.


Thanks a lot @BalaGanesh for these clarifications.


@BalaGanesh could you please share your email id if you don’t mind? I have my exam this coming Saturday and have some questions about it.