Cleared CCA175 with 8/9 score on 10/20

Cleared CCA175 with 8/9 score. Big thanks to this blog, Durga and Arun for providing guidance, content and sharing knowledge…
Like many others, I got 2 Sqoop and 7 Spark related questions… the questions were quite simple.
Arun’s problems on file formats and compression helped a lot. Practice is very important; without it, I think time management in the exam would be difficult. Practicing Durga’s scenarios from the videos and Arun’s problems should be good enough for the exam, but I also worked through a few online questions just to get better at interpreting the problem statements.
I found a few good questions at the link below:
https://github.com/HadoopThoughtworks/Materials (CCA175.docx)

The results were sent to my email address after 5-6 hours, although in general it should take just a couple of hours.

Congrats Santosh… awesome…

Congratulations bro…!!!
Didn’t you face any problems with compression in the exam cluster, such as Snappy compression on a text file? (VMs usually fail to do this.)
I have a few questions regarding the exam. Could you give me your contact details?

Congratulations santosh

Congrats @santosha7 ,

I have three questions:
1. Are we responsible for reading and writing compressed files using Spark?
2. Can we use any of RDDs, DataFrames or Spark SQL for the Spark related questions? (For example, if we get 7 Spark questions, can we solve all of them with Spark SQL?)
3. When we read Avro, sequence, or any other file formats, do we need to import packages ourselves during the exam?

Thanks,
Taner

Congratulations!! Please tell me one thing: do we need both Python and Scala for the certification, or is Scala alone enough?

Hi,

I have the same issue with the Cloudera VM version I am using. According to a few blogs, Snappy compression on text files should work fine, and I was hoping it would work during the certification exam, but luckily I didn’t face that kind of question.

Hi Taner,

  1. Yes, we are responsible for reading and writing with the required compression technique. The question mentions the compression used on the file if you have to read it, or the compression you have to apply when writing it to HDFS.
  2. Yes, we can use any of those options to solve the problem, as long as the output is in the format specified in the problem.
  3. Yes, we are responsible for importing the required packages.
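
To make point 1 concrete, here is a minimal PySpark sketch of writing text output with a chosen compression codec. It is only an illustration, not something taken from the exam: the helper name `save_text_compressed`, the `CODECS` table and the paths in the usage note are hypothetical. The codec class names are the standard Hadoop ones, and `saveAsTextFile` takes the codec through its `compressionCodecClass` argument.

```python
# Hadoop codec classes keyed by a short name (the short names are my own labels).
CODECS = {
    "gzip": "org.apache.hadoop.io.compress.GzipCodec",
    "snappy": "org.apache.hadoop.io.compress.SnappyCodec",
    "bzip2": "org.apache.hadoop.io.compress.BZip2Codec",
}


def save_text_compressed(rdd, path, codec="gzip"):
    """Write an RDD of strings to `path` as text files compressed with `codec`.

    `rdd` is expected to be a PySpark RDD; `path` must not exist yet.
    An unknown codec name raises KeyError before anything is written.
    """
    rdd.saveAsTextFile(path, compressionCodecClass=CODECS[codec])
```

In the `pyspark` shell, where `sc` is already defined, this could be called as `save_text_compressed(sc.textFile("/some/input"), "/some/output", "gzip")`, with both paths being placeholders. Whether the Snappy codec works for text output depends on the native libraries available on the cluster, which is the VM issue discussed above.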

The certification requirements mention that you have to know both Python and Scala, but going by the success stories on this blog, either one should be good enough for the exam. I know only Scala.

Hi Santosh, how much time is needed for preparation, given that I only know Java and I am new to everything in this course?

Hello,

Congratulations!
Will they provide the MySQL credentials? In the question, or somewhere else?

Hi Gaurav,

As far as I know, it should take 2 to 3 months to learn everything and appear for the certification, assuming you are a newbie to these technologies. But learning speed differs from person to person, so you might pick these up much quicker than others.

Hi Sai,

Yes, the database connection details and credentials will be clearly mentioned in every question where MySQL is required.

Hi Santosh,

Congratulations on your certification.

Could you please confirm which libraries we need to import in the certification exam?
Also, if I skip preparing for Spark Streaming and Flume… will that be OK?

Regards,
Varun Dewan

Congrats Santosh !!!

A few queries:

  1. For the Spark questions, do we have to write Scala/Python code, create a jar file and then spark-submit it on the cluster, OR is writing and executing in the REPL enough?
  2. I heard shell scripts are provided during the certification exam to execute Spark code. Please clarify.
  3. For the Spark questions, I heard code templates are provided. Are they only in Scala or Python, OR a mix of both? Basically, if a candidate does not know one language, will there be any problem in understanding the template?

Thanks in advance :)

Can someone explain what this regex does?
(?=(?:[^"]"[^"]")[^"]$)
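
A hedged attempt at an answer, sketched in Python. Taken exactly as posted, the pattern is a zero-width lookahead that only succeeds where the rest of the string is one non-quote character, a `"`, one non-quote character, another `"`, and one last non-quote character. That is rarely useful, so my guess (an assumption, not something confirmed in this thread) is that the forum formatting swallowed the `*` quantifiers and a leading comma, and that the intended pattern is the common idiom `,(?=(?:[^"]*"[^"]*")*[^"]*$)` for splitting a CSV line on commas that are outside double quotes. The lookahead in that idiom requires an even number of `"` between the comma and the end of the line, which means the comma is not inside a quoted field. The names `AS_POSTED`, `SPLIT_OUTSIDE_QUOTES` and `split_csv_line` are mine.

```python
import re

# The pattern exactly as posted: matches an empty string at a position
# followed by  <char> " <char> " <char> <end of string>.
AS_POSTED = re.compile(r'(?=(?:[^"]"[^"]")[^"]$)')

# ASSUMED intended pattern (quantifiers and leading comma restored by guess):
# a comma followed by an even number of double quotes up to the end of line.
SPLIT_OUTSIDE_QUOTES = re.compile(r',(?=(?:[^"]*"[^"]*")*[^"]*$)')


def split_csv_line(line):
    """Split on commas that are not inside a double-quoted field."""
    return SPLIT_OUTSIDE_QUOTES.split(line)


fields = split_csv_line('1,"Smith, John",NY')
print(fields)  # ['1', '"Smith, John"', 'NY']
```

The comma inside `"Smith, John"` is left alone because only one `"` follows it, so the lookahead fails there. The quotes themselves stay in the field and would need to be stripped separately.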