CCA 175: 9/9 (scored 100%). Finished my exam 15 minutes early

#1

All,

I cleared my CCA-175 exam on 7/24/2019 and scored 100%.

Question pattern: Sqoop import: 1, Sqoop export: 1, Hive: 1, Spark: 6

Preparation: 1) Practiced the ITversity practice tests (Udemy), Arun's blog, and Navdeep Kaur's practice tests (Udemy) a number of times.
2) Used the DataFrame method to solve all Spark-related problems.
3) Spent 5 minutes reading all the questions and ordering them before attempting (easy questions first).

A few tips: 1) Master basic transformations like concat, substring, concat_ws, date formatting, regexp_replace, translate, etc. (see the sketch after this list).
2) Memorize the compression codecs, so you don't lose time searching core-site.xml for the compression details.
3) The Sqoop import/export questions were pretty straightforward.
4) I took the exam on a weekday (Wednesday), thinking it would avoid the cluster slowness of weekends. During my exam the cluster response was good.
5) Master converting from one file type to another (refer to Arun's blog, problem #4).
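A minimal sketch of tips 1 and 2 in Spark 1.6 Scala (the exam's Spark version at the time, as I understand it); the DataFrame df, its columns, and the output path are hypothetical:

import org.apache.spark.sql.functions._
import sqlContext.implicits._   // for the $"col" syntax (auto-imported in spark-shell)

// hypothetical df with columns first_name, last_name, phone, order_date
val shaped = df.select(
  concat($"first_name", lit(" "), $"last_name").alias("full_name"),   // concat
  concat_ws("|", $"first_name", $"last_name").alias("piped_name"),    // concat_ws
  substring($"phone", 1, 3).alias("area_code"),                       // substring
  date_format($"order_date", "yyyy-MM").alias("order_month"),         // date formatting
  regexp_replace($"phone", "[^0-9]", "").alias("digits_only"),        // regexp_replace
  translate($"first_name", "aeiou", "AEIOU").alias("vowels_upper"))   // translate

// tip 2 in action: write compressed text output without digging through core-site.xml
shaped.rdd.map(_.mkString(",")).saveAsTextFile(
  "/user/you/out", classOf[org.apache.hadoop.io.compress.GzipCodec])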

Special thanks to ITversity and all the members who have shared their exam experiences here; that information helped me a lot in planning my exam.

Thanks to Sofia and Soumyajit_De, who helped me clarify my doubts.

I am happy to help if you have any questions.

Prasant :slight_smile:



#2

Hi Prasant,

How were you able to convert from an RDD to a DF for the questions where they didn't give you the schema of the table?


#3

Ifraz,

In the exam, schema details will be given, so you can convert the RDD to a DF.

As for your question: if there is no schema, you have to view the data and create your own schema.
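For example, a minimal sketch of building your own schema after looking at the data, assuming Spark 1.6 with sc and sqlContext available (the field names and path are hypothetical):

import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{StructType, StructField, IntegerType, StringType, DoubleType}

val raw = sc.textFile("/user/you/orders")     // eyeball it first: raw.take(5).foreach(println)
val rows = raw.map(_.split(",")).map(a => Row(a(0).toInt, a(1), a(2).toDouble))
val schema = StructType(Seq(
  StructField("order_id", IntegerType),
  StructField("order_status", StringType),
  StructField("order_total", DoubleType)))
val df = sqlContext.createDataFrame(rows, schema)
df.printSchema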


#4

Congratulations, Prasant!

I have one question about the data format.
I failed my exam yesterday; two questions were marked wrong with "Records contain incorrect data". I was wondering if the reason is that I saved the result in CSV format with the required delimiter, but not in text format.
My question is: if we are asked to save the result in a text file with some delimiter, do we need to save it as a text file, or can we also save it as a CSV file (it's a kind of plain text file)?

Best regards,
JJ


#5

You need to use data.saveAsTextFile("problem3/location").
Do not include .csv at the end; just save to the directory.


#6

Do you view the data in MySQL, where you can see the schema? Or in Spark, trying to predict what the data type of each column would be (e.g., price = double)?

Also, thank you for replying; I appreciate it.


#7

Hello JJ,

If the question asks for a text file with a pipe ("|") delimiter but you stored it as a CSV file (CSV = comma-separated values, i.e., the delimiter is a comma (",")), then it's wrong.


#8

The question was something like: save the result as a text file with tab as the delimiter.
I used:
df.write.format("csv").option("sep", "\t").save("/some/path")
I'm not sure whether it is right or not. In the ITversity practice exercises on Udemy, this is the way the result is saved as a text file.
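(One note, hedged since I can't verify the exam setup: whether .format("csv") resolves depends on the Spark version. The path here is hypothetical.)

// Spark 2.x: csv is a built-in source, so this works as-is
df.write.option("sep", "\t").csv("/some/path")

// Spark 1.6: csv comes from the spark-csv package, e.g. launched with
//   --packages com.databricks:spark-csv_2.10:1.5.0
df.write.format("com.databricks.spark.csv").option("delimiter", "\t").save("/some/path")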


#9

df.rdd.map(f => f.mkString("\t")).saveAsTextFile("/your_path")

This will work. This way we can use whatever delimiter we want and save the result as a text file.


#10

Thank you, Kanna!

Yes, this syntax will work: df.rdd.map(f => f.mkString("\t")).saveAsTextFile("/your_path") will write a text file with tab delimiters.

JJ: I have not tested this syntax: df.write.format("csv").option("sep", "\t").save("/some/path"). Please verify it.


#11

Can you please answer my question on what to do if there is no schema given?

Do you view the data in MySQL, or in HDFS using -cat?

Any help will be greatly appreciated.

Thank you in advance :slight_smile:


#12

Ifraz,

Use hadoop fs -cat /path/to/file | head -10 to view the data, then prepare your schema. I assume you are talking about a text file.

If it's an Avro or Parquet file, both carry the schema details as part of the data file. Just read it like below if you are using Spark 1.6:

val df = sqlContext.read.parquet("/path/to/data")   // for Avro, see the sketch below

df.printSchema

It will give you the schema details.
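For Avro on Spark 1.6, a minimal sketch, assuming the spark-avro package is on the classpath (e.g. spark-shell --packages com.databricks:spark-avro_2.10:2.0.1; the path is hypothetical):

import com.databricks.spark.avro._

val avroDf = sqlContext.read.avro("/path/to/avro")   // schema is embedded in the file
avroDf.printSchema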


#13

Hi - a few questions, please reply:

  1. Where did you take your exam: at home with normal internet speed, or at the office with higher bandwidth?

  2. Was there any lag or slowness in the VM while taking the exam? Was it the same as using the ITversity labs?

  3. Did you get a link to log in to the exam terminal before the exam? Or should we log on to some website with user credentials?

  4. Was any laptop configuration check done before the start of the exam?

I am planning to take my exam next Tuesday; the above points were still unclear from the stories I have read so far.

Thanks,
Shankar.V


#14

Hello Venkat,

Find my replies inline.

  1. Where did you take your exam: at home with normal internet speed, or at the office with higher bandwidth?
    [ps]: I took the exam at home with 75 Mbps internet. It's better to have higher bandwidth.

  2. Was there any lag or slowness in the VM while taking the exam? Was it the same as using the ITversity labs?
    [ps]: Comparing the ITversity labs with the Cloudera exam VM, I felt the Cloudera VM was a bit slow.

  3. Did you get a link to log in to the exam terminal before the exam? Or should we log on to some website with user credentials?
    [ps]: You have to log in at https://www.examslocal.com/ for the exam.

  4. Was any laptop configuration check done before the start of the exam?
    [ps]: Before the exam, do a compatibility check of your browser, internet speed, etc. A compatibility check option is available on the exam portal.

prasant


#15

Thanks much, mate! Appreciate your timely response. Cheers


#16

Hi Prasant,

Could you please help me with saving a dataframe to a text file with a specific delimiter using pyspark2? I am facing an issue with a Unicode error.

Below is the code I ran to save the dataframe to a text file:

h1b.repartition(1).rdd. \
    map(lambda line: "|".join([str(x) for x in line])). \
    saveAsTextFile('/user/user_folder/data/TAug1')

The job is failing with a Unicode encode error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xc9'

I'm afraid I have seen a couple of posts on this and none of them has an answer. I kindly request you to suggest a proper way of handling this error.
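(A hedged sketch of one common workaround, assuming Python 2 PySpark: str(x) on a unicode value containing non-ASCII characters raises exactly this error, so build the joined line as unicode instead. h1b is the dataframe from the post above; the output path is hypothetical.)

# join as unicode; PySpark's saveAsTextFile writes unicode strings out UTF-8 encoded
h1b.repartition(1).rdd. \
    map(lambda line: u"|".join([unicode(x) for x in line])). \
    saveAsTextFile('/user/you/data/TAug1_utf8')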


#17

Hi Sudheer,

I took the exam using Scala.
Please check with other members of the group who took the exam using Python.

Thanks
prasant


#18

You are welcome… All the best for your exam, Shankar!!


#20

hi Prasant,

Thank you for your reply.

I was wondering: did you, or anyone to your knowledge, get a question to write data into a sequence file? If so, while using a DataFrame, how did you save it into a sequence file?

Many Thanks,
Sudheer.


#21

Sudheer,

I did not get a single question on writing to a sequence file; please check with others in this group.

prasant…
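(A hedged sketch of one way to write a DataFrame out as a sequence file in Scala, since the DataFrame writer has no built-in sequence-file format and you have to go through the RDD API; the key/value choice, delimiter, and path are hypothetical.)

// first column as the key, the tab-delimited row as the value
df.rdd.map(r => (r(0).toString, r.mkString("\t"))).
  saveAsSequenceFile("/user/you/seq_out")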
