Cleared CCA 175 on Aug 11


#1

I cleared CCA 175 on Aug 11. I got 9 questions and answered 8 of them correctly. The 9 questions were two Sqoop questions and seven Spark questions. I would like to thank Durga sir for his course and materials. Practice is key: I finished with 10 minutes to spare and was able to verify most of my answers. The ITVersity labs and simulator were very helpful!



#2

Congrats, Karthik. Did you get any questions on Spark Streaming?


#3

No Spark Streaming questions for me.


#4

Hi Karthik, I was wondering whether you practiced using Scala or Python?


#5

I practised in Scala


#6

Thank you! Do we get an option during the test to choose the language, or do certain questions require Scala or Python?


#7

There is no option to choose. In fact, they only care about the output, not the code, so you can use whichever language you prefer.


#8

I had the worst experience with my exam today. The screen hung multiple times, which wasted nearly 25 minutes. No information about the Avro package location was shared. I tried to import it using import avro.schema, but this does not work. The examiner was very reluctant to share the Avro package location in Cloudera. Did anyone face Avro questions? How did you import the Avro packages in the exam?


#9

Sorry you had a bad experience. The proctor is not supposed to provide you with this information. He or she is from a third party and does not understand these technical details.

For spark-shell you need to run this command:
import com.databricks.spark.avro._

For pyspark you need to run this:
df = sqlContext.read.format("com.databricks.spark.avro").load("filePath")
df.write.format("com.databricks.spark.avro").save("output dir")
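
To make the pyspark read and write steps easier to reuse, here is a minimal sketch that wraps them in one function. It assumes the sqlContext object that the pyspark shell creates for you. The names copy_avro, input_path and output_path are made up for illustration, and the launch command in the comment (including the package version) is an assumption that you should check against your own cluster.

```python
# Data source name used in the commands above.
AVRO_FORMAT = "com.databricks.spark.avro"

# Assumption: if the shell cannot find this data source, the package may not
# be on the classpath. One thing to try is starting the shell with the package,
# for example (the version shown is an assumption, match it to your cluster):
#   pyspark --packages com.databricks:spark-avro_2.10:2.0.1


def copy_avro(sql_context, input_path, output_path):
    """Read an Avro dataset and write it back out as Avro.

    sql_context is the sqlContext object available in the pyspark shell.
    copy_avro, input_path and output_path are hypothetical names.
    """
    # Read: name the data source as a string; no Python import is involved.
    df = sql_context.read.format(AVRO_FORMAT).load(input_path)
    # Write: the same data source name is used on the writer side.
    df.write.format(AVRO_FORMAT).save(output_path)
    return df
```

In the pyspark shell this would be called as copy_avro(sqlContext, "filePath", "output dir"). Note that the Scala line import com.databricks.spark.avro._ is not Python syntax, so it only applies to spark-shell.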

This information is available here:


#10

Sorry you had a bad experience. For CCA 175, the Avro package is available by default in spark-shell. You just need to run

import com.databricks.spark.avro._

and then work with the Avro packages.


#11

Thanks, Karthik. I tried both import avro.schema and import com.databricks.spark.avro._ in the pyspark shell, and both gave me a no module found error.


#12

Thank you, Mayank. How do I do the same in the pyspark shell?


#13

Please check my post above


#14

Hi Mayank, I tried the command as you suggested:

sqlContext.read.format("com.databricks.spark.avro").load("/user/krishmani/arun/problem5/avro")

: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.avro. Please use Spark package http://spark-packages.org/package/databricks/spark-avro
at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:72)
at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:102)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
at py4j.Gateway.invoke(Gateway.java:259)