I am using the command below when launching pyspark to read and write files in Avro format.
pyspark2 --master yarn --conf spark.ui.port=12709 --packages com.databricks:spark-avro_2.11:4.0.0
In the certification exam, will the same package be available if a question specific to Avro comes up? If not, where in the Spark configuration can I find out which Avro package to use?
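For context, this is roughly how the package wiring works at launch time. The coordinates follow the Maven pattern `<groupId>:<artifactId>_<scalaVersion>:<version>`, so the `2.11` suffix is the Scala build and `4.0.0` is the spark-avro release; adjust both to match your cluster. The port number and launcher name below simply mirror the command in the question:

```shell
# Launch pyspark with the external Databricks Avro package resolved from Maven.
# Equivalent to setting the spark.jars.packages configuration property.
pyspark2 --master yarn \
  --conf spark.ui.port=12709 \
  --packages com.databricks:spark-avro_2.11:4.0.0
```

Inside the resulting session, the data source can then be addressed by its full name, e.g. `spark.read.format("com.databricks.spark.avro").load(path)`. Passing `--conf spark.jars.packages=com.databricks:spark-avro_2.11:4.0.0` is an equivalent way to declare the same dependency, which may help if the exam environment restricts which launch flags you can use.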