Spark2-shell save data in avro file Error

Hi Team,

I am trying to save the data as an Avro file. I am using spark2-shell. Below is the code:

spark2-shell --packages com.databricks:spark-avro_2.11:3.2.0

import com.databricks.spark.avro._

val words = sc.textFile("/public/randomtextwriter")
val word = words.flatMap(x => x.split(" ")).map(x => (x, 1)).reduceByKey((total, value) => total + value, 8).toDF()
word.write.format("avro").save("/user/kishoresoft/solutions/solution05/wordcount")
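As an aside, the word-count transformation itself can be sanity-checked on plain Scala collections before running it on the cluster; the sample lines below are hypothetical stand-ins for the contents of /public/randomtextwriter:

```scala
// Hypothetical sample input standing in for /public/randomtextwriter
val lines = Seq("spark avro spark", "avro file")

// Mirrors the RDD pipeline: flatMap on spaces, pair each word with 1,
// then sum the counts per word (the collections analogue of reduceByKey)
val counts = lines
  .flatMap(_.split(" "))
  .map(w => (w, 1))
  .groupBy(_._1)
  .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

// counts contains ("spark" -> 2, "avro" -> 2, "file" -> 1)
```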

I am getting this error:

org.apache.spark.sql.AnalysisException: Failed to find data source: avro. Please find an Avro package at http://spark.apache.org/third-party-projects.html;

Can anyone help me with this? Thanks!

Try replacing "avro" with the full data source name: com.databricks.spark.avro
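A minimal sketch of that fix, assuming the same spark2-shell session launched with --packages com.databricks:spark-avro_2.11:3.2.0 and the word DataFrame from the original post:

```scala
// Before Spark 2.4 there is no built-in Avro data source, so the short
// name "avro" is not registered; use the full class name provided by
// the spark-avro package instead:
import com.databricks.spark.avro._

word.write
  .format("com.databricks.spark.avro")
  .save("/user/kishoresoft/solutions/solution05/wordcount")
```

With the import in scope, the package also adds an avro convenience method on DataFrameWriter, so word.write.avro(path) is an equivalent spelling.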

I have a question about Avro files in the certification exam. Do we need to write the full name (com.databricks.spark.avro) in DF.write.format(" ") or can we just write DF.write.format("avro") during the exam? Thanks!

Hi Nishikant,

I tried with the full package name. Below is the code:

val prds = sc.textFile("/user/kishoresoft/problem2/products/")
val products = prds.map(x => (x.split('|')(0).toInt, x.split('|')(1).toInt, x.split('|')(4).toFloat)).toDF("product_id", "category_id", "product_price")

val df = products
  .filter("product_price < 100")
  .groupBy(col("category_id"))
  .agg(
    max(col("product_price")).alias("max_price"),
    min(col("product_price")).alias("min_price"),
    round(avg(col("product_price")), 2).alias("avg_price"),
    countDistinct(col("product_id")).alias("total_products")
  )
  .orderBy(col("category_id"))

import com.databricks.spark.avro._

spark.sqlContext.setConf("spark.sql.avro.compression.codec", "snappy")

df.coalesce(1).write.format("com.databricks.spark.avro").save("/user/kishoresoft/problem2/products/result-df")
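As a side check, the per-category aggregation can be verified on plain Scala collections before running it against HDFS; the rows below are made-up samples following the (product_id, category_id, product_price) layout in the code above:

```scala
// Made-up sample rows: (product_id, category_id, product_price)
val rows = Seq((1, 10, 50.0f), (2, 10, 80.0f), (3, 11, 20.0f), (4, 10, 150.0f))

// Mirrors the DataFrame query: keep prices < 100, group by category,
// then compute max/min/avg price and the distinct product count
val stats = rows
  .filter(_._3 < 100)
  .groupBy(_._2)
  .map { case (categoryId, rs) =>
    val prices = rs.map(_._3)
    // round the average to 2 decimal places, like round(avg(...), 2)
    val avgPrice = math.round(prices.sum / prices.size * 100) / 100.0
    categoryId -> (prices.max, prices.min, avgPrice, rs.map(_._1).distinct.size)
  }

// stats(10) == (80.0f, 50.0f, 65.0, 2): product 4 is dropped by price < 100
```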

I am getting the below error.

Are there any changes I need to make?

Thanks!