java.lang.NumberFormatException: while doing df.show()

apache-spark
scala

#1

I am getting java.lang.NumberFormatException: For input string: "Product_id" and "Product_Price"

Dataset:
Product_id,Product_name,Product_Category,Product_Brand,Product_Price
1,KitKat,Chocolate,Nestle,$2
2,Diet Coke,Carbonated Drink,Coca-Cola,Rs.20
3,Lays Chips,Snacks PepsiCo,GBP15

Created RDD:
val productsRDD = sc.textFile("/user/mraheemabdul/datasets/product.txt")

Created DF:
val productsDF = productsRDD.map(products => {(products.split(",")(0).toInt, products.split(",")(1), products.split(",")(2), products.split(",")(3), products.split(",")(4).toFloat)}).toDF("Product_id", "Product_name", "Product_Category", "Product_Brand", "Product_Price")

When I run productsDF.show(), the java.lang.NumberFormatException occurs.

How can I fix this?


#2

Is anyone able to look into this?


#3

This line has only four fields, while the two above it have five. Can you check the file once?
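
One way to check the file for this kind of problem is to count the fields on each line. Below is a minimal sketch in plain Scala, so it can be tried without a cluster. The sample lines are copied from the question; the names `lines`, `malformedLines` and `bad` are hypothetical and only used for illustration.

```scala
// Sample lines copied from the question; in Spark these would come from sc.textFile(...)
val lines = Seq(
  "Product_id,Product_name,Product_Category,Product_Brand,Product_Price",
  "1,KitKat,Chocolate,Nestle,$2",
  "2,Diet Coke,Carbonated Drink,Coca-Cola,Rs.20",
  "3,Lays Chips,Snacks PepsiCo,GBP15"
)

// Hypothetical helper: return (1-based line number, line) for every line
// whose field count differs from the expected one.
def malformedLines(lines: Seq[String], expectedFields: Int): Seq[(Int, String)] =
  lines.zipWithIndex.collect {
    // limit -1 keeps trailing empty fields, so "a,b," counts as 3 fields
    case (line, idx) if line.split(",", -1).length != expectedFields => (idx + 1, line)
  }

// Lines that do not have exactly five fields
val bad = malformedLines(lines, expectedFields = 5)
```

On an RDD the same check could be written as a filter, for example `productsRDD.filter(_.split(",", -1).length != 5)`.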


#4

3,Lays Chips,Snacks,PepsiCo,GBP15

This is the correct record. I tried with it too.


#5

@mraheem_hadoop You have to create a case class

case class products(Product_id:String, Product_name:String, Product_Category:String, Product_Brand:String ,Product_Price:String)

Then create the DataFrame.
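
To make that step more concrete, here is a minimal sketch of parsing each line into a case class, written in plain Scala so it can be tried outside Spark. The names `Product`, `parseProduct`, `sample` and `parsed` are hypothetical (the case class above is called `products`). Returning `None` for the header row and for lines that do not have five fields is an assumption, not something stated in this thread. The price is kept as a `String` because values such as `$2` and `Rs.20` are not plain numbers.

```scala
import scala.util.Try

// Hypothetical typed record; the price stays a String because of the currency prefixes
case class Product(
  Product_id: Int,
  Product_name: String,
  Product_Category: String,
  Product_Brand: String,
  Product_Price: String
)

// Hypothetical helper: split the line once and return None when it does not fit
def parseProduct(line: String): Option[Product] =
  line.split(",", -1) match {
    case Array(id, name, category, brand, price) =>
      // Try(...).toOption turns the NumberFormatException for "Product_id" into None
      Try(id.trim.toInt).toOption.map(n => Product(n, name, category, brand, price))
    case _ => None // wrong number of fields
  }

// Header row, one valid record, and the four-field record from the question
val sample = Seq(
  "Product_id,Product_name,Product_Category,Product_Brand,Product_Price",
  "1,KitKat,Chocolate,Nestle,$2",
  "3,Lays Chips,Snacks PepsiCo,GBP15"
)

// Only the valid record survives
val parsed = sample.flatMap(line => parseProduct(line))
```

In spark-shell, where the needed implicits are already imported, something like `productsRDD.flatMap(line => parseProduct(line)).toDF()` should give a DataFrame whose column names come from the case class fields.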


#6

OK, here is what I am doing…

Created Case Class

case class products(Product_id:String, Product_name:String, Product_Category:String, Product_Brand:String ,Product_Price:String)

Created RDD
val productsRDD = sc.textFile("/user/mraheemabdul/datasets/product.txt")

Creating DF from above RDD
val productsDF = productsRDD.map(products => {(products.split(",")(0).toInt, products.split(",")(1), products.split(",")(2), products.split(",")(3), products.split(",")(4).toFloat)}).toDF("Product_id", "Product_name", "Product_Category", "Product_Brand", "Product_Price")

Doing show()
productsDF.show()
// This is where I get the NumberFormatException
productsDF.registerTempTable("product")

Where am I going wrong?


#7

@mraheem_hadoop Here is the code I tried:

val data = sc.textFile("/user/annapurnachinta/datset")
val header = data.first()
val data1 = data.filter(row => row != header)
case class products(Product_id:String, Product_name:String, Product_Category:String, Product_Brand:String ,Product_Price:String)
val productsDF = data1.map(products => {(products.split(",")(0).toInt, products.split(",")(1), products.split(",")(2), products.split(",")(3), products.split(",")(4).toString)}).toDF("Product_id", "Product_name", "Product_Category", "Product_Brand", "Product_Price")
productsDF.show()

output: