How to insert a Spark Structured Streaming DataFrame into a Hive table

spark-streaming


I have been trying out some Spark Structured Streaming examples.

Here is my example:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.StructType

val spark = SparkSession.builder()
  .appName("StatsAnalyzer")
  .enableHiveSupport()
  .config("hive.exec.dynamic.partition", "true")
  .config("hive.exec.dynamic.partition.mode", "nonstrict")
  .config("spark.sql.streaming.checkpointLocation", "hdfs://pp/apps/hive/warehouse/dev01_landing_initial_area.db")
  .getOrCreate()

// Read the CSV files as a stream and register the result as a temporary view

val userSchema = new StructType().add("name", "string").add("age", "integer")

val csvDF = spark.readStream
  .option("sep", ",")
  .schema(userSchema)
  .csv("file:///home/su/testdelta")

csvDF.createOrReplaceTempView("updates")
val query = spark.sql("select * from updates")

query.writeStream
  .outputMode("append")
  .partitionBy("age")
  .format("csv")
  .option("path", "hdfs://pp/apps/hive/warehouse/dev01_landing_area.db/stats2")
  .start()

As you can see, the last step, which writes the data frame to the HDFS location, does not throw any error, but the data never appears in the existing directory (my existing directory already contains some old data partitioned by "age").
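One thing I am not sure about: start() only kicks off the query and returns immediately, so the driver has to stay alive for anything to be written. Here is a minimal sketch of the same write with the query held and awaited, in case that matters:

val streamingQuery = query.writeStream
  .outputMode("append")
  .partitionBy("age")
  .format("csv")
  .option("path", "hdfs://pp/apps/hive/warehouse/dev01_landing_area.db/stats2")
  .start()

// start() is asynchronous; block here so the query keeps running and any
// exception raised inside the stream (checkpoint, permissions, ...) is rethrown.
streamingQuery.awaitTermination()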

Can anyone help me understand why I am not able to insert data into the existing directory at the HDFS location? Or is there another way to do an "insert into" operation on a Hive table? Please help.
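For context, one pattern I have come across for doing a real "insert into" on a Hive table (instead of writing raw files under the warehouse directory) is foreachBatch, available since Spark 2.4: it hands each micro-batch over as an ordinary DataFrame, so the normal batch writer can be used. A rough sketch, assuming a Hive table dev01_landing_area.stats2 partitioned by age already exists (that table name is just my guess from the path above):

import org.apache.spark.sql.DataFrame

val hiveSink = csvDF.writeStream
  .outputMode("append")
  .foreachBatch { (batchDF: DataFrame, batchId: Long) =>
    // Each micro-batch is a plain DataFrame, so the regular batch writer
    // (and Hive dynamic partitioning) can be used here. insertInto resolves
    // columns by position, so the partition column "age" must come last.
    batchDF.write
      .mode("append")
      .insertInto("dev01_landing_area.stats2")
  }
  .start()

hiveSink.awaitTermination()

Would something like this be the right way to do it, or is there a simpler option?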