Is there any way to run a .hql or .hive file using PySpark? If so, could anyone help me with how to do that? I know that with spark-sql we can execute .hql files using the -i or -f options.

pyspark

#1

Hi, I am trying to figure out how to execute .hql or .hive files using PySpark. My query is very large. I wrote a program that reads the file and passes each statement to spark.sql.

I am looking for a way to execute the file itself instead of reading the file and parsing its statements. Does anyone know how to do that?
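Since spark-sql already supports -f (as noted above), one workaround that avoids parsing the file yourself is to shell out to spark-sql from Python. A minimal sketch, assuming spark-sql is on the PATH and the .hql path is a placeholder:

```python
import shutil
import subprocess


def build_spark_sql_cmd(hql_path):
    # spark-sql's -f option executes all statements in the given file
    return ["spark-sql", "-f", hql_path]


if __name__ == "__main__":
    cmd = build_spark_sql_cmd("/tmp/account_qry.hql")  # hypothetical path
    # Only attempt to run if the spark-sql CLI is actually installed
    if shutil.which("spark-sql"):
        subprocess.run(cmd, check=True)
```

The trade-off is that the file runs in a separate spark-sql session, so any temporary views or settings from your existing PySpark session are not visible to it.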

Thank you,
Sudheer


#2

I am looking for the same solution using Scala. Please help if anyone has tried it.
When I tried, I ran into the error below:
Scala:
scala> import scala.io.Source
import scala.io.Source

scala> val filename = "/home/rameshrajach/account_qry.hql"
filename: String = /home/rameshrajach/account_qry.hql

scala> for (line <- Source.fromFile(filename).getLines) {
| spark.sql(line)
| }

Error :
org.apache.spark.sql.catalyst.parser.ParseException:
extraneous input ';' expecting <EOF>(line 1, pos 29)

== SQL ==
select * from rramesh.account;
-----------------------------^^^

at org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:239)
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:115)
at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:48)
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(ParseDriver.scala:69)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:638)
at $anonfun$1.apply(<console>:40)
at $anonfun$1.apply(<console>:39)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
… 57 elided
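The ParseException above comes from the trailing ';': spark.sql() expects exactly one statement with no terminator, and iterating line by line would also break any statement that spans multiple lines. The usual fix is to read the whole file, split on ';', and strip blanks; a sketch in PySpark (the same idea works in Scala), assuming an active SparkSession named `spark`:

```python
def split_statements(hql_text):
    """Split a script into individual statements on ';', dropping empties.

    Note: a naive split breaks if ';' appears inside a quoted string
    literal, but it is fine for simple scripts like the one above.
    """
    return [s.strip() for s in hql_text.split(";") if s.strip()]


# Usage with an active SparkSession `spark` (path taken from the post):
# with open("/home/rameshrajach/account_qry.hql") as f:
#     for stmt in split_statements(f.read()):
#         spark.sql(stmt)
```

Each statement reaches the parser without a semicolon, which is exactly what the error message is complaining about.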