Error when trying to use HiveContext from PySpark

I get an error when running a SQL query such as select count(*) from testing, where testing is a table in the default schema.

I am using Cloudera CDH 5.8. When I try to copy hive-site.xml, or create a soft link to it, in Spark's conf directory, I get a "permission denied" error. I tried su cloudera, but that did not help either.

From the error stack it looks like Spark is trying to connect to an embedded Derby database to access the metastore, and fails to do so.

Please suggest what else I can do.
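For reference, a minimal sketch of the kind of PySpark session that hits this problem, assuming the Spark 1.x API that ships with CDH 5.8 (the app name and query are illustrative only):

```python
# Sketch, assuming Spark 1.x as bundled with CDH 5.8.
# HiveContext reads hive-site.xml from Spark's conf directory; when that
# file is missing, Spark falls back to a local embedded Derby metastore,
# which produces the Derby connection error described above.
from pyspark import SparkContext
from pyspark.sql import HiveContext

sc = SparkContext(appName="hive-context-example")  # the pyspark shell creates sc for you
sqlContext = HiveContext(sc)

# Fails with a Derby/metastore error when hive-site.xml is not on Spark's conf path
result = sqlContext.sql("select count(*) from testing")
result.show()
```

If this runs against Derby instead of the real Hive metastore, the tables defined in Hive will not be visible, which matches the symptom above.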


Running the command below in the Big Data lab causes an exception:

data = sqlContext.sql("select * from departments")

I am facing the same issue. Any solutions for this?

Hive's configuration is located in a separate directory, so try the following. Also, when you import data into Hive using Sqoop, don't use --compression-codec "SnappyCodec", because Snappy compression is not supported by the PySpark HiveContext.

sudo ln -s /etc/hive/conf/hive-site.xml /etc/spark/conf/hive-site.xml
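To illustrate the Sqoop advice above, a hypothetical import command that avoids the Snappy codec might look like this (the JDBC URL, username, and table name are placeholders, not values from this thread):

```shell
# Hypothetical Sqoop import into Hive without --compression-codec SnappyCodec.
# Connection string, credentials, and table name are placeholders.
sqoop import \
  --connect jdbc:mysql://localhost/retail_db \
  --username retail_user -P \
  --table departments \
  --hive-import \
  --hive-table departments
```

Leaving the --compression-codec flag off entirely stores the data uncompressed, which the PySpark HiveContext can read without issue.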