I tried this:
from pyspark.sql import HiveContext
sqlContext = HiveContext(sc)
depts = sqlContext.sql("select * from departments")
and I got this:
17/01/25 12:28:50 INFO hive.HiveContext: Initializing execution hive, version 1.1.0
17/01/25 12:28:50 INFO client.ClientWrapper: Inspected Hadoop version: 2.6.0-cdh5.8.0
17/01/25 12:28:50 INFO client.ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.6.0-cdh5.8.0
17/01/25 12:28:50 INFO metastore.HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
17/01/25 12:28:50 INFO metastore.ObjectStore: ObjectStore, initialize called
17/01/25 12:28:51 INFO DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
17/01/25 12:28:51 INFO DataNucleus.Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
17/01/25 12:28:54 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
17/01/25 12:28:56 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
17/01/25 12:28:56 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
17/01/25 12:28:58 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
17/01/25 12:28:58 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
17/01/25 12:28:58 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
17/01/25 12:28:58 INFO metastore.ObjectStore: Initialized ObjectStore
17/01/25 12:28:59 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.1.0
17/01/25 12:28:59 WARN metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException
17/01/25 12:28:59 INFO metastore.HiveMetaStore: Added admin role in metastore
17/01/25 12:28:59 INFO metastore.HiveMetaStore: Added public role in metastore
17/01/25 12:28:59 INFO metastore.HiveMetaStore: No user is added in admin role, since config is empty
17/01/25 12:28:59 INFO log.PerfLogger:
17/01/25 12:28:59 INFO metastore.HiveMetaStore: 0: get_all_functions
17/01/25 12:28:59 INFO HiveMetaStore.audit: ugi=cloudera ip=unknown-ip-addr cmd=get_all_functions
17/01/25 12:28:59 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
17/01/25 12:28:59 INFO log.PerfLogger: </PERFLOG method=get_all_functions start=1485376139671 end=1485376139821 duration=150 from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=0 retryCount=0 error=false>
17/01/25 12:29:00 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
17/01/25 12:29:00 INFO session.SessionState: Created HDFS directory: file:/tmp/spark-8ac42377-1661-434d-88a4-e819106e3a92/scratch/cloudera
17/01/25 12:29:00 INFO session.SessionState: Created local directory: /tmp/f167374a-8fb7-484a-b360-66c31a15554f_resources
17/01/25 12:29:00 INFO session.SessionState: Created HDFS directory: file:/tmp/spark-8ac42377-1661-434d-88a4-e819106e3a92/scratch/cloudera/f167374a-8fb7-484a-b360-66c31a15554f
17/01/25 12:29:00 INFO session.SessionState: Created local directory: /tmp/cloudera/f167374a-8fb7-484a-b360-66c31a15554f
17/01/25 12:29:00 INFO session.SessionState: Created HDFS directory: file:/tmp/spark-8ac42377-1661-434d-88a4-e819106e3a92/scratch/cloudera/f167374a-8fb7-484a-b360-66c31a15554f/_tmp_space.db
17/01/25 12:29:00 INFO session.SessionState: No Tez session required at this point. hive.execution.engine=mr.
17/01/25 12:29:01 INFO hive.HiveContext: default warehouse location is /user/hive/warehouse
17/01/25 12:29:01 INFO hive.HiveContext: Initializing metastore client version 1.1.0 using Spark classes.
17/01/25 12:29:01 INFO client.ClientWrapper: Inspected Hadoop version: 2.6.0-cdh5.8.0
17/01/25 12:29:01 INFO client.ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.6.0-cdh5.8.0
17/01/25 12:29:03 INFO metastore.HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
17/01/25 12:29:03 INFO metastore.ObjectStore: ObjectStore, initialize called
17/01/25 12:29:04 INFO DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
17/01/25 12:29:04 INFO DataNucleus.Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
17/01/25 12:29:07 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
17/01/25 12:29:10 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
17/01/25 12:29:10 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
17/01/25 12:29:11 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
17/01/25 12:29:11 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
17/01/25 12:29:11 INFO DataNucleus.Query: Reading in results for query "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection used is closing
17/01/25 12:29:11 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
17/01/25 12:29:11 INFO metastore.ObjectStore: Initialized ObjectStore
17/01/25 12:29:11 INFO metastore.HiveMetaStore: Added admin role in metastore
17/01/25 12:29:11 INFO metastore.HiveMetaStore: Added public role in metastore
17/01/25 12:29:11 INFO metastore.HiveMetaStore: No user is added in admin role, since config is empty
17/01/25 12:29:12 INFO log.PerfLogger:
17/01/25 12:29:12 INFO metastore.HiveMetaStore: 0: get_all_functions
17/01/25 12:29:12 INFO HiveMetaStore.audit: ugi=cloudera ip=unknown-ip-addr cmd=get_all_functions
17/01/25 12:29:12 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
17/01/25 12:29:12 INFO log.PerfLogger: </PERFLOG method=get_all_functions start=1485376152264 end=1485376152392 duration=128 from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=0 retryCount=0 error=false>
17/01/25 12:29:12 INFO session.SessionState: Created local directory: /tmp/ca28d027-ea21-4a3d-8428-d86d27dae60f_resources
17/01/25 12:29:12 INFO session.SessionState: Created HDFS directory: /tmp/hive/cloudera/ca28d027-ea21-4a3d-8428-d86d27dae60f
17/01/25 12:29:12 INFO session.SessionState: Created local directory: /tmp/cloudera/ca28d027-ea21-4a3d-8428-d86d27dae60f
17/01/25 12:29:12 INFO session.SessionState: Created HDFS directory: /tmp/hive/cloudera/ca28d027-ea21-4a3d-8428-d86d27dae60f/_tmp_space.db
17/01/25 12:29:12 INFO session.SessionState: No Tez session required at this point. hive.execution.engine=mr.
17/01/25 12:29:12 INFO log.PerfLogger:
17/01/25 12:29:12 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=departments
17/01/25 12:29:12 INFO HiveMetaStore.audit: ugi=cloudera ip=unknown-ip-addr cmd=get_table : db=default tbl=departments
17/01/25 12:29:12 INFO log.PerfLogger: </PERFLOG method=get_table start=1485376152779 end=1485376152835 duration=56 from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=0 retryCount=-1 error=true>
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/spark/python/pyspark/sql/context.py", line 580, in sql
    return DataFrame(self._ssql_ctx.sql(sqlQuery), self)
  File "/usr/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 813, in __call__
  File "/usr/lib/spark/python/pyspark/sql/utils.py", line 51, in deco
    raise AnalysisException(s.split(': ', 1)[1], stackTrace)
pyspark.sql.utils.AnalysisException: u'Table not found: departments; line 1 pos 14'
I ran the same query again and got the same error:
depts = sqlContext.sql("select * from departments")
17/01/25 12:31:32 INFO log.PerfLogger:
17/01/25 12:31:32 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=departments
17/01/25 12:31:32 INFO HiveMetaStore.audit: ugi=cloudera ip=unknown-ip-addr cmd=get_table : db=default tbl=departments
17/01/25 12:31:32 INFO log.PerfLogger: </PERFLOG method=get_table start=1485376292056 end=1485376292058 duration=2 from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=0 retryCount=-1 error=true>
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/spark/python/pyspark/sql/context.py", line 580, in sql
    return DataFrame(self._ssql_ctx.sql(sqlQuery), self)
  File "/usr/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 813, in __call__
  File "/usr/lib/spark/python/pyspark/sql/utils.py", line 51, in deco
    raise AnalysisException(s.split(': ', 1)[1], stackTrace)
pyspark.sql.utils.AnalysisException: u'Table not found: departments; line 1 pos 14'
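To narrow this down on my side, I could also list what the HiveContext can actually see. This is only a sketch using the standard PySpark 1.x APIs (tableNames() on SQLContext, plus SHOW DATABASES / SHOW TABLES through Hive); I have not yet confirmed the output on this setup:

# Sketch: list the databases and tables this HiveContext can see.
# If "departments" does not show up here, the context is probably not
# talking to the real Hive metastore (the log above says the underlying
# DB is DERBY, i.e. a local embedded metastore).
print(sqlContext.tableNames("default"))
sqlContext.sql("show databases").show()
sqlContext.sql("show tables").show()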
Did you set up 1.2 on CDH 5.8, or did you use 1.6?
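For reference, I can also report the exact Spark version from the same shell and check whether a hive-site.xml is visible to Spark. The /etc/spark/conf path below is an assumption based on a typical CDH install, not something I have verified:

import os

# Spark version of the running shell
print(sc.version)

# Assumed CDH location for Spark's Hive config; if no hive-site.xml is on
# Spark's classpath, the HiveContext typically falls back to a local Derby metastore.
print(os.path.exists("/etc/spark/conf/hive-site.xml"))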