CCA175 - Are SQLContext and HiveContext integrated in the Cloudera environment?


For CCA175, can the sqlContext val be used for both the native Spark SQL context and the Hive context without any changes? Or do I need to create a separate hiveContext val? Please assist.

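You can use the name sqlContext for either one, but make sure it points to the context you actually need:
from pyspark.sql import SQLContext, HiveContext
sqlContext = SQLContext(sc)    # plain Spark SQL
sqlContext = HiveContext(sc)   # Spark SQL with access to Hive tables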
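
If you are not sure which kind of context a name such as sqlContext currently points to, a small check like the one below may help. This is only a sketch: context_kind is a hypothetical helper, not part of PySpark, and it assumes the Spark 1.x layout where HiveContext is a subclass of SQLContext. As far as I know the Spark 1.x pyspark shell already defines sqlContext for you, and it may be a HiveContext when Hive support is available, so it is worth checking before you overwrite it.

```python
def context_kind(ctx):
    """Return 'hive' if ctx is a HiveContext (or a subclass of it), else 'sql'.

    Hypothetical helper: it looks only at class names, so it does not need
    to import pyspark itself.
    """
    class_names = [cls.__name__ for cls in type(ctx).__mro__]
    return 'hive' if 'HiveContext' in class_names else 'sql'
```

In the pyspark shell you could call context_kind(sqlContext) to see which one you have.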

Hi Rahul,

In which scenarios would we use HiveContext? Please let me know the difference between the two.


I will explain as far as I know.
Hive Context: HiveContext is used whenever you want to access Hive tables.
For example, say you have an employee table in Hive. To access it from Spark you can do the following:
sqlContext = HiveContext(sc)   # or: hiveContext = HiveContext(sc)

sqlContext.sql('select * from employee').show()
You can then perform whatever operations you need.

SQL Context: If you have a file in HDFS and want to run validations on it with SQL, you can use Spark SQL through SQLContext.
For example: /user/horton/testing.txt
We have a file named testing.txt at the above path. We need to read it, register it as a temp table, and run validations on it.
from pyspark.sql import SQLContext, Row
sqlContext = SQLContext(sc)

path = '/user/horton/testing.txt'
data = sc.textFile(path).map(lambda x: x.split(',')).map(lambda y: Row(age=int(y[0])))
Convert the above to a DataFrame and register it as a temp table:
data_f = sqlContext.createDataFrame(data)
data_f.registerTempTable('new_table')
sqlContext.sql('select * from new_table').show()
In this case you won't have any table in Hive; you read the data, create a temp table, and run SQL operations on it.
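
To make the "perform validations" part more concrete, here is a rough sketch of the same flow wrapped in functions. It assumes the Spark 1.x API (SQLContext, registerTempTable) and a comma-separated file whose first field is the age. The names parse_age and find_negative_ages and the rule "age must not be negative" are my own assumptions for illustration, not something from the exam or from Spark.

```python
def parse_age(line):
    """Return the first comma-separated field as an int, or None if it is not a number."""
    try:
        return int(line.split(',')[0])
    except ValueError:
        return None


def find_negative_ages(sc, path):
    """Load the file into a temp table and return a DataFrame of rows with age < 0."""
    # Spark imports stay inside the function so parse_age can be used without Spark.
    from pyspark.sql import SQLContext, Row

    sqlContext = SQLContext(sc)
    # Drop lines whose first field could not be parsed as an integer.
    ages = sc.textFile(path).map(parse_age).filter(lambda a: a is not None)
    data_f = sqlContext.createDataFrame(ages.map(lambda a: Row(age=a)))
    # The temp table lives only in this SQLContext; nothing is created in Hive.
    data_f.registerTempTable('new_table')
    # Example validation rule (an assumption): ages must not be negative.
    return sqlContext.sql('select * from new_table where age < 0')
```

From the pyspark shell you could then run find_negative_ages(sc, '/user/horton/testing.txt').show() to see the offending rows.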

I am sorry if I am wrong anywhere.

Thanks a lot, Rahul, for the clear explanation.