Error on Pyspark

pyspark
scala

#1

      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.0.0.2.5.0.0-1245
      /_/

Using Python version 2.7.5 (default, Aug 4 2017 00:39:18)
SparkSession available as 'spark'.

autoDf1 = sqlContext.read.csv("data/auto-data.csv", header=True)
18/05/17 02:46:35 WARN RetryInvocationHandler: Exception while invoking ClientNamenodeProtocolTranslatorPB.getFileInfo over null. Not retrying because try once and fail.
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=rkarra777, access=EXECUTE, inode="/user/rkarra777/data/auto-data.csv/_spark_metadata":rkarra777:hdfs:-rw-r--r--
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:259)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:205)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)


#2

I am having access issues.

autoDf1 = sqlContext.read.csv("data/auto-data.csv", header=True)
18/05/17 02:46:35 WARN RetryInvocationHandler: Exception while invoking ClientNamenodeProtocolTranslatorPB.getFileInfo over null. Not retrying because try once and fail.
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=rkarra777, access=EXECUTE, inode="/user/rkarra777/data/auto-data.csv/_spark_metadata":rkarra777:hdfs:-rw-r--r--


#3

@Ravinder_Karra

Launch pyspark using the command below:
pyspark --conf "spark.ui.port=54431" --master yarn --packages com.databricks:spark-csv_2.10:1.4.0

Then read the file using the code below:

df = sqlContext.read.format('com.databricks.spark.csv').options(header='true', inferSchema='true').load('data/auto-data.csv')
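
Note that the original AccessControlException is an HDFS permission problem, not a CSV-reader problem: access=EXECUTE was denied while traversing /user/rkarra777/data/auto-data.csv, whose mode (-rw-r--r--) lacks the execute bit needed to enter a directory. One possible way to check and fix this is sketched below; the path comes from the error message, but the mode 755 is an assumption that should be adjusted to your cluster's policy:

```shell
# Inspect current ownership and permissions on the path from the error message
hdfs dfs -ls /user/rkarra777/data

# Directories need the execute bit for traversal; grant it recursively
# (755 is an assumed mode -- adapt to your cluster's security policy)
hdfs dfs -chmod -R 755 /user/rkarra777/data/auto-data.csv
```

After fixing the permissions, the original `sqlContext.read.csv(...)` call should be able to reach the file.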