PySpark itversity traceback error - is PySpark stable enough to process all common data formats?

pyspark

#1

path="/home/shantanil/data/retail_db_json/customers/part-r-00000-70554560-527b-44f6-9e80-4e2031af5994"
df = sqlContext.read.json(path)
18/02/18 08:02:42 INFO JSONRelation: Listing hdfs://nn01.itversity.com:8020/home/shantanil/data/retail_db_json/customers/part-r-00000-70554560-527b-44f6-9e80-4e2031af5994 on driver
18/02/18 08:02:42 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 339.4 KB, free 1075.0 KB)
18/02/18 08:02:42 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 28.4 KB, free 1103.4 KB)
18/02/18 08:02:42 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on localhost:41623 (size: 28.4 KB, free: 511.0 MB)
18/02/18 08:02:42 INFO SparkContext: Created broadcast 2 from json at NativeMethodAccessorImpl.java:-2
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/hdp/2.5.0.0-1245/spark/python/pyspark/sql/readwriter.py", line 176, in json
return self._df(self._jreader.json(path))
File "/usr/hdp/2.5.0.0-1245/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 813, in __call__
File "/usr/hdp/2.5.0.0-1245/spark/python/pyspark/sql/utils.py", line 45, in deco
return f(*a, **kw)
File "/usr/hdp/2.5.0.0-1245/spark/python/lib/py4j-0.9-src.zip/py4j/protocol.py", line 308, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o133.json.
: java.io.IOException: No input paths specified in job
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:202)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:199)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:242)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:240)
at scala.Option.getOrElse(Option.scala:120)
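For context on why `java.io.IOException: No input paths specified in job` appears here: the log line `Listing hdfs://nn01.itversity.com:8020/home/shantanil/...` shows that the unqualified path was resolved against the cluster's default filesystem (HDFS), not the local disk where the file actually lives, so Spark found no input files. The sketch below is illustrative only, mimicking (roughly) how an unqualified path gets resolved against `fs.defaultFS`; the `qualify_path` helper and the default-FS URI are assumptions for demonstration, not part of Spark's API.

```python
from urllib.parse import urlparse

def qualify_path(path, default_fs="hdfs://nn01.itversity.com:8020"):
    """Illustrative helper: a path with no scheme is resolved against the
    default filesystem (HDFS on this cluster), so a file that exists only
    on the local disk will not be found there."""
    if urlparse(path).scheme:
        # Already qualified (e.g. file:// or hdfs://) - used as-is.
        return path
    # Unqualified path: resolved against fs.defaultFS, i.e. HDFS here.
    return default_fs + path

# An unqualified local path silently becomes an HDFS path:
print(qualify_path("/home/shantanil/data/retail_db_json/customers"))
# An explicit file:// scheme keeps it on the local filesystem:
print(qualify_path("file:///home/shantanil/data/retail_db_json/customers"))
```

So the usual fixes are either to read the local file explicitly, e.g. `sqlContext.read.json("file://" + path)`, or to copy the data into HDFS first (e.g. with `hadoop fs -put`) and read the HDFS path. This is unrelated to PySpark's stability with JSON as a format.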