Input path does not exist: hdfs://



Traceback (most recent call last):
  File "", line 1, in
  File "/usr/hdp/", line 1267, in take
    totalParts = self.getNumPartitions()
  File "/usr/hdp/", line 356, in getNumPartitions
    return self._jrdd.partitions().size()
  File "/usr/hdp/", line 813, in call
  File "/usr/hdp/", line 45, in deco
    return f(*a, **kw)
  File "/usr/hdp/", line 308, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o228.partitions.
: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://

Directory location…
[ashoo1234@gw03 retail_db]$ ls -ltr /data/retail_db/orders/part-00000
-rw-r--r-- 1 root root 2999944 Feb 20 2017 /data/retail_db/orders/part-00000


This path is on the local filesystem. In a cluster environment, Spark needs to read data that is on HDFS. Please use the path below, where the data resides in HDFS.

hadoop fs -ls /public/retail_db/orders/
Found 1 items
-rw-r--r--   3 hdfs hdfs    2999944 2016-12-19 03:52 /public/retail_db/orders/part-00000
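For reference, a minimal sketch of reading the HDFS copy from the pyspark shell (this assumes the shell's built-in `sc` SparkContext; the path is the one from the listing above):

```python
# Runs inside the pyspark shell, where `sc` (SparkContext) already exists.
# A path without a scheme resolves against fs.defaultFS, which on a
# cluster is HDFS — so this points at /public/retail_db/orders in HDFS.
orders = sc.textFile("/public/retail_db/orders")

# The earlier InvalidInputException was raised because the local-only
# path /data/retail_db/orders does not exist in HDFS.
for line in orders.take(5):
    print(line)
```

Equivalently, the fully qualified form `hdfs:///public/retail_db/orders` can be used to make the filesystem explicit.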


Thanks, this worked once the HDFS path was provided.

Could you please also tell me how to set the logging level to ERROR instead of INFO?


It is currently set to "INFO, console". For Spark applications, the default root logger is "INFO, console", which logs all messages at level INFO and above to the console's stderr.
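To reduce the noise for a single session, a sketch using the standard SparkContext API (again assuming the pyspark shell's built-in `sc`):

```python
# Inside the pyspark shell: override the log level for this session only.
# Valid levels include ALL, DEBUG, INFO, WARN, ERROR, FATAL, OFF.
sc.setLogLevel("ERROR")
```

To make the change permanent for all applications, the usual approach is to edit `conf/log4j.properties` under the Spark configuration directory and change the root category line to `log4j.rootCategory=ERROR, console`.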