Not able to get the output using the Map function with RDD



Please see the attached screenshot. I have been facing this error for two days straight now. I am not sure whether the problem is with my commands or with the environment. Any help is greatly appreciated.

The error says “org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://




Can you send the full code? It would be helpful in resolving your issue.

Sunil Abhishek


Hi Sunil

orderItems = sc.textFile("public/retail_db/order_items")
orderItemsMap = orderItems.map(lambda o: (int(o.split(",")[1]), float(o.split(",")[4])))
for i in orderItemsMap.take(10): print(i)

These are the only three lines of code. I see the error after running the **_for loop_**.
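
One likely reason the error only shows up at the for loop: Spark evaluates transformations such as `map` lazily, so the input path is not actually read until an action like `take` runs. The sketch below is only a plain-Python analogy, not Spark code. The `read_lines` and `parse` helpers and the non-existent path are hypothetical names made up for illustration.

```python
from itertools import islice

def read_lines(path):
    # Generator: the file is not opened until iteration actually starts.
    with open(path) as f:
        for line in f:
            yield line.rstrip("\n")

def parse(o):
    # Same idea as the map step: (second field as int, fifth field as float).
    fields = o.split(",")
    return (int(fields[1]), float(fields[4]))

# Hypothetical path that does not exist on the local disk.
lines = read_lines("no/such/dir/order_items")  # no error yet
pairs = map(parse, lines)                      # still no error: map is lazy

error_seen = None
try:
    # The first attempt to pull a record is what touches the path.
    for i in islice(pairs, 10):
        print(i)
except FileNotFoundError as e:
    error_seen = type(e).__name__
    print("error surfaced during iteration:", error_seen)
```

In the same way, a bad path passed to `sc.textFile` can go unnoticed until the first action, which is why the message points at the loop rather than at the line that sets the path.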


Hi Sunil

Thanks for helping. I have figured out the problem: it was the path of the dataset I was using. I was missing the leading forward slash, so the path was treated as relative (HDFS resolves relative paths against the user's home directory) and the system was not searching from the root directory.
Incorrect path: orderItems = sc.textFile("public/retail_db/order_items")
Correct path: orderItems = sc.textFile("/public/retail_db/order_items")
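
To illustrate the difference between the two paths, here is a minimal sketch that mimics how a relative HDFS path is resolved. It assumes the usual default home directory of `/user/<username>`. The `resolve_hdfs_path` helper and the user name `alice` are hypothetical and are not part of Spark or Hadoop.

```python
import posixpath

def resolve_hdfs_path(path, user):
    """Hypothetical helper: mimic resolution of a path against an HDFS home directory."""
    home = posixpath.join("/user", user)  # assumed default home directory
    # posixpath.join discards `home` when `path` is already absolute.
    return posixpath.normpath(posixpath.join(home, path))

# Without the leading slash, the lookup happens under the home directory.
print(resolve_hdfs_path("public/retail_db/order_items", "alice"))
# With the leading slash, the lookup starts at the root.
print(resolve_hdfs_path("/public/retail_db/order_items", "alice"))
```

Under these assumptions, the first call yields a path under `/user/alice`, which would explain the "Input path does not exist" message if the dataset only lives under `/public`.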