Is it possible to read an HDFS file using Python?

Can someone tell me how I can open and process files stored in HDFS from inside a Python program?

You can refer to the following Python Hadoop library (Pydoop):

http://crs4.github.io/pydoop/api_docs/hdfs_api.html

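Sample code (this one shells out to the hadoop command-line client instead of using Pydoop):

import subprocess

# Stream the file through the hadoop CLI; requires `hadoop` on the PATH.
cat = subprocess.Popen(["hadoop", "fs", "-cat", "/path/to/myfile"],
                       stdout=subprocess.PIPE, text=True)
for line in cat.stdout:
    print(line, end="")  # each line already ends with a newline
cat.wait()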
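
If you need this in more than one place, the same idea can be wrapped in a small generator. This is only a rough sketch: `iter_hdfs_lines` and its `cat_cmd` parameter are names I made up for illustration, not part of Pydoop or Hadoop, and it assumes Python 3.7+ and that the `hadoop` client is on the PATH.

```python
import subprocess


def iter_hdfs_lines(path, cat_cmd=("hadoop", "fs", "-cat")):
    """Yield the lines of `path` as printed by `cat_cmd` (hypothetical helper).

    `cat_cmd` defaults to the hadoop CLI; it is a parameter only so the
    command can be swapped out, e.g. for `hdfs dfs -cat`.
    """
    proc = subprocess.Popen(list(cat_cmd) + [path],
                            stdout=subprocess.PIPE, text=True)
    try:
        for line in proc.stdout:
            yield line.rstrip("\n")  # strip the trailing newline only
    finally:
        # Always release the pipe and reap the child process.
        proc.stdout.close()
        returncode = proc.wait()
    # Reached only when the output was read to the end.
    if returncode != 0:
        raise subprocess.CalledProcessError(returncode, proc.args)
```

You would then loop with `for line in iter_hdfs_lines("/path/to/myfile"):` and process each line. A non-zero exit from the command (for example, a missing file) surfaces as `subprocess.CalledProcessError` once the output has been consumed.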
