Hello, I am practicing using Flume to receive logs, but I am running into an issue. I would greatly appreciate any help.

The conf:
[paslechoix@gw01 flume]$ cat flume23.conf
#Define source , sink , channel and agent.
agent1.sources = source1
agent1.sinks = sink1
agent1.channels = channel1

# Describe/configure source1

agent1.sources.source1.type = exec
agent1.sources.source1.command = tail -F logs/access.log
#Define interceptors

# Describe sink1 = memory-channel
agent1.sinks.sink1.type = hdfs
agent1.sinks.sink1.hdfs.path = /home/paslechoix/flume/received/%y/%m/%d/%H%M
agent1.sinks.sink1.hdfs.fileType = DataStream

# Now we need to define channel1 property.

agent1.channels.channel1.type = memory
agent1.channels.channel1.capacity = 1000
agent1.channels.channel1.transactionCapacity = 100

# Bind the source and sink to the channel

agent1.sources.source1.channels = channel1
agent1.sinks.sink1.channel = channel1

Because the original log had grown far too big, I copied /opt/gen_logs to my own folder, /home/paslechoix. As you can see from the conf, I specified the subfolder “received” as the place where the received files should be saved, and I set this folder’s permissions to 777.

[paslechoix@gw01 flume]$ ll /home/paslechoix/flume
total 40
drwxr-xr-x 2 paslechoix students 4096 Feb 4 07:01 data
-rw-r--r-- 1 paslechoix students 944 Feb 4 07:13 flume23.conf
drwxr-xr-x 2 paslechoix students 4096 Feb 4 07:01 lib
drwxr-xr-x 2 paslechoix students 4096 Feb 4 07:17 logs
-rw-r--r-- 1 paslechoix students 777 Feb 3 21:28 paslechoix_agent2.conf
-rw-r--r-- 1 paslechoix students 434 Feb 3 20:11 paslechoix_agent.conf
drwxrwxrwx 2 paslechoix students 4096 Feb 4 07:13 received
-rwxr-xr-x 1 paslechoix students 75 Feb 4 07:05
-rwxr-xr-x 1 paslechoix students 131 Feb 4 07:05
-rwxr-xr-x 1 paslechoix students 51 Feb 4 07:05

Why am I still seeing this error in the flume-ng output?

Permission denied: user=paslechoix, access=WRITE, inode="/home/paslechoix/flume/received/18/02/04/0717/FlumeData.1517746647434.tmp":hdfs:hdfs:drwxrwxr-x

Thank you very much.


By the way, the Flume agent keeps showing the error even though I ran the stop-log command and confirmed that no Python process is still active.



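There is an issue in the sink configuration in your flume.conf.

You set the sink type to hdfs, so hdfs.path is treated as an HDFS path, not a local one. /home/paslechoix is your home directory on the local file system; the 777 permission you set on the local “received” folder has no effect in HDFS, where you have no write access under /home/.
In HDFS, a user's home directory is normally under /user/.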

So the correct path would be:
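
For illustration, assuming your HDFS home directory is /user/paslechoix (an assumption; you can check with `hdfs dfs -ls /user`), the sink section could look something like this:

```properties
# Sketch only: /user/paslechoix is an assumed HDFS home directory.
agent1.sinks.sink1.type = hdfs
agent1.sinks.sink1.hdfs.path = /user/paslechoix/flume/received/%y/%m/%d/%H%M
agent1.sinks.sink1.hdfs.fileType = DataStream
# The date escapes need a timestamp on each event, either from a timestamp
# interceptor or by letting the sink use the agent's local time:
agent1.sinks.sink1.hdfs.useLocalTimeStamp = true
```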



Thank you very much, Ravi, that was the fix! What if I need to write the logs to a local file system location like /home/paslechoix/flume/received/%y/%m/%d/%H%M? (This may sound silly, but I am asking for learning purposes.)



You can use ‘File Roll Sink’:
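
A minimal sketch of what that could look like, reusing the agent, sink, and channel names from the question. The directory and roll interval are illustrative values. As far as I know, file_roll writes into one fixed local directory and does not expand date escapes such as %y/%m/%d, so the path here is flat:

```properties
# Sketch only: writes events to the local file system instead of HDFS.
agent1.sinks.sink1.type = file_roll
# Local directory; assumed to exist already and to be writable by the agent user.
agent1.sinks.sink1.sink.directory = /home/paslechoix/flume/received
# Start a new file every 30 seconds (0 disables time-based rolling).
agent1.sinks.sink1.sink.rollInterval = 30
agent1.sinks.sink1.channel = channel1
```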