Flume - What happens when HDFS is down?

In a Flume configuration, say for example my source is a near-real-time JMS source and my sink is HDFS, and for some reason my Hadoop/HDFS sink crashes and does not recover for a day. Where does my real-time data remain during all this while?

Where is it saved?

When the sink fails (HDFS in this case), events accumulate in the channel; once the channel fills up, Flume starts throwing exceptions and can no longer accept new events from the source. A durable channel such as the File Channel keeps buffered events on local disk, so they survive an agent restart, but a full channel still means losing data at the source. Hence a good practice is to have a mechanism to address these kinds of failures. Generally, a Failover Sink Processor is used: we configure a failover sink so that data is not lost while the primary sink is down.

a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
# Send events to the highest-priority sink that is currently alive
a1.sinkgroups.g1.processor.type = failover
# Higher number = higher priority, so k2 is tried first and k1 is the fallback
a1.sinkgroups.g1.processor.priority.k1 = 5
a1.sinkgroups.g1.processor.priority.k2 = 10
# Maximum back-off (in milliseconds) before retrying a failed sink
a1.sinkgroups.g1.processor.maxpenalty = 10000
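
For completeness, here is one way the channel and the two sinks referenced above could be defined. This is only a sketch: the names a1, k1, k2 come from the snippet above, while the File Channel, the file_roll fallback sink, and all paths are assumptions for illustration. A File Channel persists events to local disk, which is what keeps the data safe while HDFS is down.

# Durable channel: events are checkpointed to local disk, so they
# survive an agent restart while the HDFS sink is unavailable
a1.channels = c1
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /var/flume/checkpoint
a1.channels.c1.dataDir = /var/flume/data

a1.sinks = k1 k2

# k2: primary HDFS sink (priority 10 in the failover group)
a1.sinks.k2.type = hdfs
a1.sinks.k2.channel = c1
a1.sinks.k2.hdfs.path = /flume/events/%Y-%m-%d
a1.sinks.k2.hdfs.fileType = DataStream
a1.sinks.k2.hdfs.useLocalTimeStamp = true

# k1: fallback sink (priority 5) that rolls events to local files
# while HDFS is unreachable
a1.sinks.k1.type = file_roll
a1.sinks.k1.channel = c1
a1.sinks.k1.sink.directory = /var/flume/fallback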
