Using Flume in bigdata-labs

Hi,

Can we practice the telnet-Flume tasks in bigdata-labs?

@mangleeswaran - Are you facing any challenges?

@gnanaprakasam : Yes. I am trying to sink into HDFS using
a1.sinks.k1.hdfs.path = hdfs://gw01.itversity.com/user/mangleeswaran/flume

I am able to start my socket server on port 44444.
I opened another console and typed telnet localhost 44444.
Now, in a third console, I am trying to cat /user/mangleeswaran/flume, but it says there is no such directory.

Am I missing anything here?

@mangleeswaran - Could you please paste your Flume configuration?

Hi @gnanaprakasam,

Please find the code below

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://gw01.itversity.com/user/mangleeswaran/flume
a1.sinks.k1.hdfs.filePrefix = netcat

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
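For reference, this is roughly how such an agent is started using the standard flume-ng CLI; the file name example.conf and the --conf directory are assumptions, so adjust them for your lab setup:

```shell
# Launch the Flume agent named a1 with the configuration above.
# example.conf and the --conf directory are assumed names for this sketch.
flume-ng agent \
  --name a1 \
  --conf /etc/flume/conf \
  --conf-file example.conf \
  -Dflume.root.logger=INFO,console
```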

@mangleeswaran - Could you try one of the HDFS paths below?
hdfs://nn01.itversity.com:8020/mangleeswaran/flume

hdfs://nn01.itversity.com:8020/user/mangleeswaran/flume
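Before retrying, it may also be worth confirming that the target directory exists in HDFS and checking it with the hadoop client rather than the local filesystem (a sketch, assuming your user can write under /user/mangleeswaran):

```shell
# Create the target directory in HDFS if it is not there yet
hadoop fs -mkdir -p /user/mangleeswaran/flume

# List it through HDFS; a plain `cat /user/...` looks at the local
# filesystem, which is why it reports "no such directory"
hadoop fs -ls /user/mangleeswaran/flume
```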

Hi Mangleeswaran,

I am having a similar problem. I changed the URL as mentioned by @gnanaprakasam, but it does not create files prefixed with netcat in the HDFS directory.

Can you tell me how you resolved it?

Regards, Raj

Hi, @gnanaprakasam

I am getting a different error while creating the socket server: 'address already in use'. I think it's related to the port. Am I right?

I am getting the same error. Is it resolved on your side?

Regards

@mangleeswaran @raj_sharma - I tried to recreate the issue; after changing the port number it works fine. Could you please check?

@gnanaprakasam : port number 44444?

@mangleeswaran - Since you are facing a problem with 44444, you can try something else, like 55555.
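A quick way to see whether a port is already taken before picking another one (the netstat flags below are the common Linux ones; ss -tnlp is the newer equivalent):

```shell
# Show any listener already bound to 44444; if nothing matches,
# report that the port looks free
netstat -tnlp 2>/dev/null | grep 44444 || echo "port 44444 looks free"
```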

Hi @gnanaprakasam, @itversity, @mangleeswaran ,

I am able to use HDFS as a sink and create files prefixed with netcat in HDFS.

The next task is to integrate streaming data generated by a log generator that simulates a web server producing logs.

In this task, the following steps are performed:

  1. Run the start_logs command. It will generate logs simulating web server logs.
  2. Validate by running the tail_logs command. You can see the streaming logs being generated.
  3. Log file name: /opt/gen_logs/logs/access.log
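If you just want to watch the file directly, the tail_logs wrapper presumably amounts to something like the following (an assumption on my part; the path comes from the step above):

```shell
# Follow the simulated web server log as new entries are appended
tail -f /opt/gen_logs/logs/access.log
```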

I am unable to run the start_logs command as it requires a sudo password, which I do not have!

Any idea what the sudo password is, or how to resolve this?

Regards, Raj

@raj_sharma : you used port 55555?

Hi,
I used port 40404 and it worked fine, as mentioned in the post above. I am facing a problem with the start_logs command, which needs to be triggered using sudo, but I do not have the sudo password. Were you able to run the start_logs command? How?

Regards, Raj

No, I don't think we have been given sudo access.

Hi @raj_sharma,

I am using the Big Data lab for practice.
Can you tell me in which path the files prefixed with netcat would be saved?

Hi Janaki_K,

The files will be created in the HDFS path given in your configuration, and the file names will start with netcat. This will happen if your configuration looks like the one below:

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://gw01.itversity.com/user/Janaki_K/flume
a1.sinks.k1.hdfs.filePrefix = netcat

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

After the job completes, list the files in the path with
hadoop fs -ls /user/Janaki_K/flume
and you will see the files starting with netcat.

Hope it's clear.
Regards, Raj

I am trying to run the start_logs script to generate logs in /opt/gen_logs/logs/access.log, but it throws 'permission denied'.


I see that access.log was created by user pratyush04, and others do not have permissions on it. I was able to run start_logs yesterday. Could the system admin please fix the permissions on this?
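To give the admin the exact details, it may help to capture the current owner and mode bits first (a sketch; the chmod line is what an admin-side fix might look like, not something we can run ourselves without sudo):

```shell
# Show owner, group, and permission bits on the shared log file
ls -l /opt/gen_logs/logs/access.log

# A possible admin-side fix so every lab user can write to it:
# sudo chmod 666 /opt/gen_logs/logs/access.log
```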