Number of Mappers

Let's say I have a 5 TB file and the block size is 128 MB. Since 1 TB = 1048576 MB, that is 8192 blocks per TB, so number of mappers = 5 * 1048576 / 128 = 40960.
So my question is: can that many mappers run at once, or will only some of them run while the rest wait in the queue in a pending state?
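
To double-check the arithmetic in the question, here is a minimal sketch, assuming one mapper per block-sized input split and binary units (1 TB = 1048576 MB). The function name `num_mappers` is made up for illustration and is not a Hadoop API.

```python
MB_PER_TB = 1024 * 1024  # assumption: binary units, 1 TB = 1,048,576 MB


def num_mappers(file_size_mb: int, split_size_mb: int = 128) -> int:
    """Estimate map tasks as one per split, rounding up for a partial last split."""
    if split_size_mb <= 0:
        raise ValueError("split_size_mb must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return (file_size_mb + split_size_mb - 1) // split_size_mb


if __name__ == "__main__":
    print("1 TB:", num_mappers(1 * MB_PER_TB))
    print("5 TB:", num_mappers(5 * MB_PER_TB))
```

The real count can differ if the input is made of many small files or if the split size is configured to be different from the block size.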

Mappers are executed according to the number of mapper slots available in the cluster. Once running mappers finish and free their slots, the pending mappers are executed.
For example, with 256 mapper slots, 8192 mappers take 32 iterations over those slots: 256*32 = 8192. Note that 8192 is the mapper count for 1 TB at a 128 MB block size; a 5 TB file gives 40960 mappers, which would take 160 iterations.
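
The iteration count above can be sketched as a ceiling division. This is an idealized model that assumes every mapper takes about the same time and that all slots are free for this one job; `waves` is a hypothetical helper name, not something Hadoop provides.

```python
def waves(num_tasks: int, slots: int) -> int:
    """Number of rounds needed to run num_tasks when at most `slots` run at once."""
    if slots <= 0:
        raise ValueError("slots must be positive")
    # Round up: a final partial round still counts as one round.
    return (num_tasks + slots - 1) // slots


if __name__ == "__main__":
    print(waves(8192, 256))   # mappers for 1 TB at 128 MB per block
    print(waves(40960, 256))  # mappers for 5 TB at 128 MB per block
```

In practice slots free up one at a time rather than in lockstep, so pending mappers start as soon as any slot becomes free.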

@itversity, please correct me if I am wrong.

How can I check how many mapper slots are available in my cluster?
With 50 nodes, each having 32 GB RAM, a 1 TB hard disk, and an 8-core processor, how will this file be processed?

In Hadoop, there are separate mapper and reducer slots. The number of slots usually depends on the number of cores in the cluster. Since there are 50 nodes with 8 cores each, that would be 50*8 = 400 slots in total. If the application is processing-intensive, you can increase the slots further using virtual cores. Assuming there are 400 slots in total, only a subset of them will be available for your mappers. I am not sure whether this number is configurable; people with Hadoop administration experience can provide a more insightful answer.
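
As a rough sketch of the 50*8 = 400 estimate, the snippet below caps the tasks per node by both cores and memory. The 2 GB per task figure and the helper name `estimate_slots` are assumptions for illustration only; the actual limits come from the cluster's configuration.

```python
def estimate_slots(nodes: int, cores_per_node: int,
                   ram_gb_per_node: int, task_ram_gb: int = 2) -> int:
    """Estimate concurrent task slots: limited per node by cores and by memory."""
    if task_ram_gb <= 0:
        raise ValueError("task_ram_gb must be positive")
    by_memory = ram_gb_per_node // task_ram_gb  # tasks that fit in RAM
    per_node = min(cores_per_node, by_memory)   # one task per core at most
    return nodes * per_node


if __name__ == "__main__":
    # 50 nodes, 8 cores and 32 GB RAM each, assuming 2 GB per task.
    print(estimate_slots(50, 8, 32))
    # With a hypothetical 8 GB per task, memory becomes the limit instead.
    print(estimate_slots(50, 8, 32, task_ram_gb=8))
```

With the assumed 2 GB per task, cores are the limiting factor, which matches the 400 figure above. The estimate covers all task slots, so the share left for mappers would be smaller.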

Please correct me if there is anything wrong in my answer @itversity @venkatreddy-amalla

Yes please. I need in-depth details on this!

Here is the video which explains schedulers in detail.

Mappers will be executed as slots become available; the rest will remain in the queue.

Thank you so much, Durga, for the video!