YARN Schedulers -- FIFO, Fair, and Capacity

In this section, we will look at schedulers in YARN in detail. There are three schedulers – FIFO, Fair, and Capacity.

  • Schedulers Overview
  • FIFO Scheduler
  • Introduction to Fair Scheduler
  • Configure Fair Scheduler
  • Fair Scheduler – examples
  • Introduction to Capacity Scheduler
  • Configure Capacity Scheduler
  • Capacity Scheduler – examples

Schedulers Overview

Let us go through the overview of schedulers. YARN supports three types of Schedulers – FIFO Scheduler, Fair Scheduler, and Capacity Scheduler.

  • FIFO Scheduler – the simplest scheduler, typically used for exploratory purposes.
  • Fair Scheduler – allocates resources to all subsequent jobs in a fair manner; default with the Cloudera distribution.
  • Capacity Scheduler – essentially a FIFO Scheduler within each queue; default with the Hortonworks distribution.

FIFO Scheduler

Let us get into the details of the FIFO Scheduler. We will see how to configure it and run jobs to understand how it actually schedules them.

  • FIFO means First In First Out. As the name indicates, the job submitted first will get priority to execute. FIFO is a queue-based scheduler.
  • On a cluster set up with plain vanilla Apache Hadoop 2.x or later, yarn-default.xml selects Capacity Scheduler by default, so First In First Out (FIFO) has to be enabled explicitly.
  • Allocates resources based on arrival time. If a long-running job takes up all the capacity, resources will not be allocated to other jobs until that job reaches a point where it requires less than the full capacity of the cluster.
  • For this reason, a critical small job submitted while the long-running job is running has to wait until the earlier jobs no longer require all the capacity.
  • Therefore, in production clusters, we should use either Fair Scheduler or Capacity Scheduler.

Configure FIFO Scheduler

  • Log in to Cloudera Manager and go to YARN and then click on Configuration
  • Search for “Scheduler”
  • Set the property “yarn.resourcemanager.scheduler.class” in yarn-site.xml to org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler to enable the FIFO scheduling policy in your YARN cluster.
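
Outside Cloudera Manager, the same setting can be written directly into yarn-site.xml. The following is a minimal sketch that assumes a manually managed configuration file; the property name and class are the ones from the step above.

```xml
<!-- yarn-site.xml fragment: select the FIFO scheduler for the ResourceManager. -->
<!-- Place this inside the file's <configuration> element. -->
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler</value>
</property>
```

The ResourceManager loads the scheduler class when it starts, so a change to this property only takes effect after the ResourceManager is restarted.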

Submitting jobs and Validating FIFO Scheduler

  • Submit the long-running job to the production queue using the command below. This job requires 279 containers.

Introduction to Fair Scheduler

Instead of making later jobs wait until a long-running job is close to completion, Fair Scheduler allocates resources to all subsequent jobs in a fair manner.

  • Available resources will be shared evenly between all the outstanding jobs.
  • By default, Cloudera Hadoop Distribution uses Fair Scheduler.
  • Configuration files related to the fair scheduler
    • yarn-site.xml
    • fair-scheduler.xml – allocation file
  • To customize the Fair Scheduler, set configuration properties in yarn-site.xml and update the Fair Scheduler allocation file to add new queues or update existing queues. We can change the Scheduling Policy, update allocations and assign weights to the queues.
  • E.g.: We will be defining the queues as below.
    • Defining 3 primary queues in the form of an XML file.
      • Prod Queue with 80% weight – root.prod
      • QA Queue with 10% weight – root.qa
      • ETL Queue with 10% weight – root.etl
    • We can also specify the scheduling policy within each queue: drf, fair, or fifo.
    • In Cloudera Manager the default is drf (Dominant Resource Fairness), whereas plain Apache Hadoop defaults to fair. While fair is based on memory only, drf takes multiple resource types (memory and CPU) into account.
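
The three queues described above could be sketched in the allocation file roughly as follows. The queue names and weights are the ones listed above; the scheduling policy choices are assumptions made purely for illustration.

```xml
<?xml version="1.0"?>
<!-- fair-scheduler.xml (allocation file) sketch. -->
<!-- Top-level queues are implicitly children of root, so "prod" is root.prod. -->
<allocations>
  <queue name="prod">
    <!-- Weights are relative: 80/10/10 gives prod 80% of the cluster under contention. -->
    <weight>80.0</weight>
    <!-- Assumed for illustration: a per-queue policy (drf, fair, or fifo). -->
    <schedulingPolicy>drf</schedulingPolicy>
  </queue>
  <queue name="qa">
    <weight>10.0</weight>
  </queue>
  <queue name="etl">
    <weight>10.0</weight>
  </queue>
  <!-- Policy applied to queues that do not set schedulingPolicy themselves. -->
  <defaultQueueSchedulingPolicy>drf</defaultQueueSchedulingPolicy>
</allocations>
```

Because weights are only ratios, writing 8, 1, and 1 would produce the same split as 80, 10, and 10.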

Configure Fair Scheduler

Let us see how we can configure Fair Scheduler using Cloudera Manager.

  • Log in to Cloudera Manager and go to YARN and then click on Configuration
  • Search for “Scheduler”
  • Set the property “yarn.resourcemanager.scheduler.class” to “org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler”
  • Add the XML file configuration in the property – “Fair Scheduler XML Advanced Configuration Snippet (Safety Valve)”
  • Then click on “Save Changes”
  • Refresh the YARN configuration so that the scheduler change takes effect
  • Validate the scheduler configuration from the Resource Manager UI – :8088/cluster/scheduler
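
On a cluster that is not managed through Cloudera Manager, the equivalent settings go into yarn-site.xml. The following is a sketch under that assumption; the allocation file path is an example location chosen for illustration, not something the steps above prescribe.

```xml
<!-- yarn-site.xml fragment: enable the Fair Scheduler. -->
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>

<!-- Location of the allocation file. The path below is an assumed example. -->
<property>
  <name>yarn.scheduler.fair.allocation.file</name>
  <value>/etc/hadoop/conf/fair-scheduler.xml</value>
</property>
```

When the allocation file property is omitted, the Fair Scheduler looks for a file named fair-scheduler.xml on the classpath.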

Run Sample Job

First, let us run jobs without specifying any queue and see what happens.

  • Submit the long-running job to the production queue using the command below. This job requires 279 containers.
  • Submit another job that requires only 18 containers to process the data.

Submitting jobs and Validating Fair Scheduler

Let us submit a few jobs and see how the resources are allocated using Fair Scheduler.

  • Submit the Long running job to the production queue.
  • Submit the other jobs to the test and production queues.
  • Submit another job to the qa queue
  • Now the production queue has two apps. With the Fair Scheduler, both jobs run as resources become available in the production queue, while the job in the test queue runs right away with the resources it requires.
  • Here is how the jobs will run with Fair Scheduler

More Control Parameters

Let us review some of the properties related to Fair Scheduler.

  • There are properties in yarn-site.xml which can be used to override the behavior of Fair Scheduler.
    • yarn.scheduler.fair.user-as-default-queue
    • yarn.scheduler.fair.preemption
    • yarn.scheduler.fair.preemption.cluster-utilization-threshold
  • Also there are several properties that can be defined as part of the allocation file (fair-scheduler.xml). We can review properties from this URL.
    • Queue element – Representing queues. It has the following properties
      • minResources – Setting the minimum resources of a queue
      • maxResources – Setting the maximum resources of a queue
      • maxRunningApps – Setting the maximum number of apps from a queue to run at once
      • weight – Sharing the cluster non-proportionally with other queues. Defaults to 1
      • schedulingPolicy – Values are fifo, fair, drf, or any class that extends org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.SchedulingPolicy
      • aclSubmitApps – Listing the users who can submit apps to the queue
      • minSharePreemptionTimeout – Specifying the number of seconds the queue is under its minimum share before it tries to preempt containers to take resources from other queues
    • user elements – Representing user behaviors. Each can contain a single property, maxRunningApps, to set the maximum number of running apps for a particular user.
    • userMaxAppsDefault element – Setting the default running app limit for users if the limit is not otherwise specified.
    • fairSharePreemptionTimeout element – Setting the number of seconds a queue is under its fair share before it tries to preempt containers to take resources from other queues.
    • defaultQueueSchedulingPolicy element – Specifying the default scheduling policy for queues; overridden by the schedulingPolicy element in each queue if specified.
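
To make these properties concrete, here is a hedged sketch of both files. All values, the queue name, and the user name alice are illustrative assumptions, not recommendations. First, the yarn-site.xml properties:

```xml
<!-- yarn-site.xml fragment: Fair Scheduler behavior (values are illustrative). -->
<property>
  <!-- true: jobs submitted without a queue go to a queue named after the user. -->
  <name>yarn.scheduler.fair.user-as-default-queue</name>
  <value>false</value>
</property>
<property>
  <!-- Preemption is off unless this is set to true. -->
  <name>yarn.scheduler.fair.preemption</name>
  <value>true</value>
</property>
<property>
  <!-- Preemption only kicks in above this cluster utilization (0.8 = 80%). -->
  <name>yarn.scheduler.fair.preemption.cluster-utilization-threshold</name>
  <value>0.8</value>
</property>
```

Then an allocation file that uses the elements listed above:

```xml
<?xml version="1.0"?>
<!-- fair-scheduler.xml sketch; every number and name here is an assumed example. -->
<allocations>
  <queue name="prod">
    <minResources>10240 mb,10 vcores</minResources>
    <maxResources>81920 mb,80 vcores</maxResources>
    <maxRunningApps>20</maxRunningApps>
    <weight>80.0</weight>
    <schedulingPolicy>fair</schedulingPolicy>
    <!-- Format: comma-separated users, a space, then comma-separated groups. -->
    <aclSubmitApps>alice,bob prodgroup</aclSubmitApps>
    <!-- Seconds below minimum share before the queue may preempt. -->
    <minSharePreemptionTimeout>60</minSharePreemptionTimeout>
  </queue>

  <!-- Hypothetical user with a per-user limit on running apps. -->
  <user name="alice">
    <maxRunningApps>5</maxRunningApps>
  </user>

  <userMaxAppsDefault>10</userMaxAppsDefault>
  <!-- Element name as listed above; newer Hadoop releases also accept
       defaultFairSharePreemptionTimeout at this level. -->
  <fairSharePreemptionTimeout>120</fairSharePreemptionTimeout>
  <defaultQueueSchedulingPolicy>drf</defaultQueueSchedulingPolicy>
</allocations>
```

The preemption timeouts in the allocation file only matter when yarn.scheduler.fair.preemption is enabled in yarn-site.xml.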

Introduction to Capacity Scheduler

Capacity Scheduler is nothing but FIFO Scheduler within each queue. Unlike FIFO Scheduler, Capacity Scheduler has multiple queues and users can submit jobs to a particular queue.

  • By default, Hortonworks Hadoop Distribution uses Capacity Scheduler.
  • Configuration files related to the capacity scheduler
    • yarn-site.xml
    • capacity-scheduler.xml
  • For setting up queues in Capacity Scheduler you need to make changes in the capacity-scheduler.xml configuration file.
  • The Hortonworks documentation covers Capacity Scheduler in full.
  • E.g.: We will be defining the three primary queues as below.
    • prod – queue with 50% of total capacity
    • dev – queue with 10% of total capacity
    • qa – queue with 40% of total capacity
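
In capacity-scheduler.xml, the three queues above could be expressed roughly as follows. This is a minimal sketch using only the queue names and percentages listed above; a real file normally carries further per-queue settings.

```xml
<!-- capacity-scheduler.xml sketch: three queues under root. -->
<configuration>
  <property>
    <!-- Comma-separated child queues of root, without spaces. -->
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>prod,dev,qa</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.prod.capacity</name>
    <value>50</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dev.capacity</name>
    <value>10</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.qa.capacity</name>
    <value>40</value>
  </property>
</configuration>
```

The capacities of queues that share a parent must add up to 100, which 50 + 10 + 40 does here.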

Configure Capacity Scheduler

Let us see how we can configure Capacity Scheduler using Cloudera Manager.

  • Log in to Cloudera Manager and go to YARN and then click on Configuration
  • Search for “Scheduler”
  • Then define the queues in the property “Capacity Scheduler Configuration Advanced Configuration Snippet (Safety Valve)” using the following names and values.
    • yarn.scheduler.capacity.root.queues – prod,dev,qa
    • yarn.scheduler.capacity.root.capacity – 100
    • yarn.scheduler.capacity.root.prod.capacity – 50
    • yarn.scheduler.capacity.root.dev.capacity – 10
    • yarn.scheduler.capacity.root.qa.capacity – 40
  • Then click on “Save Changes”.
  • Then set the property “yarn.resourcemanager.scheduler.class” to “org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler” to activate the Capacity Scheduler.
  • Then restart the YARN service and deploy the client configuration.
  • Validate the scheduler configuration from the Resource Manager UI – :8088/cluster/scheduler
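
For a cluster configured by hand rather than through Cloudera Manager, the scheduler class can be set in yarn-site.xml as sketched below. Note that the CapacityScheduler class lives in the scheduler.capacity package.

```xml
<!-- yarn-site.xml fragment: activate the Capacity Scheduler. -->
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>
```

The queue definitions themselves stay in capacity-scheduler.xml; this property only tells the ResourceManager which scheduler implementation to load.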

Run Sample Job

First, let us run jobs without specifying any queue and see what happens.

  • Submit the long-running job to the production queue using the command below. This job requires 279 containers.
  • Submit another job that requires only 18 containers to process the data.

Submitting jobs and Validating Capacity Scheduler

Let us submit a few jobs and see how the resources are allocated using Capacity Scheduler.

  • Submit the Long running job to the production queue.
  • Submit the other jobs to the test and production queues.
  • Submit another job to the qa queue
  • Now the production queue has two apps. With the Capacity Scheduler, both jobs run as resources become available in the production queue, while the job in the test queue runs right away with the resources it requires.
  • Here is how the jobs will run with Capacity Scheduler