Manage the Clusters

Maintain and modify the cluster to support day-to-day operations in the enterprise

  • Rebalance the cluster
  • Set up alerting for excessive disk fill
  • Define and install a rack topology script
  • Install new type of I/O compression library in cluster
  • Revise YARN resource assignment based on user feedback
  • Commission/Decommission a node

Rebalance the cluster

Let us see how we can rebalance the cluster.

  • Rebalancing is typically related to HDFS.
  • HDFS includes a component called the balancer.
  • The balancer can run on its own; however, we can also kick it off ourselves, as shown in the sketch after this list.
    • We can use the Cloudera Manager web interface to rebalance the cluster. Go to Actions -> Rebalance the Cluster.
    • We can also use the hdfs balancer command to balance data across all Datanodes in the cluster.
    • To balance data across disks within a given node, we can use the hdfs diskbalancer command. However, this is not as commonly used.
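
Here is a minimal sketch of kicking off the balancer from the command line; the threshold value and the node name (bigdataserver-8) are assumptions for illustration.

    # Run the HDFS balancer as the hdfs superuser; -threshold is the maximum
    # per-Datanode utilization deviation (in percent) the balancer aims for
    sudo -u hdfs hdfs balancer -threshold 10

    # Balance data across disks within a single Datanode (less commonly used):
    # generate a plan for the node, then execute the plan file it prints
    sudo -u hdfs hdfs diskbalancer -plan bigdataserver-8
    sudo -u hdfs hdfs diskbalancer -execute /system/diskbalancer/<timestamp>/bigdataserver-8.plan.json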

Set up alerting for excessive disk fill

Let us see how we can configure alerts by taking excessive disk fill as an example.

As part of the setup process, we have set up the Cloudera Management Service. One of the components set up as part of the Cloudera Management Service is the Alert Publisher.

The Alert Publisher makes it possible for the support team to receive alerts in a timely manner.

  • Make sure the Alert Publisher is set up. If not, go to Instances under Cloudera Management Service, click on Add Role Instances, and configure the Alert Publisher.
  • Once the Alert Publisher is added, we can go to Configuration and set up alerts.
  • Enable Alerts: Enable Email Alerts
  • Configure SMTP details. We can leave the defaults on the current cluster created for learning purposes. However, in actual production clusters, we need to get SMTP details such as the IP address or DNS alias, the username, and the password, and then configure SMTP to send alerts. A quick reachability check is sketched after this list.
  • Configure recipient details under Alerts: Mail Message Recipients
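
Before relying on email alerts, it can help to confirm that the SMTP server is reachable from the Alert Publisher host. A minimal sketch, assuming netcat is installed; the SMTP hostname and port below are placeholders for your own:

    # Check TCP connectivity from the Alert Publisher host to the SMTP server
    nc -vz smtp.example.com 25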

Once the Alert Publisher is configured, we can enable alerts from any of the services. Let us see the details with respect to HDFS.

  • Go to HDFS -> Configuration
  • Search for alert
  • Enable (check) Enable Service Level Health Alerts.
  • We can now go to Administration -> Alerts
  • We can click on HDFS and review all the alerts configured.

Define and install a rack topology script

This topic is extensively covered as part of Install CM and CDH – Configure HDFS and Understand Concepts – Configure Rack Awareness.
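
For quick reference, here is a minimal sketch of a rack topology script; the IP addresses and rack names are assumptions for illustration. The script receives one or more node addresses as arguments and must print one rack path per argument:

    #!/bin/bash
    # Minimal rack topology script sketch: map each node address passed in
    # by the Namenode to a rack path; unknown nodes fall back to a default rack
    for node in "$@"; do
      case "$node" in
        192.168.1.1[0-4]) echo "/rack1" ;;
        192.168.1.1[5-9]) echo "/rack2" ;;
        *)                echo "/default-rack" ;;
      esac
    done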

Add I/O Compression Library

Install new type of I/O compression library in the cluster

Compression brings the following advantages

  • Reduces the space required in the cluster to store large files
  • Increases data transfer speed across the network while processing the data

Hadoop supports the following compression codecs. Most of them are installed by default, so no separate installation is needed; LZO is the exception, as it has to be installed separately (covered below) due to licensing.

  • gzip – org.apache.hadoop.io.compress.GzipCodec
  • bzip2 – org.apache.hadoop.io.compress.BZip2Codec
  • LZO – com.hadoop.compression.lzo.LzopCodec
  • Snappy – org.apache.hadoop.io.compress.SnappyCodec
  • Deflate – org.apache.hadoop.io.compress.DeflateCodec

To add a compression type

  • Go to HDFS in Cloudera Manager
  • Select configuration
  • Search for ‘compression’ and add the codec you want to use to the list of compression codecs. A quick way to verify the configured codecs is sketched after this list.
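
After saving the change and redeploying the client configuration, we can confirm which codecs are active. A minimal sketch, assuming the client configuration is deployed on the node where you run it:

    # Print the configured compression codecs from the deployed configuration
    hdfs getconf -confKey io.compression.codecs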

To configure LZO compression, we have to install GPL Extras and then configure HDFS. This can be done using packages or parcels.

Using Packages

  • Don’t run this if your cluster is managed by Parcels.
  • Link for GPLEXTRAS5 repositories –
    http://archive.cloudera.com/gplextras5/redhat/7/x86_64/gplextras/5.12.0/RPMS/x86_64/
  • Get the repo file:
    cd /etc/yum.repos.d/
    wget http://archive.cloudera.com/gplextras5/redhat/7/x86_64/gplextras/cloudera-gplextras5.repo
  • Install the lzo, lzop, and hadoop-lzo packages on all nodes of the cluster:
    yum install lzo lzop hadoop-lzo
  • Configure HDFS with the codec – add com.hadoop.compression.lzo.LzoCodec to the list of compression codecs.
  • Save your configuration changes.
  • Restart HDFS.
  • Redeploy the HDFS client configuration.
  • We can validate by running a Sqoop import and ensuring that the data is compressed using com.hadoop.compression.lzo.LzoCodec

Using Parcels

Here are the instructions to enable LZO compression using Parcels.

  • Configure Parcel – https://archive.cloudera.com/gplextras5/parcels/COMPATIBLE_VERSION (5.15.1 in our case)
  • Download, Distribute and Activate
  • Configure HDFS with the codec – add com.hadoop.compression.lzo.LzoCodec to the list of compression codecs.
  • Save your configuration changes.
  • Restart HDFS.
  • Redeploy the HDFS client configuration.
  • We can validate by running a Sqoop import and ensuring that the data is compressed using com.hadoop.compression.lzo.LzoCodec

https://gist.githubusercontent.com/dgadiraju/51c087335ba2ed80c415f2f8616eb19e/raw/4a7849debc8059b38e4541c87c4a94af1f11eec4/cdh-admin-validate-lzo.sh
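
For reference, here is a minimal sketch of validating LZO with a Sqoop import; the connection string, credentials, table, and target directory are assumptions for illustration:

    # Import a table compressed with LzoCodec, then inspect the output files
    sqoop import \
      --connect jdbc:mysql://bigdataserver-1/retail_db \
      --username retail_user -P \
      --table orders \
      --target-dir /user/cloudera/orders_lzo \
      --compress \
      --compression-codec com.hadoop.compression.lzo.LzoCodec

    # Files written with LzoCodec carry the .lzo_deflate extension
    hdfs dfs -ls /user/cloudera/orders_lzo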

YARN Resource Assignment

Revise YARN resource assignment based on user feedback.

The processing capacity of the cluster is determined by the capacity given to YARN. In our case, both Map Reduce and Spark run using YARN. The key capacities are:

  • Node Manager Capacity
  • Container Capacity
  • Map Reduce – Map Task and Reduce Task Capacity
  • Spark – Executor Capacity

As the learning objective is primarily about YARN Resource Management, let us review the important properties with respect to YARN and see how the changes will reflect in Resource Manager Web UI.

https://gist.githubusercontent.com/dgadiraju/e8b8687e00a8b43087348068e536c18d/raw/527ab1fb883ca3f1186c69ef28f1214a300cf168/cdh-admin-yarn-important-properties.csv
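
As a quick sketch, we can review the effective values after deploying the changes. The configuration path assumes a Cloudera Manager-managed node with client configuration deployed; the property names are standard YARN properties:

    # Show Node Manager memory and vcore capacity from the deployed config
    grep -A 1 'yarn.nodemanager.resource.memory-mb' /etc/hadoop/conf/yarn-site.xml
    grep -A 1 'yarn.nodemanager.resource.cpu-vcores' /etc/hadoop/conf/yarn-site.xml

    # Cross-check capacity as seen by the Resource Manager: list the nodes,
    # then check the status of one of them using its node id from the list
    yarn node -list
    # yarn node -status <node-id-from-the-list>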

Let us review this article about how properties should be configured for Resource Management.

Commission/Decommission a node

Let us see the details with respect to commissioning or decommissioning a node. Decommissioning is nothing but gracefully stopping the roles running on a host so that it can be taken out of service.

As discussed earlier, we have 3 types of Nodes. Behavior is a bit different for each type of Node.

  • Gateway
  • Master
    • Maintenance – Decommission -> Complete Maintenance -> Recommission.
    • Removing the hosts – Decommission -> Remove from the cluster. We need to migrate the master components of services to other masters in the cluster and then delete those components before decommissioning the node. It is better to work on one master at a time, and this is not very common.
    • Typically, when we decommission masters, we end up decommissioning the entire cluster.
  • Worker (we will see a demo on bigdataserver-8).
    • Maintenance – multiple servers at a time. Decommission -> Complete Maintenance -> Recommission. When it comes to HDFS, we have the option either to Decommission or to Take Offline. For short-term maintenance, we should use the Take Offline option.
    • Removing the hosts – Decommission -> Remove from the cluster. Don’t use the Take Offline option while removing from the cluster.
    • When we decommission worker nodes, they will be temporarily excluded by the masters.
      • The Namenode will stop placing data on those nodes.
      • The Resource Manager will not use those nodes for processing data.
      • The same is the case with Impala, HBase, etc.

Decommissioning a Node from Cluster

Note: For worker nodes, we also have the options Decommission Datanode and Take Offline on top of Decommission Hosts.

We cannot decommission a Datanode or a node with a Datanode if the number of Datanodes equals the replication factor (which by default is three).
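
A quick sketch of checking both values before attempting a decommission (run from a node with client configuration deployed):

    # Configured replication factor (default is 3)
    hdfs getconf -confKey dfs.replication

    # Number of live Datanodes, as the hdfs superuser
    sudo -u hdfs hdfs dfsadmin -report | grep 'Live datanodes'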

STEPS FOR DECOMMISSION
  • Go to Hosts
  • Select the host or hosts that you want to decommission
  • Click on Actions for Selected -> Hosts Decommission
  • A pop-up window displays the services that are running on that node
  • Click Confirm to decommission that particular node. Progress can be tracked as sketched below.
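
A minimal sketch for tracking the decommission from the command line:

    # Decommissioning Datanodes show up as "Decommission in progress" and,
    # once all their blocks are re-replicated, as "Decommissioned"
    sudo -u hdfs hdfs dfsadmin -report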

Recommission

Applicable only for hosts decommissioned using Cloudera Manager.

STEPS FOR RECOMMISSION
  • Click on the Hosts tab.
  • Select one or more decommissioned hosts to recommission
  • Select Actions for Selected -> Hosts Recommission
  • A Recommission Command pop-up window is displayed for confirmation
  • Once you confirm, the pop-up window shows each step of the recommission command as it runs
  • Once recommission is successful, select Actions for Selected -> Start Roles on Hosts to start all roles related to that particular node.

Note: If the host and its roles are already marked as commissioned, they need not be started.
