Setting up Ganglia with Microsoft Azure HDInsight


What is Ganglia?

Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids. It is based on a hierarchical design targeted at federations of clusters. It leverages widely used technologies such as XML for data representation, XDR for compact, portable data transport, and RRDtool for data storage and visualization. It uses carefully engineered data structures and algorithms to achieve very low per-node overheads and high concurrency. The implementation is robust, has been ported to an extensive set of operating systems and processor architectures, and is currently in use on thousands of clusters around the world. It has been used to link clusters across university campuses and around the world and can scale to handle clusters with 2000 nodes. (Source: http://ganglia.info/)

What are we going to do in these instructions?

To collect WASB (Windows Azure Storage Blob) metrics from HDInsight on Linux with Ganglia, we are going to create the following setup: an HDInsight cluster in a custom virtual network, with a Linux VM provisioned in the same subnet to run the Ganglia server.

You can find the scripts and the ARM template that deploy this configuration here: https://github.com/hdinsight/Edge-Node-Scripts/tree/master/Ganglia. We will go through what the ARM template does, which should allow you to replicate the steps manually if you so choose.

Provision Virtual Network

The first thing to do is to provision a Resource Manager virtual network in the Azure portal. Instructions to do so manually can be found here: https://azure.microsoft.com/en-us/documentation/articles/virtual-networks-create-vnet-arm-pportal/

Provision Ubuntu Virtual Machine

After provisioning the virtual network, it is time to provision a VM in that virtual network. We will be using Ubuntu 14.04 Server as the Linux distribution for our VM. We will use SSH tunneling to view the dashboard. Alternatively, you can add a rule that allows inbound connections on port 80 to reach the Ganglia dashboard directly, although that is not recommended.

Install and Configure Ganglia on Virtual Machine

The next step is to install Ganglia on the Linux VM we just provisioned. The script gangliaEdgeNodeScript.sh automates this installation.

To do it by hand, we SSH into the VM to install and configure the server. Once we are logged into the machine, we run the following commands.

This command installs the Ganglia server:

sudo DEBIAN_FRONTEND=noninteractive apt-get install -y --force-yes ganglia-monitor rrdtool gmetad ganglia-webfrontend

This command sets up the graphical dashboard:

sudo cp /etc/ganglia-webfrontend/apache.conf /etc/apache2/sites-enabled/ganglia.conf

Next we need to edit the Ganglia configuration file to set up the channel on which the Ganglia server receives metrics. We will use vi to edit the file:

sudo vi /etc/ganglia/gmond.conf
Once the editor comes up, locate the udp_send_channel section. There we disable multicast (mcast) and send the packets to the Ganglia server address directly. We need to do this to make sure all VMs on the virtual network can reach the host, since Azure virtual networks do not support multicast. In the listing below, lines prefixed with - are removed and lines prefixed with + are added.

udp_send_channel {
-  mcast_join = 239.2.11.71
+  host = localhost
  port = 8649
  ttl = 1
}

udp_recv_channel {
-  mcast_join = 239.2.11.71
  port = 8649
-  bind = 239.2.11.71
}

The same change can also be made with the following commands:

sudo sed -i '0,/mcast_join = 239.2.11.71/{s/mcast_join = 239.2.11.71/host = localhost/}'  /etc/ganglia/gmond.conf
sudo sed -i '0,/mcast_join = 239.2.11.71/{s/mcast_join = 239.2.11.71//}'  /etc/ganglia/gmond.conf
sudo sed -i '0,/bind = 239.2.11.71/{s/bind = 239.2.11.71//}'  /etc/ganglia/gmond.conf
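Before touching the real file, it can help to see what these three commands do to a scratch copy. The following is a minimal sketch, assuming GNU sed (as shipped with Ubuntu) and a trimmed-down stand-in for the two stanzas; the temporary file and the variable names conf and result are only for illustration.

```shell
#!/usr/bin/env bash
# Scratch copy with a stand-in for the two stanzas found in gmond.conf.
conf="$(mktemp)"
cat > "$conf" <<'EOF'
udp_send_channel {
  mcast_join = 239.2.11.71
  port = 8649
  ttl = 1
}

udp_recv_channel {
  mcast_join = 239.2.11.71
  port = 8649
  bind = 239.2.11.71
}
EOF

# Same three edits as above, without sudo and against the scratch copy.
# "0,/re/" limits each substitution to the first remaining match.
sed -i '0,/mcast_join = 239.2.11.71/{s/mcast_join = 239.2.11.71/host = localhost/}' "$conf"
sed -i '0,/mcast_join = 239.2.11.71/{s/mcast_join = 239.2.11.71//}' "$conf"
sed -i '0,/bind = 239.2.11.71/{s/bind = 239.2.11.71//}' "$conf"

# Keep the result for inspection, then clean up the scratch file.
result="$(cat "$conf")"
rm -f "$conf"
printf '%s\n' "$result"
```

The first command rewrites the mcast_join line in udp_send_channel, so the second command then finds the one in udp_recv_channel. The removed settings leave whitespace-only lines behind, which only affects how the file looks.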

After making the configuration file changes we need to restart the services that we just configured, in order for the changes to take effect.

sudo service ganglia-monitor restart && sudo service gmetad restart && sudo service apache2 restart

Now you should be able to access the Ganglia dashboard through an SSH tunnel at <host_ip>/ganglia or, if you used the script, at http://<clustername>-<edgenode vm name>.<region>.cloudapp.azure.com/ganglia. You can also open up port 80 on the VM, although that is not recommended. In my case the public IP of the VM that ran Ganglia was 40.76.205.71, so I would access the Ganglia dashboard at http://40.76.205.71/ganglia.
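For the SSH tunnel, a local port forward is one common approach. The sketch below only builds and prints the command rather than running it; the helper name ganglia_tunnel_cmd, the user name azureuser, and the local port 8080 are assumptions for illustration, not values from the scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print an SSH local-forward command for the dashboard.
# Arguments: <ssh user> <VM public IP or DNS name> [local port, default 8080]
ganglia_tunnel_cmd() {
  local user="$1" host="$2" local_port="${3:-8080}"
  # -N: run no remote command; -L: forward local_port to port 80 on the VM.
  printf 'ssh -N -L %s:localhost:80 %s@%s\n' "$local_port" "$user" "$host"
}

# Example with the public IP used in this post and an assumed user name.
ganglia_tunnel_cmd azureuser 40.76.205.71
```

Running the printed command and leaving it open should make the dashboard reachable at http://localhost:8080/ganglia on your own machine, without opening port 80 on the VM.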

Provision HDInsight cluster with custom action script

Upload the following script (or the enableGanglia.sh script) to storage, then create a cluster with a custom script action by following the instructions at https://azure.microsoft.com/en-us/documentation/articles/hdinsight-hadoop-customize-cluster/, passing "<edgenode vm name>" as the parameter to the script action.

sudo sed -i "/# WASB metric/a azure-file-system.sink.ganglia.servers=$1:8649" /etc/hadoop/conf/hadoop-metrics2-azure-file-system.properties
sudo sed -i '/# WASB metric/a *.sink.ganglia.period=60' /etc/hadoop/conf/hadoop-metrics2-azure-file-system.properties
sudo sed -i '/# WASB metric/a *.sink.ganglia.record.filter.include=azureFileSystem' /etc/hadoop/conf/hadoop-metrics2-azure-file-system.properties
sudo sed -i '/# WASB metric/a *.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31' /etc/hadoop/conf/hadoop-metrics2-azure-file-system.properties
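To see what the properties file looks like after these four edits, the same commands can be run against a scratch file. This is a minimal sketch assuming GNU sed; the helper add_ganglia_sink, the host name my-ganglia-vm, and the one-line stand-in file are hypothetical, and the real file on the cluster contains more than the single comment line used here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: apply the four edits to a given properties file.
# Arguments: <properties file> <Ganglia server host name>
add_ganglia_sink() {
  local props="$1" server="$2"
  # Each command appends its line directly after the "# WASB metric" line.
  sed -i "/# WASB metric/a azure-file-system.sink.ganglia.servers=${server}:8649" "$props"
  sed -i '/# WASB metric/a *.sink.ganglia.period=60' "$props"
  sed -i '/# WASB metric/a *.sink.ganglia.record.filter.include=azureFileSystem' "$props"
  sed -i '/# WASB metric/a *.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31' "$props"
}

# Stand-in for the real file: only the marker comment the edits look for.
props="$(mktemp)"
printf '# WASB metric\n' > "$props"

add_ganglia_sink "$props" my-ganglia-vm

# Keep the result for inspection, then clean up the scratch file.
result="$(cat "$props")"
rm -f "$props"
printf '%s\n' "$result"
```

Because every line is appended immediately after the marker comment, the settings end up in the reverse order of the commands: the class line comes first and the servers line last. If no line contains "# WASB metric", the commands change nothing, so it is worth checking the file after the script action has run.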

Go into Ganglia dashboard to see metrics

WASB metrics will now flow into your Ganglia server. Open the Ganglia dashboard to see them.
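If the dashboard shows nothing, the metrics can also be checked from the command line on the Ganglia VM. The sketch below assumes that gmond still has its default tcp_accept_channel on port 8649, which returns an XML dump of everything it has received, and that nc is installed. The helper list_metric_names and the sample metric names are hypothetical, and the actual WASB metric names may differ.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read gmond XML on stdin, print the unique metric names.
list_metric_names() {
  grep -o '<METRIC NAME="[^"]*"' | cut -d'"' -f2 | sort -u
}

# Made-up sample in the shape of a gmond XML dump, used to show the helper.
sample='<HOST NAME="worker0"><METRIC NAME="load_one" VAL="0.1"/><METRIC NAME="example.azure.metric" VAL="3"/></HOST>'
names="$(printf '%s\n' "$sample" | list_metric_names)"
printf '%s\n' "$names"

# On the Ganglia VM, the live equivalent would be something like:
#   nc localhost 8649 | list_metric_names | grep -i azure
```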

 
