Install ganglia monitoring system on CentOS, RHEL

Overview

Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids. It is based on a hierarchical design targeted at federations of clusters. It leverages widely used technologies such as XML for data representation, XDR for compact, portable data transport, and RRDtool for data storage and visualization. It uses carefully engineered data structures and algorithms to achieve very low per-node overheads and high concurrency. The implementation is robust, has been ported to an extensive set of operating systems and processor architectures, and is currently in use on thousands of clusters around the world. It has been used to link clusters across university campuses and around the world and can scale to handle clusters with 2000 nodes.

Ganglia consists of three components:

Ganglia Monitoring Daemon (gmond)
Gmond is a multi-threaded daemon which runs on each cluster node you want to monitor.

Gmond has four main responsibilities:

Monitor changes in host state.
Announce relevant changes.
Listen to the state of all other ganglia nodes via a unicast or multicast channel.
Answer requests for an XML description of the cluster state.

Each gmond transmits information in two different ways:

Unicasting or Multicasting host state in external data representation (XDR) format using UDP messages.
Sending XML over a TCP connection.
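The XML that gmond serves over TCP can be inspected straight from the shell. The sketch below assumes gmond's default TCP port of 8649 and that nc is installed; list_hosts is a hypothetical helper name, not something shipped with Ganglia. It only pulls the NAME attribute out of each HOST element in the dump.

```shell
# list_hosts: read gmond's XML dump on stdin and print one host name per line.
# Hypothetical helper; relies on the <HOST NAME="..."> elements in the dump.
list_hosts() {
    grep -o '<HOST NAME="[^"]*"' | sed -e 's/^<HOST NAME="//' -e 's/"$//'
}

# Against a live gmond (assumes the default TCP port 8649):
#   nc localhost 8649 | list_hosts
```

If the list is missing a node you expect, that node's gmond is probably not reaching the channel the others listen on.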

Ganglia Meta Daemon (gmetad)

Federation in Ganglia is achieved using a tree of point-to-point connections amongst representative cluster nodes to aggregate the state of multiple clusters. At each node in the tree, a Ganglia Meta Daemon (gmetad) periodically polls a collection of child data sources, parses the collected XML, saves all numeric, volatile metrics to round-robin databases and exports the aggregated XML over a TCP socket to clients. Data sources may be either gmond daemons, representing specific clusters, or other gmetad daemons, representing sets of clusters. Data sources use source IP addresses for access control and can be specified using multiple IP addresses for failover. The latter capability is natural for aggregating data from clusters since each gmond daemon contains the entire state of its cluster.
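In practice each data source becomes one data_source line in gmetad.conf: a cluster name, an optional polling interval in seconds, and one or more host:port addresses, where the extra addresses serve as failover. The sketch below builds such a line; data_source_line is a hypothetical helper, the host names are placeholders, and port 8649 is assumed to be the gmond TCP port on every host.

```shell
# data_source_line NAME INTERVAL HOST...: print a gmetad.conf data_source entry.
# Hypothetical helper; each HOST gets the assumed gmond TCP port 8649 appended.
data_source_line() {
    name=$1
    interval=$2
    shift 2
    printf 'data_source "%s" %s' "$name" "$interval"
    for host in "$@"; do
        printf ' %s:8649' "$host"
    done
    printf '\n'
}

# Example: poll the MySQL cluster every 15 seconds, with db2 as a failover source.
data_source_line MySQL 15 db1.example.com db2.example.com
# prints: data_source "MySQL" 15 db1.example.com:8649 db2.example.com:8649
```

With the paths used in this post, the resulting line would go into /etc/ganglia/gmetad.conf, followed by a restart of gmetad.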

Ganglia PHP Web Front-end

The Ganglia web front-end provides a view of the gathered information via real-time dynamic web pages. Most importantly, it displays Ganglia data in a meaningful way for system administrators and computer users.

Source http://en.wikipedia.org/

For this post I will set up one server as the Ganglia Meta Daemon (gmetad) host and install the Ganglia Monitoring Daemon (gmond) on the remaining machines; as the steps below show, the gmetad server runs gmond as well. The gmetad server will also host the Ganglia PHP Web Front-end. In a production environment you may consider running multiple Ganglia Meta Daemons for high availability.

Enable the EPEL repository first.

Install some prerequisites.


# yum install gcc gcc-c++ autoconf automake expat-devel libconfuse-devel rrdtool rrdtool-devel apr-devel libconfuse

Download and compile pcre


# wget ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre-8.32.tar.gz
# tar zxvf pcre-8.32.tar.gz
# cd pcre-8.32
# ./configure
# make
# make install

Download and compile ganglia. On the machine that will run gmetad, ganglia must be configured with the --with-gmetad option.


# wget -O ganglia-3.5.0.tar.gz 'http://sourceforge.net/projects/ganglia/files/latest/download?source=files'
# tar zxvf ganglia-3.5.0.tar.gz
# cd ganglia-3.5.0
# ./configure --sysconfdir=/etc/ganglia/ --sbindir=/usr/sbin/ --with-gmetad --enable-static-build
# make
# make install

Generate gmond config file


# gmond --default_config > /etc/ganglia/gmond.conf

At this point you may get this error

gmond: error while loading shared libraries: libpcre.so.1: cannot open shared object file: No such file or directory

Create a symlink and execute the command again


# ln -s /lib64/libpcre.so.0.0.1 /lib64/libpcre.so.1
# gmond --default_config > /etc/ganglia/gmond.conf
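To see exactly which shared libraries the dynamic linker cannot resolve for gmond, you can filter the output of ldd. This is a minimal sketch; missing_libs is a hypothetical helper name, and it assumes ldd marks unresolved libraries with "=> not found", as glibc's ldd does.

```shell
# missing_libs: read ldd output on stdin and print the libraries it could not resolve.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# On the Ganglia host:
#   ldd "$(command -v gmond)" | missing_libs
```

An empty result means every library was found and gmond should start without the error above.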

Open the configuration file in an editor


# vi /etc/ganglia/gmond.conf

Let's make this machine part of the MySQL cluster, since this server runs a MySQL database.


cluster {
name = "MySQL"
owner = "Database TEAM"
latlong = "unspecified"
url = "unspecified"
}

Copy the init script and start the gmond service


# cd ganglia-3.5.0/gmond
# cp gmond.init /etc/init.d/gmond
# /etc/init.d/gmond start
# chkconfig --add gmond
# chkconfig gmond on

Now create the RRD directory, change its ownership, copy the gmetad init script and open it in an editor of your choice.


# mkdir -p /var/lib/ganglia/rrds/
# chown nobody:nobody /var/lib/ganglia/rrds/
# cd ganglia-3.5.0/gmetad
# cp gmetad.init /etc/init.d/gmetad
# vi /etc/init.d/gmetad

Comment out the line starting with daemon and add the following line right after it


# daemon $GMETAD
($GMETAD -c /etc/ganglia/gmetad.conf -d 1 > /dev/null 2>&1 ) &

Start the gmetad process


# /etc/init.d/gmetad start
# chkconfig --add gmetad
# chkconfig gmetad on

To install the web interface, install the prerequisites


# yum install httpd php

Download the web interface, untar it, move it to the document root of the web server and finally run make install


# wget -O ganglia-web-3.5.7.tar.gz http://sourceforge.net/projects/ganglia/files/ganglia-web/3.5.7/ganglia-web-3.5.7.tar.gz/download
# tar zxvf ganglia-web-3.5.7.tar.gz
# mv ganglia-web-3.5.7 /var/www/html/ganglia
# cd /var/www/html/ganglia
# make install

On the client machines, which will run only the gmond service, follow these steps

Download and compile ganglia, this time without the --with-gmetad option.


# wget -O ganglia-3.5.0.tar.gz 'http://sourceforge.net/projects/ganglia/files/latest/download?source=files'
# tar zxvf ganglia-3.5.0.tar.gz
# cd ganglia-3.5.0
# ./configure --sysconfdir=/etc/ganglia/ --sbindir=/usr/sbin/ --enable-static-build
# make
# make install

Generate gmond config file


# gmond --default_config > /etc/ganglia/gmond.conf

If this error is thrown, you know how to resolve it :D

gmond: error while loading shared libraries: libpcre.so.1: cannot open shared object file: No such file or directory

Create a symlink and execute the command again


# ln -s /lib64/libpcre.so.0.0.1 /lib64/libpcre.so.1
# gmond --default_config > /etc/ganglia/gmond.conf

Copy the init script


# cd ganglia-3.5.0/gmond
# cp gmond.init /etc/init.d/gmond

Open the conf file


# vi /etc/ganglia/gmond.conf

This machine is also part of the MySQL cluster. You can create different clusters per application or to suit your environment. The machine automatically joins the multicast channel defined under udp_send_channel.


cluster {
name = "MySQL"
owner = "Database TEAM"
latlong = "unspecified"
url = "unspecified"
}
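To double-check what a node is actually configured with, you can print a single block from gmond.conf rather than scrolling through the whole file. The sketch below uses a hypothetical helper, conf_block, and assumes the layout produced by gmond --default_config: the block name starts at column 0 and the closing brace sits at the start of its own line.

```shell
# conf_block NAME FILE: print the top-level "NAME { ... }" block from FILE.
# Hypothetical helper; assumes NAME starts at column 0 and the block ends
# with a "}" at the start of a line.
conf_block() {
    sed -n "/^$1[[:space:]]*{/,/^}/p" "$2"
}

# On a configured node:
#   conf_block cluster /etc/ganglia/gmond.conf
#   conf_block udp_send_channel /etc/ganglia/gmond.conf
```

Comparing the udp_send_channel block across nodes is a quick way to confirm they all join the same channel.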

Start the gmond service


# /etc/init.d/gmond start
# chkconfig --add gmond
# chkconfig gmond on

Point your browser to http://ipaddress-or-domainname/ganglia
