7.15.3. Monitoring Status Using Nagios

In addition to the scripts bundled with the software, there is a Ruby gem available with expanded checks and a mechanism to add custom checks. See https://github.com/continuent/continuent-monitors-nagios for more details.

Integration with Nagios is supported through a number of scripts that output information in a format compatible with the Nagios NRPE plugin. Through the plugin, check commands such as check_tungsten_latency can be executed and their output parsed for status information.
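To illustrate what "a format compatible with the Nagios NRPE plugin" means, the following is a minimal sketch of a Nagios-style check. It is not the bundled check_tungsten_latency script: the function name check_latency, its arguments, and the use of whole seconds are assumptions made for illustration. It relies only on the standard Nagios plugin convention of one line of status text plus an exit code of 0 (OK), 1 (WARNING), 2 (CRITICAL), or 3 (UNKNOWN).

```shell
#!/bin/sh
# Hypothetical sketch of a Nagios-style check; NOT the bundled
# check_tungsten_latency script.
# Nagios plugin convention: one line of status text on stdout and an
# exit code of 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.

# check_latency LATENCY WARN CRIT
# All three values are whole seconds (integers) in this sketch.
check_latency() {
    latency=${1:-}
    warn=${2:-}
    crit=${3:-}

    # Missing or non-numeric latency is reported as UNKNOWN.
    case "$latency" in
        ''|*[!0-9]*)
            echo "UNKNOWN: could not determine latency"
            return 3
            ;;
    esac

    if [ "$latency" -ge "$crit" ]; then
        echo "CRITICAL: latency is ${latency}s"
        return 2
    elif [ "$latency" -ge "$warn" ]; then
        echo "WARNING: latency is ${latency}s"
        return 1
    fi

    echo "OK: latency is ${latency}s"
    return 0
}
```

For example, `check_latency 12 10 30` prints `WARNING: latency is 12s` and returns 1, which NRPE passes back to Nagios as a warning state.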

The available commands are:

To configure the scripts to be executed through NRPE:

  1. Install the Nagios NRPE server.

  2. Start the NRPE daemon:

    shell> sudo /etc/init.d/nagios-nrpe-server start

  3. Add the IP of your Nagios server to the /etc/nagios/nrpe.cfg configuration file. For example:

  4. Add the Tungsten check commands that you want to execute to the /etc/nagios/nrpe.cfg configuration file. For example:

  5. Restart the NRPE service:

    shell> sudo /etc/init.d/nagios-nrpe-server restart

  6. If the commands need to be executed with elevated privileges, the /etc/sudo or /etc/sudoers file must be updated so that the nagios user can run them through sudo. The file is usually edited with the visudo command. The following entry lets the nagios user run the check scripts as the tungsten user without a password:

    nagios          ALL=(tungsten)  NOPASSWD: /opt/continuent/tungsten/cluster-home/bin/check*

    In addition, the sudo command should be added to the Tungsten check commands within the Nagios nrpe.cfg, for example:

    command[check_tungsten_online]=/usr/bin/sudo -u tungsten /opt/continuent/tungsten/cluster-home/bin/check_tungsten_online

    Restart the NRPE service for these changes to take effect.

  7. Add an entry to your Nagios services.cfg file for each service you want to monitor:

    define service {
            host_name               database
            service_description     check_tungsten_online
            check_command           check_nrpe! -H $HOSTADDRESS$  -t 30 -c check_tungsten_online
            retry_check_interval    1
            check_period            24x7
            max_check_attempts      3
            flap_detection_enabled  1
            notifications_enabled   1
            notification_period     24x7
            notification_interval   60
            notification_options    c,f,r,u,w
            normal_check_interval   5
    }

The same process can be repeated for all the hosts within your environment where there is a Tungsten service installed.
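When the same setup is repeated across hosts, the nrpe.cfg entries from steps 3, 4 and 6 can be generated rather than typed by hand. The following is a minimal sketch under stated assumptions: the helper names nrpe_allowed_hosts and nrpe_command are hypothetical, the address 192.0.2.10 is a placeholder for your Nagios server, and the script directory is the one used in the sudoers example above. allowed_hosts is the standard NRPE directive for restricting which hosts may connect. Any threshold options a particular check script expects are not shown and would need to be appended.

```shell
#!/bin/sh
# Sketch: print nrpe.cfg fragments for the configuration steps above.
# Assumption: check scripts live in the directory used in the sudoers example.
TUNGSTEN_BIN=/opt/continuent/tungsten/cluster-home/bin

# nrpe_allowed_hosts IP...
# Prints an allowed_hosts line; 127.0.0.1 is kept so local tests still work.
nrpe_allowed_hosts() {
    hosts=127.0.0.1
    for ip in "$@"; do
        hosts="$hosts,$ip"
    done
    echo "allowed_hosts=$hosts"
}

# nrpe_command NAME [SUDO_USER]
# Prints a command[] definition, wrapped in sudo when SUDO_USER is given.
nrpe_command() {
    name=$1
    sudo_user=${2:-}
    if [ -n "$sudo_user" ]; then
        echo "command[$name]=/usr/bin/sudo -u $sudo_user $TUNGSTEN_BIN/$name"
    else
        echo "command[$name]=$TUNGSTEN_BIN/$name"
    fi
}

# Placeholder address (documentation range); replace with your Nagios server.
nrpe_allowed_hosts 192.0.2.10
nrpe_command check_tungsten_online tungsten
nrpe_command check_tungsten_latency
```

The output can be reviewed and then merged into /etc/nagios/nrpe.cfg on each host, followed by a restart of the NRPE service.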