
Knowing your PERC 6/i BBU

February 5th, 2010 | 1 Comment | Posted in Nagios, Performance, Uncategorized

I’ve recently become supremely disappointed in the availability of Nagios checks for RAID cards. Too often, I see administrators rely on chance (or their hosting provider) to discover failed drives, a dying BBU, or degrading capacity on their RAID cards. So I began work on check_raid (part of check_mysql_all) to provide a suite of checks. One of the first cards I wanted to support was the PERC 6/i, so I scoured the documentation and forums and picked the brains of my friends before finally getting on a marathon four-hour call with Dell support. I’ll now share the interesting things that I’ve learned.



Nagios Checks For MMM

July 15th, 2009 | No Comments | Posted in MySQL, MySQL Administration

I’ve written some new Nagios checks for MMM (MMM on Google Code, MMM on Launchpad). check_mmm is part of http://code.google.com/p/check-mysql-all/ and is meant to be called locally on the MMM Monitor server (usually via NRPE). Feedback is welcome. Usage is as follows:

Usage:
     check_mmm --cluster C# 

     Options:
       --cluster=    The MMM Cluster to check
       -c, --critical=           The level at which a critical alarm is raised.
       -h, --help                Display this message and exit
       -v, --verbose             Increase verbosity level
       -V, --version             Display version information and exit
       -w, --warning             The level at which a warning is raised.

     Defaults are:

     ATTRIBUTE                  VALUE
     -------------------------- ------------------
     cluster                    No default value
     critical                   HARD_OFFLINE,REPLICATION_FAIL
     help                       FALSE
     verbose                    1 (out of 3)
     version                    FALSE
     warning                    ADMIN_OFFLINE,AWAITING_RECOVERY,REPLICATION_DELAY
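
To make the defaults above concrete, here is a minimal sketch of how a host state could be mapped to a Nagios exit code (0 = OK, 1 = WARNING, 2 = CRITICAL). This is not check_mmm's actual code: the function names are hypothetical, and treating every unlisted state (such as ONLINE) as OK is an assumption.

```shell
#!/bin/sh
# Hypothetical helper, not taken from check_mmm itself.
# The two lists are the default --critical and --warning values shown above.
CRITICAL_STATES="HARD_OFFLINE,REPLICATION_FAIL"
WARNING_STATES="ADMIN_OFFLINE,AWAITING_RECOVERY,REPLICATION_DELAY"

# in_list STATE LIST: succeed if STATE is one of the comma-separated items in LIST
in_list() {
    case ",$2," in
        *",$1,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# mmm_exit_code STATE: print 2 (CRITICAL), 1 (WARNING) or 0 (OK)
# Assumption: any state in neither list counts as OK.
mmm_exit_code() {
    if in_list "$1" "$CRITICAL_STATES"; then
        echo 2
    elif in_list "$1" "$WARNING_STATES"; then
        echo 1
    else
        echo 0
    fi
}

mmm_exit_code HARD_OFFLINE       # prints 2
mmm_exit_code REPLICATION_DELAY  # prints 1
mmm_exit_code ONLINE             # prints 0
```

Wrapping each list in commas before matching keeps a state name from matching as a substring of a longer one.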

Nagios MySQL Plug-Ins

March 9th, 2009 | 4 Comments | Posted in MySQL, MySQL Administration, Nagios

There currently exist many plugins for MySQL to use with Nagios. Many of them, however, are not version-independent, leaving organizations that use multiple versions of MySQL to either install multiple plugins or not monitor specific versions of MySQL. As such, I’ve compiled what I consider to be the most useful checks into a single plugin: check_mysql

Usage:
     check_mysql check_name [options]

     Options:
       --args|a     Optional arguments.  Comma-separated.  Check-specific.
       --critical|c The level at which a critical alarm is raised.  Check-specific.
       --database   The database to use (defaults to mysql)
       --help|?     Display this message and exit
       --hostname|H The target MySQL server host (defaults to localhost)
       --password|p The password of the MySQL user
       --port       The port MySQL is listening on (defaults to 3306)
       --user|u     The MySQL user used to connect
       --version|V  Display version information and exit
       --warning|w  The level at which a warning is raised.  Check-specific.

     Defaults are:

     ATTRIBUTE                  VALUE
     -------------------------- ------------------
     args                       No default value
     critical                   Check-specific
     database                   mysql
     help                       FALSE
     host                       localhost
     password                   No default value
     port                       3306
     timeout                    10 seconds
     user                       No default value
     verbose                    1 (out of 3)
     version                    FALSE
     warning                    Check-specific

Current Checks Supported:

* connect – Check to see whether or not one can connect to MySQL (USAGE)
* repl_io – Check to see whether or not the IO Replication thread is running (REPLICATION CLIENT)
* repl_sql – Check to see whether or not the SQL Replication thread is running (REPLICATION CLIENT)
* repl_sbm – Check how many seconds behind the master the slave is (REPLICATION CLIENT)
* mysql_query – Run a given query, test if it executes properly (SELECT)
* connections – Test if the percentage of used connections is over a given threshold (PROCESS)
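
As a rough illustration of how a threshold check such as connections might turn a measurement into a Nagios state, here is a minimal sketch. It is not the plugin's code: the function names are hypothetical, and the sample numbers stand in for values you would read from MySQL (for example Threads_connected and max_connections, which is an assumption about where the plugin gets them).

```shell
#!/bin/sh
# Hypothetical sketch, not taken from check_mysql.

# pct_used CURRENT MAX: integer percentage of connections in use
pct_used() {
    echo $(( $1 * 100 / $2 ))
}

# threshold_state VALUE WARN CRIT: print the Nagios state for VALUE.
# "Over a given threshold" is read here as strictly greater than.
threshold_state() {
    if [ "$1" -gt "$3" ]; then
        echo CRITICAL
    elif [ "$1" -gt "$2" ]; then
        echo WARNING
    else
        echo OK
    fi
}

# Example: 170 of 200 connections in use (85%), warn above 80, critical above 90
threshold_state "$(pct_used 170 200)" 80 90   # prints WARNING
```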

I am open to requests for additional checks etc.


Nagios3 on CentOS: Quick Install Script

January 3rd, 2009 | 10 Comments | Posted in Random Tech

As part of my job, I find myself doing Nagios installs on a somewhat regular basis. The following is a quick guide on installing Nagios 3 on CentOS, distilled from the official Nagios docs. It is meant to be copied and run as a shell script (you should only have to update the passwords):

#!/bin/sh

# Any Failing Command Will Cause The Script To Stop
set -e

# Treat Unset Variables As Errors
set -u

echo "***** Starting Nagios Quick-Install: " `date`
echo "***** Installing pre-requisites"
yum -y install httpd
yum -y install gcc
yum -y install glibc glibc-common
yum -y install gd gd-devel

echo "***** Setting up the environment"
useradd -m nagios
echo "INSERT_PASSWORD_HERE" |passwd --stdin nagios
groupadd nagcmd
usermod -a -G nagcmd nagios
usermod -a -G nagcmd apache

echo "***** Getting the Nagios Source and Plug-Ins"
cd /usr/local/src
wget http://prdownloads.sourceforge.net/sourceforge/nagios/nagios-3.2.0.tar.gz
wget http://prdownloads.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.14.tar.gz
tar xzf nagios-3.2.0.tar.gz
tar xzf nagios-plugins-1.4.14.tar.gz

echo "***** Installing Nagios"
cd /usr/local/src/nagios-3.2.0
./configure --with-command-group=nagcmd
make all
make install
make install-init
make install-config
make install-commandmode
make install-webconf

echo "***** Setting up htpasswd auth"
htpasswd -nb nagiosadmin INSERT_PASSWORD_HERE > /usr/local/nagios/etc/htpasswd.users
service httpd restart

echo "***** Setting up Nagios Plug-Ins"
cd /usr/local/src/nagios-plugins-1.4.14
./configure --with-nagios-user=nagios --with-nagios-group=nagios
make
make install

echo "***** Fixing SELinux"
chcon -R -t httpd_sys_content_t /usr/local/nagios/sbin/
chcon -R -t httpd_sys_content_t /usr/local/nagios/share/

echo "***** Starting Nagios"
chkconfig --add nagios
chkconfig nagios on
service nagios start

echo "***** Done: " `date`

Hope you find this useful!

* EDIT: 2009-09-07 – Updated link to get Nagios 3.2.0
* EDIT: 2009-10-14 – Updated link to get Nagios Plugins 1.4.14
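
One caveat: because the script runs with set -e, a second run stops at useradd or groupadd once the nagios user and nagcmd group already exist. If you expect to re-run it, a small guard like the following sketch could replace those calls. The helper names are hypothetical, and the group check assumes local accounts in /etc/group.

```shell
#!/bin/sh
# Hypothetical helpers for making the account setup safe to re-run.

# ensure_group NAME: create the group only if /etc/group does not list it
ensure_group() {
    grep -q "^$1:" /etc/group || groupadd "$1"
}

# ensure_user NAME: create the user (with a home directory) only if missing
ensure_user() {
    id -u "$1" >/dev/null 2>&1 || useradd -m "$1"
}

# In the install script these would stand in for the bare calls:
#   ensure_user nagios
#   ensure_group nagcmd
```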


Using Nagios as a MySQL Performance Profiler

June 17th, 2008 | 2 Comments | Posted in MySQL, MySQL Performance

Everybody knows that Nagios can be used as a service monitor to watch things like Load Averages, MySQL Replication Status, RAID Array States, etc… Fewer know that there are plug-ins to monitor MySQL Performance Status, such as check_mysql_perf. Fewer still utilize Nagios’ built-in triggering mechanism to execute an additional script on the event of a critical alert.

It is not uncommon to experience a load spike in the middle of the night, only to discover that it immediately cleared itself. Many times, retroactive log analysis will not reveal anything out of the ordinary. In order to get an immediate snapshot of what is going on at the time of a Nagios alert, simply call a script via the event handler similar to this one:

#!/usr/local/bin/bash

Usage() {
    echo 1>&2 "Usage: $0 -h <hostname> -u <username> -p <password> -s <state> -t <state_type>"
    exit 1
}

MYSQL="/usr/local/bin/mysql"
MAIL="/usr/bin/mail"
RECIPS="dba@domain.com"
PERL="/usr/bin/perl"
[ $# -ne 10 ] && Usage

while getopts "h:u:p:s:t:" option; do
    case $option in
        h ) HOSTNAME="$OPTARG";;
        u ) USERNAME="$OPTARG";;
        p ) PASSWORD="$OPTARG";;
        s ) STATE="$OPTARG";;
        t ) STATE_TYPE="$OPTARG";;
        * ) Usage;; # DEFAULT
    esac
done

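# Only act on a confirmed (HARD) CRITICAL state; do nothing otherwise
[ "$STATE" = "CRITICAL" ] && [ "$STATE_TYPE" = "HARD" ] || exit 0
# Mail the full processlist, turning literal \n sequences into real newlines
"$MYSQL" -h"$HOSTNAME" -u"$USERNAME" -p"$PASSWORD" --connect_timeout=10 -ss -e'SHOW FULL PROCESSLIST' | "$PERL" -pe 's/\\n/\n/g' | "$MAIL" -s"$HOSTNAME post-alert processlist" "$RECIPS"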
exit 0

This will email dba@domain.com with the output of SHOW FULL PROCESSLIST.

This type of profiling is especially useful because it can easily be turned on and off without affecting your production site (and it is very inexpensive to execute). Don’t forget to grant your Nagios user the PROCESS privilege!
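
To wire the script into Nagios, it has to be registered as a command that passes the state macros through. The sketch below prints one possible command definition. The command name, script path, and credentials are placeholders I made up, while $HOSTADDRESS$, $SERVICESTATE$ and $SERVICESTATETYPE$ are standard Nagios macros.

```shell
#!/bin/sh
# Hypothetical helper: print a Nagios command definition for the event handler.
# print_handler_config SCRIPT_PATH DB_USER DB_PASS
print_handler_config() {
    # \$ keeps the Nagios macros literal; $1..$3 are filled in by the shell
    cat <<EOF
define command {
    command_name    mysql_processlist_snapshot
    command_line    $1 -h \$HOSTADDRESS\$ -u $2 -p $3 -s \$SERVICESTATE\$ -t \$SERVICESTATETYPE\$
}
EOF
}

# Example (placeholder path and credentials):
#   print_handler_config /usr/local/nagios/libexec/eventhandlers/mysql_snapshot.sh nagios secret
```

The output can be appended to one of your object configuration files. The service you want profiled then needs an event_handler directive naming the command, and event handlers must be enabled in your Nagios configuration.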
