For the purpose of this article, I am going to talk about handling events such as a clearing up swap.
First, let us look at some configuration of Nagios. We are going to define a command, then service acting on that command. Let us assume that the nagios install is in /usr/local/nagios.
Therefore, in /usr/local/nagios/ a few configuration files are key:
- /usr/local/nagios/etc/objects/commands.cfg - the command file where the checks are defined
- /usr/local/nagios/etc/hosts/*/hosts.cfg - the services file where the checks are defined for execution based on other directives in this file.
# 'check_local_swap' command definition
command_line $USER1$/check_swap -w $ARG1$ -c $ARG2$
This says that check_local_swap executes check_swap with a warning threshold of $ARG1 and a critical threshold or $ARG2
Next when defining a service for a host
use generic-service; Name of service template to use
host_name dbfacebook34b ; hostname
service_description SYS:Swap ; what shows up in alerts
check_period 24x7 ; threshold when to check (all the time)
max_check_attempts 4 ; threshold to check before marking state
event_handler handle-swap ; handle an event (another command)
normal_check_interval 5 ; in seconds
retry_check_interval 1 ; only try once before reporting the state
contact_groups itops ; contact group to send notifications to
notification_options w,u,c,r ; need to look this up for all defs
notification_interval 600 ; retry sending notifs every 8 mins
notification_period 24x7 ; keep sending them
check_command check_nrpe!check_local_swap!80%!55% ; execute the event handler and warn like hell
Lots of goodies as you can see. Let us look at the event handler
This means execute this script whenever any event for swap occurs (I decided to make this simple and not put a threshold on this).
What does handle_swap.pl do - well it’s a perl script that looks at free memory and if only a few 100K of swap is in use, swapoff -a; swapon -a;
In this case, it is a bit safe to do this. Why do this? Why not just turn of swap. I have talked in depth about this subject-but for a minor recap. Linux needs swap else, kswapd will freak out. Swap in DB's is bad so I clean it up automatically since O_DIRECT on my SAN is not an option.
Why not just run a cron job? Nagios keeps a log, I like to review what is happening from a central location, and nagios is freaking COOL.