Various Swatch Severity Levels (last updated 07/20/01 by Alek)
Severity=HIGH (1-3)
1 # machine/critical service DOWN (from ping,email,ldap,lpd,mountd,nfs,slapd,syntax,web,etc.)
1 *1 PowerChute initiated shutdown (via syslog if configured)
1 Sun Ultra internal overtemp (from syslog)
2 # gettemp reports HOT (from gettemp)
2 *2 steamd reports ERROR (from steamd/syslog if configured)
2 vxfs errors (from syslog)
3 *1 PowerChute detected Blackout (via syslog if configured)
3 Fibre Channel Problem (from syslog)
3 *3 Machine came back up (from /etc/rc script)
3 NAC came back up (from syslog - USUALLY generated)
3 NAC Disk Failure (from syslog)
3 # Problems with Tape-device (from syslog)
Severity=MEDIUM (4-6)
4 Disk/Tape read error/needs maintenance (from syslog)
5 # machine 5 min load factor > 20 (from pingem)
5 # gettemp reports WARM (from gettemp)
5 *1 PowerChute detected Bad Battery (via syslog if configured)
5 *2 steamd reports WARNING (via steamd/syslog if configured)
5 *4 NIS misconfiguration (/usr/local/share/bin/checkup.nis)
6 Power supply failures detected (from syslog)
6 swap space (or /tmp) exhausted (from syslog)
Severity=LOW (7-9)
7 NAC disk selection timeout (from syslog)
7 File System Full (from syslog)
8 Cleaning Tape issue (from syslog)
8 *1 PowerChute detected overload (via syslog if configured)
9 Various security events (from syslog)
9 svpconfig issue (from syslog & SVP wrapper)
9 *5 File System Almost Full (/usr/local/share/etc/check_disks)
# Requires pingem and/or gettemp to be monitoring the host. Pingem monitoring
is added on request to me. Note that service checks above and beyond just
pinging the host are available. gettemp should be checking all Enterprise
Servers that are capable of reporting temperature, but let me know if I
missed one.
* Requires a tweak to the local application. I've emailed about these in the
past: *1 is the appropriate PowerChute config and syslog setup, *2 is a few
lines added to the Steam agent, *3 is the reboot-notify script, *4 is run
daily via cron.daily on all hosts, and *5 has to be run manually and/or set
up as a cron entry.
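
For anyone post-processing these alerts in a script, the numeric levels can be
mapped back to their bands. The following is only a minimal Python sketch,
assuming the ranges given in the headings above (1-3 HIGH, 4-6 MEDIUM,
7-9 LOW). The function name severity_band is hypothetical and is not part of
Swatch or any of the tools listed.

```python
def severity_band(level: int) -> str:
    """Return the band name for a numeric severity level (1 through 9).

    Assumes the ranges from the table headings: 1-3 HIGH, 4-6 MEDIUM, 7-9 LOW.
    """
    if not 1 <= level <= 9:
        raise ValueError(f"severity level must be 1-9, got {level}")
    if level <= 3:
        return "HIGH"
    if level <= 6:
        return "MEDIUM"
    return "LOW"


if __name__ == "__main__":
    # Print the band for every defined level.
    for lvl in range(1, 10):
        print(lvl, severity_band(lvl))
```

A lower number means a more urgent event, so sorting alerts by level in
ascending order puts the most critical ones first.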