[xymon] alerting with combo question

Henrik Størner henrik at hswn.dk
Tue Nov 30 13:16:49 CET 2010


In <alpine.LRH.2.00.1011301045100.29248 at pal34.desy.de> Martin Flemming <martin.flemming at desy.de> writes:

>I've tested it with the bbcombotest.cfg like

>AtlasLustre.lustre-atlas = (tcx040.lustre\-atlas + tcx060.lustre\-atlas + tcx080.lustre\-atlas + tcx120.lustre\-atlas )  >= 4

>This alarm works of course, but I only get this alert message

>Red Mon Nov 29 14:37:07 2010

>(tcx040.lustre\-atlas+tcx060.lustre\-atlas+tcx080.lustre\-atlas+tcx120.lustre\-atlas)>=4 = (1+1+1+1)>=4 = 0
>&green tcx040.lustre-atlas
>&green tcx060.lustre-atlas
>&green tcx080.lustre-atlas
>&red tcx120.lustre-atlas


>Without the additional information about all the user directories and their sizes,
>and that's logical of course .. but it doesn't solve my problem :-(


I'd use a script to handle the alerting in that case. You can grab the 
current status-data from Xymon using the bb 'hobbitdlog' command, so
you can include those data in your alert message. See 
http://www.xymon.com/xymon/help/xymon-alerts.html for details on alert
scripts.


Something like this - completely untested:


#!/bin/sh

# $BBALPHAMSG contains the alert message text. Save it to
# a file, then scan it for lines beginning with "&red" or
# "&yellow" to get the problem hosts. Then grab the log-status
# for these hosts and append it to the alert message. Finally,
# send the alert.

echo "$BBALPHAMSG" >/tmp/alert.txt
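# Scan the original message rather than the file, since the loop
# below appends to the file while it is still being read.
echo "$BBALPHAMSG" | egrep '^&red|^&yellow' | while read L
do
   LOGID=`echo "$L" | awk '{print $2}'`  # Get the host.status ID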
   # Append the problem details to the alert text
   echo "$LOGID details"           >>/tmp/alert.txt
   $BB $BBDISP "hobbitdlog $LOGID" >>/tmp/alert.txt
done
# Send out the alert
mail -s "Lustre filesystem $BBCOLORLEVEL alert" "$RCPT" </tmp/alert.txt
exit 0
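
To see what the loop above would pick out of a combo alert, here is a
small sketch run against the sample alert text quoted earlier. It only
does the parsing step, so it needs no Xymon installation. The function
name problem_ids is made up for illustration, and the sample message is
trimmed to the lines that matter.

```shell
#!/bin/sh

# Sample alert text, copied from the combo alert quoted earlier.
BBALPHAMSG='Red Mon Nov 29 14:37:07 2010

&green tcx040.lustre-atlas
&green tcx060.lustre-atlas
&green tcx080.lustre-atlas
&red tcx120.lustre-atlas'

# problem_ids (hypothetical helper): print the host.test ID from every
# line of the message that starts with "&red" or "&yellow".
problem_ids() {
    printf '%s\n' "$1" | awk '/^&(red|yellow) / { print $2 }'
}

problem_ids "$BBALPHAMSG"
```

With the sample text this prints tcx120.lustre-atlas, which is the ID
the alert script would then pass to 'hobbitdlog'.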


In hobbit-alerts.cfg, use

   HOST=AtlasLustre TEST=lustre-atlas
      SCRIPT /usr/local/bin/lustrealert.sh admin@foo.com


Regards,
Henrik



