SCRIPT alert problem with SERVICE=

Ron Rosenkoetter rrosenkoetter at tradebotsystems.com
Thu Jul 24 21:07:59 CEST 2008


I'm having problems understanding how matching is done by hobbit for
sending alerts via the SCRIPT command. I have separated out the services
I want to monitor into two different SCRIPT events, because we want to
monitor disk and connection alerts 24/7, but cpu, mem, tbntp, and procs
only during the day.

Here's the relevant part of the hobbit-alerts.cfg file:

PAGE=KC/KC-Test
   SCRIPT /home/hobbit/server/ext/mail-format-alert.py SERVICE=disk,conn DURATION>1m REPEAT=20 COLOR=yellow,red RECOVERED TIME=W:0000:2359
   SCRIPT /home/hobbit/server/ext/mail-format-alert.py SERVICE=cpu,mem,tbntp,procs DURATION>1m REPEAT=20 COLOR=yellow,red RECOVERED TIME=W:0700:1700
   MAIL ron at whatever.com SERVICE=cpu,mem,tbntp,procs DURATION>1m REPEAT=20 COLOR=yellow,red RECOVERED TIME=W:0000:2359
   MAIL ron at whatever.com SERVICE=conn,disk DURATION>1m REPEAT=20 COLOR=yellow,red RECOVERED TIME=W:0700:1700


However, whenever hobbit generates ANY alert it matches on both SCRIPT
lines, and sends the alert twice. It doesn't appear to care about the
SERVICE parameter.

Here are the results I get when running a test alert:

../bin/bbcmd hobbitd_alert --test test2003 cpu 500
00016539 2008-07-24 13:52:16 send_alert test2003:cpu state Paging
00016539 2008-07-24 13:52:16 Matching host:service:page 'test2003:cpu:KC/KC-Test' against rule line 124
00016539 2008-07-24 13:52:16 *** Match with 'PAGE=KC/KC-Test' ***
00016539 2008-07-24 13:52:16 Matching host:service:page 'test2003:cpu:KC/KC-Test' against rule line 125
00016539 2008-07-24 13:52:16 *** Match with 'SCRIPT /home/hobbit/server/ext/mail-format-alert.py SERVICE=conn,disk DURATION>1m REPEAT=20 COLOR=yellow,red RECOVERED TIME=W:0000:2359' ***
00016539 2008-07-24 13:52:16 Script alert with command '/home/hobbit/server/ext/mail-format-alert.py' and recipient SERVICE=conn,disk
00016539 2008-07-24 13:52:16 Matching host:service:page 'test2003:cpu:KC/KC-Test' against rule line 126
00016539 2008-07-24 13:52:16 *** Match with 'SCRIPT /home/hobbit/server/ext/mail-format-alert.py SERVICE=cpu,mem,tbntp,procs DURATION>1m REPEAT=20 COLOR=yellow,red RECOVERED TIME=W:0700:1700' ***
00016539 2008-07-24 13:52:16 Script alert with command '/home/hobbit/server/ext/mail-format-alert.py' and recipient SERVICE=cpu,mem,tbntp,procs
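
The debug output for rule lines 125 and 126 reports the "recipient" as
SERVICE=conn,disk and SERVICE=cpu,mem,tbntp,procs. That suggests the first
word after the script path is being read as the recipient rather than as a
filter, so the SERVICE restriction is never applied. A sketch of the first
rule with an explicit recipient ahead of the criteria, assuming that is how
SCRIPT lines are parsed (the recipient "oncall" is a made-up placeholder,
not something from my setup):

```
# Assumption: SCRIPT takes "<script path> <recipient>" before any criteria.
# "oncall" is a hypothetical recipient ID used only for illustration.
PAGE=KC/KC-Test
   SCRIPT /home/hobbit/server/ext/mail-format-alert.py oncall SERVICE=disk,conn DURATION>1m REPEAT=20 COLOR=yellow,red RECOVERED TIME=W:0000:2359
```

If that reading is right, rerunning the same hobbitd_alert --test command
for test2003 cpu should show this rule failing with "service not in include
list", the same way the MAIL rule on line 128 does.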


Note that it works properly with MAIL: the service doesn't match rule line
128, so hobbit doesn't generate a MAIL event.


00018965 2008-07-24 13:52:16 Matching host:service:page 'test2003:cpu:KC/KC-Test' against rule line 127
00018965 2008-07-24 13:52:16 *** Match with 'MAIL ron at tradebotsystems.com SERVICE=cpu,mem,tbntp,procs DURATION>1m REPEAT=20 COLOR=yellow,red RECOVERED TIME=W:0700:1700' ***
00018965 2008-07-24 13:52:16 Mail alert with command 'mail -s "Hobbit [12345] test2003:cpu CRITICAL (RED)" ron at tradebotsystems.com'
00018965 2008-07-24 13:52:16 Matching host:service:page 'test2003:cpu:KC/KC-Test' against rule line 128
00018965 2008-07-24 13:52:16 Failed 'MAIL ron at tradebotsystems.com SERVICE=conn,disk DURATION>1m REPEAT=20 COLOR=yellow,red RECOVERED TIME=W:0000:2359' (service not in include list)


So what's going on with SERVICE and the SCRIPT event? Any help would be
appreciated.


