RE: [hobbit] False alerts
- To: hobbit (at) hswn.dk
- Subject: RE: [hobbit] False alerts
- From: Frédéric Mangeant <frederic.mangeant (at) steria.com>
- Date: Wed, 16 Feb 2005 15:28:11 +0100
- Organization: Steria
Hi Henrik
> I've tried, but I cannot make this happen on my own setup.
>
> Could you send me the script you use for alerting, and the
> ~hobbit/data/ack/notifications.log file ?
Well, I moved to another server, on which I did a clean install of Hobbit
4.0-rc2 + patches, and I can't seem to reproduce the problem there.
Anyway, here's my tiny paging script :
$ cat /tmp/alert.sh
#!/bin/sh
DATE=`date +%d/%m/%Y%t%H:%M:%S`
echo "$DATE $BBHOSTNAME.$BBSVCNAME = $BBCOLORLEVEL (ack : $ACKCODE, recovered : $RECOVERED)" >> /tmp/alert.txt
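To try the log format by hand, outside Hobbit, the variables have to be set manually. The following is only a sketch of the same line with fallbacks for unset variables; `format_alert` is a hypothetical helper name, and the fallback strings (`unknown`, `none`, `0`) are assumptions, not anything Hobbit defines.

```shell
#!/bin/sh
# Sketch: build the same log line as the paging script above, but with
# fallbacks so it can be run by hand when Hobbit has not set the variables.
# format_alert and the fallback strings are illustrative assumptions.
format_alert() {
    # $1: timestamp string to put at the start of the line
    printf '%s %s.%s = %s (ack : %s, recovered : %s)\n' \
        "$1" \
        "${BBHOSTNAME:-unknown}" "${BBSVCNAME:-unknown}" \
        "${BBCOLORLEVEL:-unknown}" "${ACKCODE:-none}" "${RECOVERED:-0}"
}
```

It could be called as `format_alert "$(date +%d/%m/%Y%t%H:%M:%S)" >> /tmp/alert.txt`.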
I did some more testing, and there seem to be two small problems:
1) Warning when the format of a script is missing
With this rule :
$ cat $BBHOME/etc/hobbit-alerts.cfg
HOST=fmangeant SERVICE=* EXSERVICE=procs,disk,mem,svcs REPEAT=24h TIME=W:0900:1800 SCRIPT /tmp/alert.sh FORMAT=TEXT
HOST=fmangeant SERVICE=disk DURATION>2m SCRIPT /tmp/alert.sh
I get a warning :
$ $BBHOME/bin/bbcmd hobbitd_alert --test fmangeant disk
2005-02-16 15:22:03 Using default environment file /BB/hobbit/server/etc/hobbitserver.cfg
2005-02-16 15:22:03 Ignoring SCRIPT with no recipient at line 2
Matching host:service:page 'fmangeant:disk:' against rule line 1:Failed (service excluded)
Matching host:service:page 'fmangeant:disk:' against rule line 2:Failed (min. duration)
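The warning appears for the SCRIPT entry that has nothing after the script path. A rough way to spot such entries before running the test could look like this; `find_bare_scripts` is a hypothetical helper, and it assumes each rule sits on a single line of the file with whitespace-separated words.

```shell
#!/bin/sh
# Sketch: print the line number of every SCRIPT entry that has a script
# path but no further word after it (the case that triggered the warning).
# Assumes one rule per line; find_bare_scripts is an illustrative name.
find_bare_scripts() {
    awk '
        /^[ \t]*#/ { next }              # skip comment lines
        {
            for (i = 1; i <= NF; i++)
                if ($i == "SCRIPT" && i + 1 >= NF)
                    print NR             # nothing follows the script path
        }
    ' "$1"
}
```

Run as `find_bare_scripts $BBHOME/etc/hobbit-alerts.cfg`, it would print `2` for the two-line rule file above.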
If I add the format to the script entry, like this, the warning goes away:
$ cat $BBHOME/etc/hobbit-alerts.cfg
HOST=fmangeant SERVICE=* EXSERVICE=procs,disk,mem,svcs REPEAT=24h TIME=W:0900:1800 SCRIPT /tmp/alert.sh FORMAT=TEXT
HOST=fmangeant SERVICE=disk DURATION>2m SCRIPT /tmp/alert.sh FORMAT=text
$ $BBHOME/bin/bbcmd hobbitd_alert --test fmangeant disk
2005-02-16 15:22:54 Using default environment file /BB/hobbit/server/etc/hobbitserver.cfg
Matching host:service:page 'fmangeant:disk:' against rule line 1:Failed (service excluded)
Matching host:service:page 'fmangeant:disk:' against rule line 2:Failed (min. duration)
2) Repeat interval not correctly taken into account
I tried to repeat an alert every 5 minutes :
$ cat $BBHOME/etc/hobbit-alerts.cfg
HOST=fmangeant SERVICE=* EXSERVICE=procs,disk,mem,svcs REPEAT=24h TIME=W:0900:1800 SCRIPT /tmp/alert.sh FORMAT=TEXT
HOST=fmangeant SERVICE=disk DURATION>2m SCRIPT /tmp/alert.sh FORMAT=TEXT
HOST=fmangeant SERVICE=procs REPEAT=5m SCRIPT /tmp/alert.sh FORMAT=TEXT
$ $BBHOME/bin/bbcmd hobbitd_alert --test fmangeant procs
2005-02-16 15:23:59 Using default environment file /BB/hobbit/server/etc/hobbitserver.cfg
Matching host:service:page 'fmangeant:procs:' against rule line 1:Failed (service excluded)
Matching host:service:page 'fmangeant:procs:' against rule line 2:Failed (min. duration)
Matching host:service:page 'fmangeant:procs:' against rule line 3:Matched
*** Match with 'HOST=fmangeant SERVICE=procs REPEAT=5m SCRIPT /tmp/alert.sh FORMAT=TEXT' ***
Script alert with command '/tmp/alert.sh' and recipient FORMAT=TEXT
But I got paged every 30 minutes :
$ cat /tmp/alert.txt
16/02/2005 14:43:27 fmangeant.procs = red (ack : 145155, recovered : 0)
16/02/2005 15:13:30 fmangeant.procs = red (ack : 145155, recovered : 0)
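The gap between pages can be read straight from the log. As a small sketch, the function below prints the number of seconds between consecutive entries; `alert_gaps` is a hypothetical name, and it assumes all entries fall on the same day (it only looks at the HH:MM:SS field).

```shell
#!/bin/sh
# Sketch: print the seconds elapsed between consecutive alert log entries.
# Assumes every entry is from the same day, so only the time field is used.
# alert_gaps is an illustrative helper name.
alert_gaps() {
    awk '{
        # field 1 is the date, field 2 the time (awk splits on spaces and tabs)
        split($2, t, ":")
        now = t[1] * 3600 + t[2] * 60 + t[3]
        if (NR > 1)
            print now - prev
        prev = now
    }' "$1"
}
```

For the two entries above, `alert_gaps /tmp/alert.txt` gives 1803 seconds, i.e. just over 30 minutes, where REPEAT=5m should mean 300 seconds.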
Is any REPEAT value allowed, including intervals as short as 5 minutes?
Thanks in advance.
Regards,
--
Frédéric Mangeant