
RE: [hobbit] False alerts (was : Can't see my alert in the "info" column)



> >I think there are two false alerts:
> >- for each 'foo.cpu' alert I got paged twice, with the same ACK code.
> >- I shouldn't have been paged between 11h05 and 12h05, nor after 13h00,
> >  for 'foo.procs'
> 
> Could you try running "bbcmd hobbitd_alert --test foo cpu" ?

Of course:

$ $BBHOME/bin/bbcmd hobbitd_alert --test foo cpu
2005-02-15 14:59:22 Using default environment file ../etc/hobbitserver.cfg
Matching host:service:page 'foo:cpu:' against rule line 115:Matched
    *** Match with 'HOST=foo TIME=W:0900:1800' ***
Matching host:service:page 'foo:cpu:' against rule line 116:Matched
    *** Match with 'SCRIPT /tmp/alerte.sh SERVICE=* EXSERVICE=disk,mem,procs' ***
Script alert with command '/tmp/alerte.sh' and recipient SERVICE=*
Matching host:service:page 'foo:cpu:' against rule line 117:Failed (min. duration)
Matching host:service:page 'foo:cpu:' against rule line 118:Failed (color)
Matching host:service:page 'foo:cpu:' against rule line 119:Failed (time criteria)

Here are lines 115 to 119 of my $BBHOME/etc/hobbit-alerts.cfg:

115 HOST=foo TIME=W:0900:1800
116         SCRIPT /tmp/alerte.sh SERVICE=* EXSERVICE=disk,mem,procs
117         SCRIPT /tmp/alerte.sh SERVICE=disk DURATION>5m REPEAT=2h
118         SCRIPT /tmp/alerte.sh SERVICE=mem COLOR=yellow REPEAT=24h
119         SCRIPT /tmp/alerte.sh SERVICE=procs TIME=*:1145:1150,*:1205:1300 REPEAT=24h
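
To make explicit how I read the TIME criteria on lines 115 and 119, here is a minimal Python sketch. It is only my interpretation, not Hobbit's actual code: matches_time_spec is a hypothetical helper, and I am assuming that each window is days:start:end, that "*" means every day, "W" means Monday to Friday, digits mean Sunday=0 to Saturday=6, and that both ends of the window are inclusive.

```python
from datetime import datetime


def matches_time_spec(spec: str, when: datetime) -> bool:
    """Return True if `when` falls inside any window of a comma-separated spec.

    Hypothetical helper; the semantics are assumptions, not Hobbit's code.
    """
    # Python numbers days Monday=0..Sunday=6; the spec is assumed to use
    # Sunday=0..Saturday=6, so shift by one.
    dow = (when.weekday() + 1) % 7
    hhmm = when.strftime("%H%M")  # zero-padded, so string comparison works

    for window in spec.split(","):
        days, start, end = window.split(":")
        if days == "*":
            day_ok = True
        elif days == "W":
            day_ok = 1 <= dow <= 5
        else:
            day_ok = str(dow) in days
        # Assumption: both window boundaries are inclusive.
        if day_ok and start <= hhmm <= end:
            return True
    return False


procs_rule = "*:1145:1150,*:1205:1300"
print(matches_time_spec(procs_rule, datetime(2005, 2, 15, 11, 5)))   # outside both windows
print(matches_time_spec(procs_rule, datetime(2005, 2, 15, 12, 30)))  # inside the second window
```

Under this reading, a 'foo.procs' page at 11h05 or after 13h00 falls outside both windows of line 119, which is why I consider those pages false alerts.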

> Also, if you add the option "--cfid" to the hobbitd_alert
> command line in hobbitlaunch.cfg, it will include the
> line number of the hobbit-alerts.cfg file with each alert.
> That should make it easier to track down which rules trigger an alert.

Done.

> I just noticed this won't work for SCRIPT recipients, because it's put in
> the message subject which scripts ignore. So drop that.

Undone ;-)

Regards,

-- 

Frédéric Mangeant