[hobbit] Alert Rules - DURATION not working

Henrik Stoerner henrik at hswn.dk
Wed Feb 2 22:10:11 CET 2005


On Wed, Feb 02, 2005 at 08:56:22AM -0500, Tom Georgoulias wrote:
> HOST=$FOUND_SYS
>         MAIL broken at nandomedia.com SERVICE=procs COLOR=red DURATION>5 REPEAT=5
> 
> After I add this rule, I restart hobbit.  I read on the list that 
> restarting isn't necessary, but it has been my experience that changes 
> made to hobbit-alerts.cfg do not always get put into effect unless 
> hobbit is restarted.

Restarting shouldn't be needed, but it does no harm.

> 2005-02-02 08:11:12 criteriamatch foundry01.nandomedia.com:procs (NULL):(NULL):procs
> 2005-02-02 08:11:12 failed minduration 0<300

OK

> 2005-02-02 08:16:12 Got page message from foundry01.nandomedia.com:procs
> 2005-02-02 08:16:12 0 alerts to go

And this looks suspicious.

What's supposed to happen is that once the alert is first reported to
the hobbitd_alert module, the module keeps track of when the next
alert is due (this is where the REPEAT interval comes into play).
If no alerts are due, you get the "0 alerts to go" message.

So something messes up the timekeeping, and after the first attempt
we never get around to testing whether the DURATION condition triggers.
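
The intended behaviour described above can be sketched roughly as
follows. This is only an illustration in Python, not the hobbitd_alert
code (which is written in C); the names Rule, ActiveAlert and
should_alert are hypothetical. It assumes, based on the "0<300" log
line, that DURATION>5 and REPEAT=5 are both handled as 300 seconds,
and that a duration equal to the minimum passes the check.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    min_duration: int  # seconds; assumed 300 for DURATION>5
    repeat: int        # seconds; assumed 300 for REPEAT=5


@dataclass
class ActiveAlert:
    event_start: int                  # when the status first went red
    next_alert: Optional[int] = None  # when the next alert is due; None = none sent yet


def should_alert(alert: ActiveAlert, rule: Rule, now: int) -> bool:
    """Decide whether a page message at time `now` should send an alert."""
    # An alert was already sent and the REPEAT interval has not elapsed:
    # nothing is due (the "0 alerts to go" case).
    if alert.next_alert is not None and now < alert.next_alert:
        return False
    # The DURATION test is repeated on every page message, not only the
    # first one (the "failed minduration" case).
    if now - alert.event_start < rule.min_duration:
        return False
    # Send the alert and schedule the next one using the REPEAT interval.
    alert.next_alert = now + rule.repeat
    return True


rule = Rule(min_duration=5 * 60, repeat=5 * 60)
alert = ActiveAlert(event_start=0)
# Page messages arriving 0, 300, 400 and 600 seconds after the event began.
results = [should_alert(alert, rule, now) for now in (0, 300, 400, 600)]
print(results)  # [False, True, False, True]
```

In this sketch a failed DURATION test leaves next_alert untouched, so
the following page message is tested again instead of being reported
as "0 alerts to go".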

[after looking over the code for 10 minutes]

I think I've got it, but there have been quite a few changes to various
bits, so I don't want to send one-line fixes now. I'll come up with a
proper full package, which will also include fixes for many of the
other bugs that have been reported for beta6.


Henrik


