Re: [hobbit] Alert Rules - DURATION not working
On Wed, Feb 02, 2005 at 08:56:22AM -0500, Tom Georgoulias wrote:
> HOST=$FOUND_SYS
> MAIL broken (at) nandomedia.com SERVICE=procs COLOR=red DURATION>5
> REPEAT=5
>
> After I add this rule, I restart hobbit. I read on the list that
> restarting isn't necessary, but it has been my experience that changes
> made to hobbit-alerts.cfg do not always get put into effect unless
> hobbit is restarted.
It shouldn't be needed, but it does no harm.
> 2005-02-02 08:11:12 criteriamatch foundry01.nandomedia.com:procs
> (NULL):(NULL):procs
> 2005-02-02 08:11:12 failed minduration 0<300
OK
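For reference, the "failed minduration 0<300" line is consistent with
DURATION>5 being read as minutes and compared in seconds against how
long the status has been red. A minimal Python sketch of that
comparison follows; the function name and arguments are hypothetical
and not taken from the Hobbit source, and the exact boundary
(>= versus >) is an assumption based on the "0<300" log text.

```python
def passes_min_duration(event_start, now, duration_minutes):
    """Return True once the event has lasted at least duration_minutes.

    event_start and now are epoch seconds (hypothetical arguments).
    """
    elapsed = now - event_start
    # The log reports a failure as "elapsed<minimum", so the check
    # passes when elapsed has reached the minimum.
    return elapsed >= duration_minutes * 60
```

With a freshly red status the elapsed time is 0, so the rule is
expected to fail on the first pass and match on a later one.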
> 2005-02-02 08:16:12 Got page message from foundry01.nandomedia.com:procs
> 2005-02-02 08:16:12 0 alerts to go
And this looks suspicious.
What's supposed to happen is that once the alert is first reported to
the hobbitd_alert module, the module keeps track of when the next
alert is due (the REPEAT interval comes into play here). If no alerts
are due, you get the "0 alerts to go" message.
So something messes up the timekeeping, and we never get around to
testing if the DURATION triggers after the first attempt.
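The intended flow described above could be sketched roughly as below.
This is an illustration only, not the hobbitd_alert code: the class,
function and field names are made up, and it assumes that a rule
failing the DURATION check should leave the "next due" time untouched
so the rule is tested again on the next page message.

```python
class AlertState:
    """Per-alert bookkeeping (hypothetical structure)."""

    def __init__(self, event_start):
        self.event_start = event_start  # when the status went red, epoch seconds
        self.next_due = event_start     # earliest time an alert may be sent


def handle_page(state, now, duration_minutes, repeat_minutes):
    """Decide what to do with one page message for this alert."""
    if now < state.next_due:
        # A previous alert was sent and REPEAT has not elapsed yet.
        return "0 alerts to go"
    if now - state.event_start < duration_minutes * 60:
        # Too early for DURATION; do NOT advance next_due here,
        # so the rule is re-tested on the next page message.
        return "failed minduration"
    # Alert goes out; schedule the next one REPEAT minutes from now.
    state.next_due = now + repeat_minutes * 60
    return "alert sent"
```

With DURATION>5 and REPEAT=5, a status that goes red at time 0 would
fail the duration check at 0, send an alert at 300 seconds, report
"0 alerts to go" until 600 seconds, and then alert again.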
[after looking over the code for 10 minutes]
I think I've got it, but there have been quite a few changes to various
bits, so I don't want to send one-line fixes now. I'll come up with a
proper full package, which will also include fixes for many of the
other bugs that have been reported for beta6.
Henrik