[hobbit] hobbit-alerts.cfg: behaviour of TIME and DURATION together

Henrik Størner henrik at hswn.dk
Tue Feb 10 12:31:00 CET 2009


In <1233173020 at mknews.sslug.dk> "SebA" <spa at syntec.co.uk> writes:

>Bizarrely and somewhat contradictory to the behaviour below is the behaviour
>of DURATION well inside of the times specified with the TIME rule.  Is
>DURATION not reset when the colour of the alert changes???  That seems to be
>the only explanation for what I'm seeing (though it is early days to be
>certain).  Or, to put it another way, is DURATION the non-green DURATION,
>rather than the duration of being in a certain colour?

You are correct - DURATION is the time the status has been in a 
"potentially alerting state", i.e. yellow, red or purple.
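
To illustrate, here is a minimal Python sketch of that calculation. It is
not Xymon's actual code: the function name, the (minute, color) history
format and the sample timeline are all assumptions made up for the example.
It only shows why a rule with COLOR=red and DURATION>15 can match after a
single minute of red when the status was yellow beforehand.

```python
# Colors treated as "potentially alerting", per the explanation above.
ALERTING = {"yellow", "red", "purple"}

def nongreen_duration(history, now):
    """Minutes since the current unbroken run of alerting colors began.

    history: hypothetical list of (minute, color) status changes, oldest first.
    """
    start = None
    for minute, color in history:
        if color in ALERTING:
            if start is None:
                start = minute   # first alerting color after a non-alerting one
        else:
            start = None         # a non-alerting color resets the clock
    return 0 if start is None else now - start

# Yellow at minute 10, red at minute 30; evaluated one minute after going red.
history = [(0, "green"), (10, "yellow"), (30, "red")]
print(nongreen_duration(history, now=31))  # 21
```

With this timeline the status has been red for 1 minute, but the duration
is 21 minutes, so both DURATION>2 and DURATION>15 are already satisfied.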


>The config I currently have is:
> 
>$pg-sebsms=me AT mysms2emailprovider.com TIME=W:0845:2355
> 
>HOST=DbR1 SERVICE=Special
>     MAIL me AT work.co.uk COLOR=red DURATION>2 REPEAT=30 RECOVERED
>     MAIL $pg-sebsms COLOR=red DURATION>15 REPEAT=300 RECOVERED
> 
>I was hoping (and expecting) the above rules to only alert after 2 minutes
>and 15 minutes respectively of being red, given that COLOR=red is part of the
>rule.  I do, however, acknowledge that there may be (rare) cases where you
>would want to include the yellow time in the DURATION.  In which case, we
>really need REDDURATION, YELLOWDURATION and PURPLEDURATION rules.  Or
>perhaps just a way of specifying how you want the DURATION to be calculated
>in that rule: DURATIONTYPE=<NONGREEN|LASTCHANGE> (that's either or).  Or
>even more powerfully: DURATIONCALC=color[,color] (adds up the duration of
>being in these colour states).  (However, this could become resource
>intensive if you specify DURATIONCALC=red,yellow,purple,green or something!
>On the other hand, one only needs to check back as far as DURATION, rather
>than calculate the total time in these colour states.)


I agree that the way it works currently is not entirely what you would
expect from the rules you have. It would probably be best for Xymon
to calculate the duration based on the COLOR settings defined for the
alert (so for your rules, the alerts would trigger 2 and 15 minutes,
respectively, after the status went red, and yellow time would be ignored).

The problem with that approach is that it breaks down when a status
wobbles between yellow and red, e.g. a disk that is filled to just around
the critical level: you could end up in a situation where you wouldn't
get any alerts because the status never stayed red long enough to exceed
the color-specific DURATION setting.
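
The wobble problem can be sketched with a small simulation. This is only
an illustration under made-up assumptions: a status that flips between red
and yellow every 10 minutes, a rule with COLOR=red and a 15-minute
threshold, and hypothetical helper names. It does not reflect how Xymon
is implemented.

```python
def current_color(history, now):
    """Color in effect at minute `now`, given (minute, color) changes."""
    color = None
    for minute, c in history:
        if minute <= now:
            color = c
    return color

def color_run_duration(history, now, colors):
    """Minutes since the current unbroken run of `colors` began, or None."""
    start = None
    for minute, color in history:
        if minute > now:
            break
        if color in colors:
            if start is None:
                start = minute
        else:
            start = None
    return None if start is None else now - start

def alert_minutes(history, duration_colors, threshold, end):
    """Minutes at which a COLOR=red rule with DURATION>threshold would match."""
    matches = []
    for now in range(end):
        if current_color(history, now) != "red":
            continue
        d = color_run_duration(history, now, duration_colors)
        if d is not None and d > threshold:
            matches.append(now)
    return matches

# Status toggles red/yellow every 10 minutes for two hours.
history = [(m, "red" if (m // 10) % 2 == 0 else "yellow")
           for m in range(0, 120, 10)]

red_only = alert_minutes(history, {"red"}, 15, 120)
nongreen = alert_minutes(history, {"yellow", "red", "purple"}, 15, 120)
print(red_only)      # []
print(nongreen[:3])  # [20, 21, 22]
```

Counting only red time, the rule never matches because no red period lasts
longer than 10 minutes. Counting all non-green time, it first matches at
minute 20.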


But it would probably make more sense than the current modus operandi. 
I'll see what I can do about that.


Regards,
Henrik



