[hobbit] only alert if X number of hosts are already in error
Daniel J McDonald
dan.mcdonald at austinenergy.com
Mon Jun 20 15:14:59 CEST 2005
On Fri, 2005-06-17 at 08:01 +0200, Henrik Stoerner wrote:
> Something like
>
> HOST=%(www.*).foo.com TEST=http COLOR=red COUNT>=5
> MAIL someone at foo.com
>
> The "COUNT>=5" would then cause this rule to trigger only if there
> were 5 or more hosts named www.*.foo.com, whose http tests are red.
> You could even combine this with other criteria, say have a threshold of
> 5 during the daytime, and 10 during off-hours.
>
> I can foresee a problem in handling recovery notifications for this kind
> of alert, but that's something I'll have to think about.
>
> Would that be useful?

The main place I would use it would be NTP alerts. If one router loses
NTP, I'm not terribly worried. If 10-20 of them all fail at once, then I
know something really bad is happening... Maybe both GPS clocks
lost sync and all 4 cesium backups failed, or NTP locked up on a core
router and I need to make fewer downstream nodes dependent on that one.

I would also consider using it for purple alerts. I don't want
individual purples for most of my stuff, but if there are a lot of them
(>100) then I know I killed MRTG and I should page on that.
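
To make the proposed semantics concrete, here is a rough Python sketch of how I read the COUNT>=N idea: count the hosts matching a pattern whose test has a given color, and alert only once that count reaches a threshold that can depend on the time of day. This is not Hobbit code. The function names, the layout of the status table, and the 08:00-18:00 "daytime" window are all assumptions made up for illustration.

```python
import re
from datetime import time


def count_matching(statuses, host_pattern, test, color):
    """Count hosts matching host_pattern whose `test` currently has `color`.

    `statuses` is a hypothetical dict mapping (hostname, testname) -> color.
    """
    rx = re.compile(host_pattern)
    return sum(
        1
        for (host, tst), col in statuses.items()
        if tst == test and col == color and rx.fullmatch(host)
    )


def threshold_for(now, day_threshold=5, night_threshold=10,
                  day_start=time(8, 0), day_end=time(18, 0)):
    """Pick the threshold by time of day (the daytime window is an assumption)."""
    return day_threshold if day_start <= now < day_end else night_threshold


def should_alert(statuses, host_pattern, test, color, now):
    """True only when enough matching hosts are in the given color."""
    matched = count_matching(statuses, host_pattern, test, color)
    return matched >= threshold_for(now)


# Example: five red www hosts, plus two entries that must not be counted.
statuses = {(f"www{i}.foo.com", "http"): "red" for i in range(5)}
statuses[("mail.foo.com", "http")] = "red"    # name does not match
statuses[("www9.foo.com", "http")] = "green"  # matches, but not red

pattern = r"www.*\.foo\.com"
print(should_alert(statuses, pattern, "http", "red", time(12, 0)))  # True
print(should_alert(statuses, pattern, "http", "red", time(23, 0)))  # False
```

The same check would cover the purple case with color set to "purple" and a threshold around 100. It does not touch the recovery-notification question Henrik raised; that would need some memory of whether the group alert had already fired.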
--
Daniel J McDonald, CCIE # 2495, CNX
Austin Energy
dan.mcdonald at austinenergy.com