
RE: [hobbit] only alert if X number of hosts are already in error



> Something like

>    HOST=%(www.*).foo.com TEST=http COLOR=red COUNT>=5
>        MAIL someone (at) foo.com

> The "COUNT>=5" would then cause this rule to trigger only if there
> were 5 or more hosts named www.*.foo.com, whose http tests are red.
> You could even combine this with other criteria, say have a threshold of
> 5 during the daytime, and 10 during off-hours.

> I can foresee a problem in handling recovery-notifications for this kind
> of alerts, but that's something I'll have to think about.

> Would that be useful ?

That would be extremely useful.  I was thinking about the recovery notifications as well, and my first thought was simply to notify on every recovery (if you've selected RECOVERED).  That way you would know if a single host kept going down/up/down/up.

This would work all right in my environment, since hosts that go into error for any length of time tend not to fix themselves anyway. :)
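
To make the idea concrete, here is a rough Python sketch of how I picture the threshold check plus the "notify on every recovery" behaviour. It is not Hobbit code: the function names, the status dictionary layout, and the Python-style regex are all made up for illustration, and COUNT itself is only the proposed keyword from the quoted message.

```python
import re

def failing_hosts(statuses, host_regex, test, color):
    """Return the set of hosts matching host_regex whose `test` status is `color`.

    `statuses` is a hypothetical dict mapping (hostname, testname) -> color.
    """
    rx = re.compile(host_regex)
    return {
        host
        for (host, t), c in statuses.items()
        if t == test and c == color and rx.fullmatch(host)
    }

def evaluate(previous, current, host_regex, test="http", color="red", threshold=5):
    """Return (alert, recovered) for one evaluation pass.

    alert     -- True if at least `threshold` matching hosts are in error now
                 (the proposed COUNT>=N behaviour).
    recovered -- sorted hosts that were in error in `previous` and are no
                 longer in error (or no longer reported) in `current`;
                 each one would get its own recovery notice.
    """
    before = failing_hosts(previous, host_regex, test, color)
    now = failing_hosts(current, host_regex, test, color)
    return len(now) >= threshold, sorted(before - now)

# Example: five web servers were red; www1 recovers while www6 goes red.
previous = {(f"www{i}.foo.com", "http"): "red" for i in range(1, 6)}
current = dict(previous)
current[("www1.foo.com", "http")] = "green"
current[("www6.foo.com", "http")] = "red"

alert, recovered = evaluate(previous, current, r"www.*\.foo\.com")
print(alert, recovered)  # True ['www1.foo.com']
```

Reporting recoveries per host, independent of the threshold, is what would show a single host flapping even while the group as a whole stays above or below the count.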

--
Bruce Z. Lysik  <blysik (at) shutterfly.com>

