[Xymon] Comment on Flapping

henrik at hswn.dk henrik at hswn.dk
Thu Mar 31 15:29:14 CEST 2011


On Tue, 29 Mar 2011 22:54:25 -0400, Elizabeth Schwartz
<betsy.schwartz at gmail.com> wrote:
> First, let me say that this is very nifty.
> Flap detection makes folks look at things that they might have missed.

Glad you like it :-)

> It's driving the NOC folks **nuts** though. Acking the reds should
> stop them from paging, but the main page then stays red for a full
> half hour, even though the problem is completely fixed. IMHO it would
> be very useful to have a "release" or "ALL CLEAR" button of some sort
> for flapping situations that have been dealt with. The NOC folks hate
> red screens...

Well ... yes, I see your point but I am not sure I agree with it.

If your NOC folks are using the "critical view", then they can ack the
alert, and it's gone from their view. That is how I think it/they should
work :-)

I know a lot of sites use the "All non-green" view or even the full
overview pages for monitoring, and the ack won't change the color there. If
you must have a green display in that case, then you can disable the status
(make it "blue") for 30 minutes, and then it will return to the real status
after that half hour has passed. But of course, any errors during that
period will not show up until the disable-period expires.
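
As a rough sketch of the disable approach described above, assuming the
"disable HOSTNAME.TESTNAME DURATION text" message documented in xymon(1),
with the duration given in minutes. The helper names disable_msg and
disable_status, and the XYMON/XYMSRV variables, are made up for this
example; "server1" and "disk" are only placeholder host and test names.

```shell
#!/bin/sh
# Sketch only: XYMON (client binary) and XYMSRV (server address) are
# assumptions, override them to match the local installation.
XYMON="${XYMON:-xymon}"
XYMSRV="${XYMSRV:-127.0.0.1}"

# Build the text of a "disable" message.
# $1 = host, $2 = test, $3 = duration in minutes, $4 = reason text
disable_msg() {
    printf 'disable %s.%s %s %s' "$1" "$2" "$3" "$4"
}

# Send the disable message to the Xymon server.
disable_status() {
    "$XYMON" "$XYMSRV" "$(disable_msg "$@")"
}

# Example call (not executed when this file is sourced):
#   disable_status server1 disk 30 "Flapping handled by NOC"
```

The status goes blue for the given period, so the caveat above still
applies: real errors during those 30 minutes stay hidden.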

There may be a third possibility that does what you're asking for. I think
(haven't tested it) that the new "modify" command would override a flapping
status. If you have a "disk" status on the "server1" host, then a command
like this

   xymon 127.0.0.1 "modify server1.disk green manual Disk cleanup completed"

will override the normal status-color and force the status green with the
comment "Disk cleanup completed". The "manual" keyword is just a token to
identify this modification. However, a modification is only valid for 2
status-updates, so it won't handle the full 30-minute period. It wouldn't
be terribly difficult to modify xymond to allow modifiers to be valid for a
longer period of time.

This could easily be wrapped into the status display when a flapping
status is shown.
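
Until xymond supports longer-lived modifiers, one untested workaround
might be to re-send the modify message at intervals for as long as the
override is wanted. The sketch below assumes that each re-sent message
renews the modification, which has not been verified. The helper names
modify_msg and hold_modify, and the XYMON, XYMSRV and SLEEP variables,
are invented for the example.

```shell
#!/bin/sh
# Sketch only: XYMON, XYMSRV and SLEEP are assumptions, override as needed.
XYMON="${XYMON:-xymon}"
XYMSRV="${XYMSRV:-127.0.0.1}"
SLEEP="${SLEEP:-sleep}"

# Build the text of a "modify" message.
# $1 = host, $2 = test, $3 = color, $4 = source token, $5 = cause text
modify_msg() {
    printf 'modify %s.%s %s %s %s' "$1" "$2" "$3" "$4" "$5"
}

# Re-send the modify message $1 times, waiting $2 seconds between sends.
# Remaining arguments are passed on to modify_msg.
hold_modify() {
    count=$1
    interval=$2
    shift 2
    i=0
    while [ "$i" -lt "$count" ]; do
        "$XYMON" "$XYMSRV" "$(modify_msg "$@")"
        i=$((i + 1))
        # No need to wait after the final send.
        if [ "$i" -lt "$count" ]; then
            "$SLEEP" "$interval"
        fi
    done
    return 0
}

# Example call (not executed when this file is sourced):
# six sends, five minutes apart, covers roughly half an hour.
#   hold_modify 6 300 server1 disk green manual "Disk cleanup completed"
```

The count and interval in the example call are guesses based on a
5-minute status update cycle; a site with a different test interval
would need other values.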


Regards,
Henrik



