[Xymon] "Disable until change"
Japheth Cleaver
cleaver at terabithia.org
Tue Nov 3 19:42:33 CET 2015
I'd agree that disable is intended more as a human override of the
alertability of a host+service combo. The acknowledge functionality is
more in line with what it seems you're looking for: "It's still yellow,
keep tracking it, but don't alert downstream unless something
explicitly asks for it."
If the issue is with the nongreen page, I believe there should be a way
to remove ack'd items from that page (but it might require running a
second instance of xymongen just to spit out that page, potentially with
a BOARDFILTER in there to limit it further).
"Disable until Change" would be possible, but we'd need to store the
actual underlying color to compare the incoming report to, since
disabling works by overriding the color that was sent and forcing it
blue. "Unack on Change" works precisely because we still have a
meaningful current color to compare an incoming message to.
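
To make the comparison concrete, here is a minimal sketch of the bookkeeping
a "Disable until Change" would need. It is not how xymond is implemented; the
ServiceStatus class and its field names are hypothetical and exist only to
illustrate the idea of keeping the underlying color next to the forced blue.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServiceStatus:
    """Hypothetical record for one host+service combo (not a Xymon structure)."""
    shown: str = "green"              # color displayed on the page
    underlying: Optional[str] = None  # real color at the time of the disable
    disabled: bool = False

    def disable_until_change(self) -> None:
        if self.disabled:
            return  # already disabled; keep the originally stored color
        # Remember the real color before it is overridden with blue.
        self.underlying = self.shown
        self.shown = "blue"
        self.disabled = True

    def report(self, color: str) -> None:
        # A report matching the stored color leaves the disable in place.
        if self.disabled and color == self.underlying:
            return
        # Any other color (better or worse) lifts the disable.
        self.disabled = False
        self.underlying = None
        self.shown = color


status = ServiceStatus(shown="yellow")
status.disable_until_change()
status.report("yellow")   # unchanged, stays blue
status.report("red")      # changed, disable is lifted and red is shown
```

Without the stored underlying color, the second report could only be compared
against blue, which is why a plain disable cannot tell yellow from red.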
-jc
On 11/2/2015 4:21 PM, Novosielski, Ryan wrote:
> I personally do not think using disable is a good idea for unplanned
> problems. For one, if you use the reporting features, you will be
> mixing planned and unplanned downtime together. Disable is really for
> times when you know exactly what is going on with the system, and
> alerting is not needed/someone is watching the system manually. That's
> my take on it anyway, and what I tell the people that work with me.
>
> ____ *Note: UMDNJ is now Rutgers-Biomedical and Health Sciences*
> || \\UTGERS |---------------------*O*---------------------
> ||_// Biomedical | Ryan Novosielski - Senior Technologist
> || \\ and Health | novosirj at rutgers.edu - 973/972.0922 (2x0922)
> || \\ Sciences | OIRT/High Perf & Res Comp - MSB C630, Newark
> `'
>
> On Nov 2, 2015, at 18:59, John Thurston <john.thurston at alaska.gov> wrote:
>
>> We often use "disable until ok", but it was brought to my attention that
>> it has burned us from time to time. For example:
>>
>> Host foo is yellow on disk. But that's ok. We're going to allocate some
>> new storage for it in the next service window. The test is marked
>> "disable until ok". But before the service window arrives, something
>> chews up a whole bunch of disk and the now-red test continues to be blue
>> because the test is not yet ok.
>>
>> We sometimes use "acknowledge" for this function, but the non-green
>> screen can get kind of cluttered this way.
>>
>> Does anyone have a good way to fake "disable while status remains
>> unchanged"?