
Re: [hobbit] DURATION rules for specific host alerts

To: hobbit (at) hswn.dk
Subject: Re: [hobbit] DURATION rules for specific host alerts
From: "Gary Baluha" <gumby3203 (at) gmail.com>
Date: Fri, 22 Jun 2007 13:36:47 -0400
References: <29f517690706220749u18a0abd5o8ea21d4b17d01e52 (at) mail.gmail.com> <467BE6E0.7070105 (at) weatherdata.com>

On 6/22/07, Daniel Bourque <dbourque (at) weatherdata.com> wrote:


Why would you not want the status to change? Such a history log is great
for troubleshooting.


I wouldn't want the status to change, because I'm essentially making it a
two-part threshold: one part based on the hard numeric value, and another
based on the length of time.

If you don't want to be notified about it, just use this in
hobbit-alerts.cfg:

Page=x
    IGNORE HOST=foo SERVICE=cpu COLOR=red DURATION<5m


Ahh, that's the sort of hobbit-alerts rule that would work for me, at least
until (if?) there is a way to do what I'm looking for in
hobbit-clients.cfg.

If you don't want it to change the status color on the parent pages, then
use NOPROPYELLOW:cpu in the bb-hosts file.

If you REALLY don't want it to change status, increase the LOAD numbers in
the hobbit-clients.cfg file.


The catch is that the load only matters if it is _sustained_ for more than
10 minutes or so.
If I set the red threshold to Y and the load momentarily spikes to Y+1, that
isn't a problem.  But if I raise the threshold to Y+2 and then get a
sustained load of Y+1, that is a problem, because I would never be alerted.

Essentially, I'm looking for a sort of time-based hysteretic monitoring.
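
To make the idea concrete, here is a minimal Python sketch of a
duration-gated threshold. It is not a Hobbit feature or API; the class name
SustainedThreshold, its update() method, and the limit and min_duration
parameters are all hypothetical, and the sample values are made up for
illustration. It assumes load samples arrive with a timestamp in seconds and
that any sample at or below the limit resets the timer.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SustainedThreshold:
    """Hypothetical helper: go red only after a sustained breach."""
    limit: float          # numeric threshold (the "Y" above)
    min_duration: float   # seconds the value must stay above the limit
    # Timestamp of the first sample in the current run above the limit.
    _above_since: Optional[float] = field(default=None, init=False)

    def update(self, value: float, now: float) -> str:
        if value <= self.limit:
            # Back under the limit: forget the current run entirely.
            self._above_since = None
            return "green"
        if self._above_since is None:
            # First sample above the limit starts the timer.
            self._above_since = now
        if now - self._above_since >= self.min_duration:
            return "red"
        # Above the limit, but not for long enough yet.
        return "green"


# Made-up samples: (seconds since start, load average).
check = SustainedThreshold(limit=30.0, min_duration=600.0)
samples = [(0, 31.0), (300, 12.0), (600, 31.0), (900, 32.0), (1200, 31.5)]
colors = [check.update(load, t) for t, load in samples]
print(colors)
```

With these samples the momentary spike at t=0 never turns red, while the
run that starts at t=600 turns red once it has lasted 600 seconds.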

-Dan


Gary Baluha wrote:

Is there a [non-messy] way to set a DURATION rule for a specific host
alert?  Basically, what I'm thinking of is something like this:

In hobbit-clients.cfg
HOST=myhost
    LOAD 20 30 DURATION>5m

The effect being, the status of the "myhost" cpu alert will only change to
yellow/red if the load is above the appropriate threshold for more than 5
minutes.

There are a few hosts that occasionally will spike above the cpu load
thresholds, but only for a few minutes (usually around 5 min at most), and
then recover on their own.  However, I don't want to raise the thresholds,
because a sustained load (more than 10 minutes) at this level _is_ actually
a critical event.  It just isn't critical when it is only a momentary spike.

My specific example is with cpu load, but it could be for other things
too, such as process counts, memory, or even in some situations, disk space.
