[hobbit] hobbit-alerts.cfg - DURATION

cits.bogajewski at daimlerchrysler.com cits.bogajewski at daimlerchrysler.com
Fri Mar 16 11:33:59 CET 2007


Hey guys,

thanks to all for the input.

The initial question does not seem to be answered yet, but for the moment I am
using a workaround along the lines Gary suggested: REPEAT=1d works fine
(the default would be 30 min), with no DURATION statement at all. Still, it
would be interesting to know for certain whether DURATION is counted from each
status change or from the moment the test first enters any alert state. At the
moment I would guess that the DURATION count is not cleared the way the
REPEAT interval is:

2007-03-13 14:42:49 Severity increased, cleared repeat interval: myhost/disk yellow->red

So one last question: is this a bug or a feature?
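
A sketch of the workaround described above, reconstructed from the rules in the original post (quoted further down) with the DURATION clause dropped and REPEAT raised to 1d. The exact lines are an assumption, not a copy of the running configuration; $SSMSS and $ABSMS are the macros used in that post.

```
# Assumed reconstruction: no DURATION condition, repeat at most once a day.
HOST=myhost SERVICE=disk COLOR=yellow
SCRIPT $SSMSS $ABSMS REPEAT=1d RECOVERED

HOST=myhost SERVICE=disk COLOR=red
SCRIPT $SSMSS $ABSMS REPEAT=1d RECOVERED
```

Without a DURATION condition, the red rule no longer depends on how the elapsed time is counted, which is why this sidesteps the open question rather than answering it.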

Cheers,
Anatoli 

Manuel.Cortes at ORHS.ORG wrote on 14.03.2007 23:29:52:

> We use DURATION in our case as a way to escalate notifications to 
> another group of recipients 15 minutes after the initial event 
> occurred in hobbit. The initial alert goes to our onsite Operations 
> folks then after 15 minutes, a custom script fires off that informs 
> all in that particular recipient group that the event is still 
> ongoing and it is being escalated.
> 
>     so: qpage pages Operations as soon as the event occurs and they 
> monitor the event
>          DURATION>15: the second script fires off.
> 
> Working pretty well so far....
> 
> Could REPEAT be used for further escalation? Or will another 
> DURATION>30 suffice?
> 
> My 2 cents :)
> 
> Manny
> 
>    -----Original Message----- 
>    From: Gary Baluha [mailto:gumby3203 at gmail.com] 
>    Sent: Wed 3/14/2007 3:13 PM 
>    To: hobbit at hswn.dk 
>    Cc: 
>    Subject: Re: [hobbit] hobbit-alerts.cfg - DURATION
> 
> 
>    Now that I think of it, if the goal is just to have the alert 
> send an email once, you probably just want to remove the REPEAT= 
> part (not sure if there is a default for this), or optionally change
> it to something like REPEAT=1d.  In that case, the DURATION isn't 
> needed. 
> 
> 
>    On 3/14/07, Larry Barber <lebarber at gmail.com> wrote: 
> 
>       I think you have the inequality backwards on your DURATION 
> clause. As it is written, no alert will be issued for alerts that
> are older than 3 minutes; it should probably be DURATION>3, not DURATION<3.
> 
>       Thanks, 
>       Larry Barber 
> 
> 
> 
>       On 3/14/07, cits.bogajewski at daimlerchrysler.com wrote: 
> 
>          Hello,
> 
>          thanks for your reply. 
> 
>          gumby3203 at gmail.com wrote on 13.03.2007 16:53:25:
> 
>          > it should be counting the time from when the alert changes status 
>          > (so, green-to-yellow, yellow-to-red, etc) 
> 
>          thought so
> 
>          > Try using the bbcmd "hobbitd_alert" test below to see if it is
>          > working as intended.  It can be used as below:
>          > /var/hobbit/server/bin/bbcmd hobbitd_alert --test <hostname> <host test> 
> 
>          It works in principle as expected, although there is no way to
>          reproduce my scenario using the test utility.
> 
>          > Also, you might want to consider using DURATION<3m (specifying "m"
>          > for minutes).  I'm not sure what the default is, but I personally 
>          > prefer to be explicit; makes reading it a little easier as well.
> 
>          From the man pages: "The duration is specified as a number, _optionally_
>          followed by 'm' (minutes, default), 'h' (hours) or 'd' (days)." 
> 
>          The --debug output of hobbitd_alert looks like this:
> 
>          (initial alert yellow)
> 
>          2007-03-13 14:38:58 hobbitd_alert: Got message 1139
>          @@page#1139|1173793138.770212|xx.xx.xx.xx|myhost|disk|xx.xx.xx.xx|1173794938|yellow|green|1173793138|pct|643201||| 
>          2007-03-13 14:38:58 startpos 2590, fillpos 2590, endpos -1
>          2007-03-13 14:38:58 Got page message from myhost:disk
>          2007-03-13 14:38:58 Alert status changed from 0 to 1
>          2007-03-13 14:38:58 Found a first matching rule 
>          2007-03-13 14:38:58 No more secondary matching rule
>          2007-03-13 14:38:58 1 alerts to go
>          2007-03-13 14:38:58 Found a first matching rule
>          2007-03-13 14:38:58 send_alert myhost:disk state 0
>          2007-03-13 14:38:58 No more secondary matching rule 
>          2007-03-13 14:38:58 Want msg 1140, startpos 2590, fillpos 2590, endpos -1, usedbytes=0, bufleft=263649
>          2007-03-13 14:38:58 Found a first matching rule
>          2007-03-13 14:38:58   repeat myhost|disk|script|0123456789 at 0 
>          2007-03-13 14:38:58   Alert for myhost:disk to 0123456789
>          2007-03-13 14:38:58 Opening file /opt/hobbit/server/etc/bb-hosts
> 
>          (4 min later the red alert is raised)
> 
>          2007-03-13 14:42:49 hobbitd_alert: Got message 1223 
>          @@page#1223|1173793369.998387|xx.xx.xx.xx|myhost|disk|xx.xx.xx.xx|1173795169|red|yellow|1173793369|pct|643201|||
>          2007-03-13 14:42:49 startpos 47243, fillpos 47243, endpos -1
>          2007-03-13 14:42:49 Got page message from myhost:disk 
>          2007-03-13 14:42:49 Severity increased, cleared repeat interval: myhost/disk yellow->red
>          2007-03-13 14:42:49 Found no first matching rule
>          2007-03-13 14:42:49 Want msg 1224, startpos 47243, fillpos 47243, endpos -1, usedbytes=0, bufleft=218996
> 
>          So, hm. I am not sure I captured any lines of interest, but this
>          does not look very helpful.
> 
>          Cheers,
>          Anatoli
> 
> 
>          >
>          > Dear Hobbits,
>          >
>          > does the DURATION keyword in hobbit-alerts.cfg relate to the time
>          > a test has spent in one particular state (yellow or red), or more
>          > generally to the time since the non-green state first occurred? For
>          > example, I want to get exactly one notification at yellow and one at
>          > red, but the following configuration does not work. I get notified on
>          > the initial yellow alert, but not on the red one occurring 4 min later.
>          >
>          > HOST=myhost SERVICE=disk COLOR=yellow DURATION<3
>          > SCRIPT $SSMSS $ABSMS REPEAT=5 RECOVERED 
>          >
>          > HOST=myhost SERVICE=disk COLOR=red DURATION<3
>          > SCRIPT $SSMSS $ABSMS REPEAT=5 RECOVERED
>          >
>          > Any ideas? Thanks :-)
>          >
>          > Yours sincerely
>          >
>          > Anatoli Bogajewski
>          >
>          > To unsubscribe from the hobbit list, send an e-mail to
>          > hobbit-unsubscribe at hswn.dk 
>          >
> 
> 
> 
> 
