Re: [hobbit] alerts still not alerting



On Sat, Mar 19, 2005 at 10:33:09AM -0600, Daniel J McDonald wrote:
> I'm still flummoxed by hobbit-alerts.  I'm certain I broke something,
> because I am not getting any alerts from the box.

It's probably a config error ... 

> The only logs in /var/log/hobbit/page.log are 
> 2005-03-11 07:49:30 Tried to down BOARDBUSY: Invalid argument
> 2005-03-14 17:24:21 Tried to down BOARDBUSY: Invalid argument

These are harmless, and often occur when Hobbit is shut down or
restarted.

> I see a couple of those in the hobbitlaunch.log file as well, I also see
> the following error:
> 2005-03-19 10:14:21 Task bbdisplay started with PID 7417
> 2005-03-19 10:14:21 Task bbretest started with PID 7418
> 2005-03-19 10:14:29 Our child has failed and will not talk to us
> 2005-03-19 10:14:36 Our child has failed and will not talk to us

That's a first, and you're right: the error message should be more
detailed. I've fixed that. It generally means that one of the hobbitd
helper tasks has stopped responding.

> Here is a sample host that is not paging.  The info page lists:
> Service Recipient 1st Delay Stop after Repeat Time of Day Colors 
> conn dan.mcdonald (at) austinenergy.com (R) 30m  - 5d  - red 
> telnet dan.mcdonald (at) austinenergy.com (R) 30m  - 5d  - red
> 
> Both telnet and conn have been down on this host for over two hours.
> 
> The salient rule is:
> HOST=%.
>         MAIL=dan.mcdonald (at) austinenergy.com REPEAT=140h DURATION>30m
> RECOVERED COLOR="red" UNMATCHED

Your "HOST=" is wrong: it will only match hostnames that are exactly
one character long (do you really have a host named "a"?). If you want
to match all hosts, use "HOST=%.*" or the simple form "HOST=*".
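
To see why "%." is too narrow, here is a rough Python sketch of the
matching described above. It is not Hobbit's code: host_matches is a
made-up helper, and it assumes that a leading "%" marks a regular
expression applied to the whole hostname, that "*" matches everything,
and that anything else is compared literally.

```python
import re

def host_matches(pattern: str, hostname: str) -> bool:
    """Hypothetical model of a HOST= value; not taken from Hobbit's source."""
    if pattern == "*":
        # Simple form: match every host.
        return True
    if pattern.startswith("%"):
        # Assumption: the text after "%" is a regex that must cover
        # the entire hostname.
        return re.fullmatch(pattern[1:], hostname) is not None
    # Assumption: a plain value is compared literally.
    return hostname == pattern

for pat in ("%.", "%.*", "*"):
    for host in ("a", "www.example.com"):
        print(pat, host, host_matches(pat, host))
```

Under those assumptions "%." accepts only the one-character name,
while "%.*" and "*" accept both names.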

So some other rule must be generating the info-column output you
have, and therefore even if your HOST entry were correct, the rule
would not trigger because of the UNMATCHED restriction.

Could you try running

   exec ~hobbit/server/bin/bbcmd
   hobbitd_alert --test HOSTNAME conn "" 120 red

That should tell you how the alert is handled, and who gets notified
using what rules.


Regards,
Henrik