Re: [hobbit] Yellow alerts, but no yellow config...



In <1110846746.16767.277.camel (at) localhost.localdomain> Daniel J McDonald <dan.mcdonald (at) austinenergy.com> writes:

>I'm finally trying to get alerts working with hobbit (RC5, no post
>patches, Linux 2.6.8.1-24mdksmp i686).  I'm getting paged on yellow.

First check whether those are alert messages or recovery messages. In
~hobbit/data/acks/notifications.log you should see a log entry for
each message you receive, like these:

Tue Mar 15 09:50:34 2005 backup-mx.post.tele.dk.smtp (195.41.53.68) henrik (at) hswn.dk 1110876634 725
Tue Mar 15 10:05:37 2005 backup-mx.post.tele.dk.smtp (195.41.53.68) henrik (at) hswn.dk 1110877537 725
Tue Mar 15 10:10:41 2005 backup-mx.post.tele.dk.smtp (195.41.53.68) henrik (at) hswn.dk 1110877840 725 4220

The first two are alerts; the last one is a recovery message. You can
tell by the extra number "4220" at the end of the line, which is how
long the service was down.
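
If you have many entries to go through, something like the following
rough Python sketch could sort them for you. It is not part of Hobbit;
the function name classify() is made up for illustration, and it
assumes only what the sample lines above show: an alert line ends in
two numeric fields, a recovery line ends in three, with the last one
being the downtime (the unit is assumed to be seconds).

```python
# Rough sketch: classify notifications.log lines as alert or recovery
# by counting the numeric fields at the end of each line.
# Assumption: alert lines end in 2 numbers, recovery lines in 3.

SAMPLE_LINES = [
    "Tue Mar 15 09:50:34 2005 backup-mx.post.tele.dk.smtp (195.41.53.68) henrik (at) hswn.dk 1110876634 725",
    "Tue Mar 15 10:05:37 2005 backup-mx.post.tele.dk.smtp (195.41.53.68) henrik (at) hswn.dk 1110877537 725",
    "Tue Mar 15 10:10:41 2005 backup-mx.post.tele.dk.smtp (195.41.53.68) henrik (at) hswn.dk 1110877840 725 4220",
]


def classify(line):
    """Return (kind, downtime); downtime is None unless kind == 'recovery'."""
    tokens = line.split()

    # Collect the run of all-digit tokens at the end of the line.
    trailing = []
    for tok in reversed(tokens):
        if not tok.isdigit():
            break
        trailing.append(tok)
    trailing.reverse()

    if len(trailing) == 3:
        # The extra last field is the downtime (assumed to be seconds).
        return ("recovery", int(trailing[2]))
    if len(trailing) == 2:
        return ("alert", None)
    return ("unknown", None)


if __name__ == "__main__":
    for entry in SAMPLE_LINES:
        kind, downtime = classify(entry)
        print(kind, downtime)
```

To run it against your own log, replace SAMPLE_LINES with the lines
read from ~hobbit/data/acks/notifications.log. If every message you
were paged for comes out as "recovery", you are most likely seeing the
bug described below.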

RC5 has a known bug in the alert module: it sends recovery messages
even if you never received an alert message.

>Here are the hobbitlaunch.cfg parameters:

Looks OK

>and the paging rules:
>HOST=ae-urps.aenetad.net
>        MAIL=dan.mcdonald (at) austinenergy.com,barry.allen (at) austinenergy.com REPEAT=24h RECOVERED
>
>HOST=%.*ups.*.austin-energy.net
>        MAIL=dan.mcdonald (at) austinenergy.com REPEAT=2h DURATION>10m SERVICE=freq COLOR="red" RECOVERED
>        MAIL=dan.mcdonald (at) austinenergy.com REPEAT=2h SERVICE=upsmin,upssec COLOR="red" RECOVERED
>
>HOST=%.*probe.*.austin-energy.net
>        MAIL=dan.mcdonald (at) austinenergy.com REPEAT=24h DURATION>20m COLOR="red" RECOVERED
>
>HOST=%.
>        MAIL=dan.mcdonald (at) austinenergy.com REPEAT=140h DURATION>30m RECOVERED COLOR="red" UNMATCED

I suspect your yellow alerts are recovery messages, in which case this
is the known bug in RC5.

BTW, the "UNMATCED" in your last rule must be a typo for "UNMATCHED" ...


Regards,
Henrik