[Xymon] Alerting - I'm not doing it right...

Carl Inglis Carl.Inglis at rakon.com
Thu Dec 15 13:01:04 CET 2011


-----Original Message-----
> From: xymon-bounces at xymon.com [mailto:xymon-bounces at xymon.com] On
> Behalf Of henrik at hswn.dk
> Sent: 15 December 2011 11:36
> To: xymon at xymon.com
>
> On Thu, 15 Dec 2011 10:02:43 +0000, Carl Inglis <Carl.Inglis at rakon.com>
> wrote:
> > alerts.cfg
> >
> > $EMAIL_ALERT=carl.inglis at rakon.com
> > $LIN_WINDOWS_PROBLEMS=$EMAIL_ALERT
> >
> > HOST=%lin(.*) SERVICE=%win(.*)
> >         MAIL $LIN_WINDOWS_PROBLEMS REPEAT=24h DURATION>1d RECOVERED
> > STOP
> >
> > HOST=* EXPAGE=printers
> >         MAIL $EMAIL_ALERT REPEAT=1h RECOVERED UNMATCHED STOP
> >
> > When the host "lin-apps-01" has a yellow alert on its "winUpdates"
> > service, I expect it to shout about it once every 24h. It is,
> > however, shouting about it once every hour.
>
> There may be some confusion about "service" here.
>
> When you refer to "winUpdates" - is that a status-column in Xymon, or a
> Windows Service that you are monitoring with a client on the Windows
> machine? The latter would typically show up in a "svcs" (services)
> status column on Xymon.

It's a status column that's returned by a BBWin ext script; it goes yellow if there are pending Windows Updates on that server.

> The SERVICE=... setting in alerts.cfg refers to the status-column, not a
> Windows service. So to catch a "Windows updates" service that is not
> running, you would have 'SERVICE=svcs' in alerts.cfg.
>
> What the first part of your alerts.cfg says, is "if you have a host
> whose name contains 'lin', and that host has a status-column that
> contains 'win', then send an alert after 1 day, and repeat every 24
> hours".

Which is what I wanted it to do.

> The second part of your configuration says "Any status that has an
> error - except those on the 'printers' page, and those handled by other
> rules - trigger an alert that is repeated once an hour". Pretty broad
> definition, I think.

Indeed - I'm currently in development mode trying to finalise how we're going to do our alerting; the last line of the configuration was intended as a "you missed one" alert for me. There are a number of lines above the first line in my original email.

> Hope that removes a bit of confusion.

It does indeed, thank you.

It appears that removing the "DURATION>1d" option has stopped the second rule from firing - which would make sense since (as Johan suggested) the first rule is unmatched until the alert has a duration of more than one day.

Is that interpretation correct?
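
To make the interpretation above concrete, here is a toy Python model of the two rules. It is only a sketch of the behaviour as described in this thread, not Xymon's actual matching code: the Rule class, the prefix matching (a stand-in for the %lin(.*) and %win(.*) patterns) and the matching_rules function are all made up for illustration, and EXPAGE is ignored.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Rule:
    """Hypothetical, simplified stand-in for one alerts.cfg rule."""
    name: str
    host_prefix: str                          # crude substitute for HOST=%...
    service_prefix: str                       # crude substitute for SERVICE=%...
    repeat_hours: int                         # models REPEAT=
    min_duration_hours: Optional[int] = None  # models DURATION>...
    unmatched_only: bool = False              # models UNMATCHED
    stop: bool = True                         # models STOP


def matching_rules(rules: List[Rule], host: str, service: str,
                   age_hours: int) -> List[Tuple[str, int]]:
    """Return (rule name, repeat interval) for every rule that would alert."""
    fired: List[Tuple[str, int]] = []
    for rule in rules:
        if not host.startswith(rule.host_prefix):
            continue
        if not service.startswith(rule.service_prefix):
            continue
        # DURATION>N: the rule does not match until the event is older than N.
        if rule.min_duration_hours is not None and age_hours <= rule.min_duration_hours:
            continue
        # UNMATCHED: only applies when no earlier rule has matched.
        if rule.unmatched_only and fired:
            continue
        fired.append((rule.name, rule.repeat_hours))
        if rule.stop:
            break
    return fired


catch_all = Rule("catch-all", "", "", repeat_hours=1, unmatched_only=True)

# Original configuration: the first rule carries DURATION>1d.
with_duration = [
    Rule("lin-win", "lin", "win", repeat_hours=24, min_duration_hours=24),
    catch_all,
]

# After removing DURATION>1d from the first rule.
without_duration = [
    Rule("lin-win", "lin", "win", repeat_hours=24),
    catch_all,
]

if __name__ == "__main__":
    for label, rules in (("with DURATION>1d", with_duration),
                         ("without DURATION", without_duration)):
        for age in (2, 30):
            print(label, f"age={age}h",
                  matching_rules(rules, "lin-apps-01", "winUpdates", age))
```

In this model, a two-hour-old yellow on "winUpdates" falls through to the hourly catch-all while DURATION>1d is present, and is picked up by the 24h rule once the option is removed, which is the behaviour I'm seeing.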

Thanks,

Carl




