[Xymon] alert question - duration?
Elizabeth Schwartz
betsy.schwartz at gmail.com
Tue Apr 5 04:06:14 CEST 2011
We had an alert that was yellow for several hours, then turned red.
It immediately paged *all* the way up the food chain. The rules
seem to be correct, see tests below; "alert1" through "alert4" are SMS aliases.
Does the length of time an alert is yellow count towards the duration when
it turns red? If so, can I change this, and/or is there a better
way?
In this case, it was a disk filling up... disks can often stay yellow
for several hours. Having a disk go from 94% full to 95% is something
we want to alert the tech on duty about, but not wake everyone up for.
Thanks much
Betsy
MAIL xymail REPEAT=1d RECOVERED # notify techops
MAIL ticket REPEAT=365d COLOR=yellow DURATION>20 # open ticket
MAIL alert1 REPEAT=10 COLOR=red,purple FORMAT=SMS # page onshift or oncall at start RED, repeat every 10 minutes
MAIL alert2 DURATION>20 REPEAT=10 COLOR=red,purple FORMAT=SMS # page secondary after 20 mins RED, repeat every 10 minutes
MAIL alert3 DURATION>40 REPEAT=10 COLOR=red,purple FORMAT=SMS # page tertiary after 40 mins RED, repeat every 10 minutes
MAIL alert4 DURATION>60 REPEAT=10 COLOR=red,purple FORMAT=SMS # page mgr after 60 mins RED, repeat every 10 minutes
--
(domain name removed)
[xymon at netmon2 etc]$ ../bin/xymond_alert --test mmf4 disk
--duration=5 |grep mail
00022750 2011-04-04 21:51:45 *** Match with 'MAIL xymail REPEAT=1d
RECOVERED' ***
00022750 2011-04-04 21:51:45 Mail alert with command
'/var/spool/mail/xymon "Xymon [12345] mmf4:disk CRITICAL (RED)"
xymail'
00022750 2011-04-04 21:51:45 Mail alert with command 'mail alert1'
[xymon at netmon2 etc]$ ../bin/xymond_alert --test mmf4 disk
--duration=15 |grep mail
00022752 2011-04-04 21:51:58 *** Match with 'MAIL xymail REPEAT=1d
RECOVERED' ***
00022752 2011-04-04 21:51:58 Mail alert with command
'/var/spool/mail/xymon "Xymon [12345] mmf4:disk CRITICAL (RED)"
xymail'
00022752 2011-04-04 21:51:58 Mail alert with command 'mail alert1'
[xymon at netmon2 etc]$ ../bin/xymond_alert --test mmf4 disk
--duration=25 |grep mail
00022754 2011-04-04 21:52:06 *** Match with 'MAIL xymail REPEAT=1d
RECOVERED' ***
00022754 2011-04-04 21:52:06 Mail alert with command
'/var/spool/mail/xymon "Xymon [12345] mmf4:disk CRITICAL (RED)"
xymail'
00022754 2011-04-04 21:52:06 Mail alert with command 'mail alert1'
00022754 2011-04-04 21:52:06 Mail alert with command 'mail alert2'
[xymon at netmon2 etc]$ ../bin/xymond_alert --test mmf4 disk
--duration=65 |grep mail
00022767 2011-04-04 21:52:37 *** Match with 'MAIL xymail REPEAT=1d
RECOVERED' ***
00022767 2011-04-04 21:52:37 Mail alert with command
'/var/spool/mail/xymon "Xymon [12345] mmf4:disk CRITICAL (RED)"
xymail'
00022767 2011-04-04 21:52:37 Mail alert with command 'mail alert1'
00022767 2011-04-04 21:52:37 Mail alert with command 'mail alert2'
00022767 2011-04-04 21:52:37 Mail alert with command 'mail alert3'
00022767 2011-04-04 21:52:37 Mail alert with command 'mail alert4'
[xymon at netmon2 etc]$
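
To make the question concrete, here is a rough Python sketch of how the rules above select recipients for a given color and duration. It is only a model of the COLOR and DURATION> conditions as written in the rules; REPEAT, RECOVERED and FORMAT are ignored, and the names Rule, RULES and recipients are made up for illustration, not anything from Xymon. It makes no claim about which clock Xymon actually uses for the duration. It just shows what each interpretation would page, using a hypothetical 180 minutes of yellow followed by 1 minute of red.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Rule:
    """One MAIL line, reduced to its COLOR and DURATION> conditions."""
    recipient: str
    colors: Optional[FrozenSet[str]] = None   # None: no COLOR restriction
    longer_than: Optional[int] = None         # None: no DURATION> condition (minutes)


PAGE = frozenset({"red", "purple"})

# Transcribed from the rules in the post; everything else is ignored.
RULES = [
    Rule("xymail"),
    Rule("ticket", frozenset({"yellow"}), 20),
    Rule("alert1", PAGE),
    Rule("alert2", PAGE, 20),
    Rule("alert3", PAGE, 40),
    Rule("alert4", PAGE, 60),
]


def recipients(color: str, duration: int) -> List[str]:
    """Return the recipients whose conditions hold for this color and duration."""
    matched = []
    for rule in RULES:
        if rule.colors is not None and color not in rule.colors:
            continue
        if rule.longer_than is not None and not duration > rule.longer_than:
            continue
        matched.append(rule.recipient)
    return matched


if __name__ == "__main__":
    # Same durations as the xymond_alert --test runs above.
    for minutes in (5, 15, 25, 65):
        print(minutes, recipients("red", minutes))

    # Hypothetical timeline: 180 minutes yellow, then 1 minute red.
    yellow_minutes, red_minutes = 180, 1
    # Assumption A: the duration clock started when the status first left green.
    print("clock from start of event:", recipients("red", yellow_minutes + red_minutes))
    # Assumption B: the duration clock restarted when the status turned red.
    print("clock from turning red:   ", recipients("red", red_minutes))
```

Under assumption A every DURATION> threshold is already exceeded at the moment the status turns red, which would match the paging we saw. Under assumption B only xymail and alert1 would be notified at first.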