
yellow vs. red alerts and DURATION tag--looking for workaround



Hi,

This issue has been brought up by other users, for example:  http://www.hswn.dk/hobbiton/2007/08/msg00097.html

I have a very similar issue. For all yellow alerts, we want to send email only. For red alerts, we want to page the primary on-call person first; if the test is still red after 20 minutes, we then want to page the secondary on-call person as well.
Example config:
        MAIL email          COLOR=yellow,purple
        MAIL page.primary   COLOR=red
        MAIL page.secondary COLOR=red DURATION>20m
This works for most tests. The disk and cpu tests, however, have usually been yellow for more than 20 minutes before they go red, so the DURATION condition is already satisfied at that point and the primary and secondary get paged simultaneously.
So, if a disk is gradually filling up and goes to the yellow state at midnight (and email is sent but nobody reads it at that hour), then goes to the red state at 3:30am, both the primary and secondary oncall people get paged at 3:30am instead of the primary getting paged at 3:30am and the secondary at 3:50am.
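To make the timing concrete, here is a small Python sketch of the two behaviours. It is only a model of what we observe, not Hobbit's actual code: the function name is made up, the calendar date is arbitrary, and the idea that DURATION is counted from the moment the test first left green is my assumption based on the pages we receive.

```python
from datetime import datetime, timedelta

def secondary_page_time(went_yellow, went_red, delay, count_from_red):
    """Return when the secondary would be paged under a DURATION>delay rule.

    Hypothetical model: count_from_red=False measures the duration from the
    first non-green status (what we appear to get); count_from_red=True
    measures it from the moment the test turned red (what we want).
    """
    start = went_red if count_from_red else went_yellow
    # The rule also requires COLOR=red, so it can never fire before went_red.
    return max(went_red, start + delay)

# Times from the disk example; the date itself is arbitrary.
went_yellow = datetime(2007, 10, 1, 0, 0)
went_red = datetime(2007, 10, 1, 3, 30)
delay = timedelta(minutes=20)

observed = secondary_page_time(went_yellow, went_red, delay, count_from_red=False)
desired = secondary_page_time(went_yellow, went_red, delay, count_from_red=True)

print("observed:", observed.strftime("%H:%M"))  # observed: 03:30
print("desired: ", desired.strftime("%H:%M"))   # desired:  03:50
```

With the duration counted from midnight, the 20 minutes have long since passed by 3:30am, so the secondary rule matches as soon as the test turns red.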
Can you think of any workaround to get the alerts to behave the way we want? (That is, the secondary gets paged only 20 minutes after a test has gone red, no matter how long the same test was yellow beforehand.) I tried this:
        MAIL page.secondary COLOR=red,!yellow DURATION>20m
in hopes that pages to the secondary wouldn't take into account time a test had been in the yellow state, but this didn't work.

       