
Proper alert escalations question



The goal here is for the secondary on-call person to get a restful night's sleep, assuming that the primary on-call person fixes the problem that caused the alert.

* The primary should be alerted immediately upon a problem
* The primary should get a RECOVERY page when the alert recovers
* The secondary should ONLY get an alert if something is red for 30+ minutes
* The secondary should get a RECOVERY page ONLY if they were alerted in the first place (i.e., for a 30+ minute event)
* If the primary ACKs an alert, the secondary should not be alerted unless the ACK expires and the service is still red
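
To make the intended behaviour concrete, here is a minimal Python sketch of the policy in the list above. It only models what I want to happen; it is not how Hobbit works internally. The class name `Escalation`, the `tick` and `ack` methods, and the idea of feeding in one status per minute are all hypothetical and chosen just for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple

SECONDARY_DELAY_MIN = 30  # secondary is paged only after 30+ minutes of red


@dataclass
class Escalation:
    """Hypothetical model of one service's alert state (not Hobbit code)."""
    red_since: Optional[int] = None   # minute the service turned red
    ack_until: Optional[int] = None   # minute the primary's ACK expires
    paged: Set[str] = field(default_factory=set)  # who got an alert this event

    def ack(self, now: int, minutes: int) -> None:
        """Primary acknowledges the alert for `minutes` minutes."""
        self.ack_until = now + minutes

    def tick(self, now: int, color: str) -> List[Tuple[str, str]]:
        """Return the (recipient, message) pages to send at minute `now`."""
        pages: List[Tuple[str, str]] = []
        if color == "red":
            if self.red_since is None:
                self.red_since = now
            # Primary is paged immediately, once per event.
            if "primary" not in self.paged:
                self.paged.add("primary")
                pages.append(("primary", "ALERT"))
            # Secondary is paged only after 30+ minutes and with no active ACK.
            acked = self.ack_until is not None and now < self.ack_until
            long_red = now - self.red_since >= SECONDARY_DELAY_MIN
            if long_red and not acked and "secondary" not in self.paged:
                self.paged.add("secondary")
                pages.append(("secondary", "ALERT"))
        else:
            # RECOVERY goes only to recipients who were alerted for this event.
            if self.red_since is not None:
                pages = [(who, "RECOVERY") for who in sorted(self.paged)]
            self.red_since = None
            self.ack_until = None
            self.paged = set()
        return pages
```

In this model, a red event that the primary fixes within 30 minutes produces pages for the primary only, and the secondary receives a RECOVERY page only when they were alerted for that same event.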


If I use a ruleset of:

HOST=www.foo.com SERVICE=http
MAIL primary_oncall (at) foo.com FORMAT=sms COLOR=red RECOVERED
MAIL secondary_oncall (at) foo.com FORMAT=sms COLOR=red DURATION>30 RECOVERED


I do not believe this will work, because the secondary will still get woken up by the RECOVERED message. Or is Hobbit smart enough to send RECOVERED messages only to MAIL recipients that have previously received an alert?