xymon_4.3.0-RC1: possible lost alerts
Dominique Frise
dominique.frise at unil.ch
Fri Feb 11 18:04:20 CET 2011
Hi,
I think I found a bug in xymond_alert.c.
Let's say there is a page message for hostA.serviceA, and this alert is
not processed immediately because of this part of the code:
    /*
     * When a burst of alerts happen, we get lots of alert messages
     * coming in quickly. So lets handle them in bunches and only
     * do the full alert handling once every 10 secs - that lets us
     * combine a bunch of alerts into one transmission process.
     */
    if (nowtimer < (lastxmit+10)) continue;
    lastxmit = nowtimer;
The main loop then waits for the next message from xymond (Want msg <num>,
startpos... etc).
Now, if the next message is a page recovery for the same hostA.serviceA,
the next pass over the active alerts (the for loop) will clean up
the alert for hostA.serviceA without ever sending an alert.
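
To make the sequence concrete, here is a minimal sketch of the behaviour I think I am seeing. It is a simplified model, not the real xymond_alert.c: the struct names, the run_loop() function and the "notified" flag are made up for illustration, and only the two throttle lines are taken from the source.

```c
/* Hypothetical, simplified model of the throttled main loop.
 * None of these types exist in xymond_alert.c. */
enum alert_state { A_PAGING, A_RECOVERED };

struct alert {
    enum alert_state state;
    int active;     /* alert is in the list of active alerts */
    int notified;   /* a page has already been sent for it */
};

struct msg {
    long when;        /* arrival time in seconds */
    int is_recovery;  /* 0 = page message, 1 = recovery message */
};

/* Feeds the messages for one host.service through the loop and
 * returns how many pages were actually sent. */
int run_loop(const struct msg *msgs, int n, long lastxmit)
{
    struct alert a = { A_PAGING, 0, 0 };
    int pages_sent = 0;
    int i;

    for (i = 0; i < n; i++) {
        long nowtimer = msgs[i].when;

        /* The incoming message always updates the alert state ... */
        if (msgs[i].is_recovery) {
            a.state = A_RECOVERED;
        } else {
            a.state = A_PAGING;
            a.active = 1;
        }

        /* ... but the full alert handling is throttled (from the source). */
        if (nowtimer < (lastxmit+10)) continue;
        lastxmit = nowtimer;

        /* Full handling of the active alert. */
        if (a.active) {
            if (a.state == A_PAGING && !a.notified) {
                pages_sent++;
                a.notified = 1;
            } else if (a.state == A_RECOVERED) {
                a.active = 0;   /* cleaned up, whether or not a page went out */
            }
        }
    }
    return pages_sent;
}
```

In this model, with lastxmit = 100, a page message at t=105 followed by a recovery at t=112 gives run_loop() == 0: the page falls inside the 10 second window, and by the time the full handling runs the alert is already recovered. The same two messages at t=111 and t=125 give one page sent.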
Dominique