[Xymon] Alert REPEAT not working in 4.3.15.

Johan Sjöberg Johan.Sjoberg at deltamanagement.se
Mon Feb 10 10:47:08 CET 2014



> -----Original Message-----
> From: Xymon [mailto:xymon-bounces at xymon.com] On Behalf Of
> henrik at hswn.dk
> Sent: 10 February 2014 10:22
> To: xymon at xymon.com
> Subject: Re: [Xymon] Alert REPEAT not working in 4.3.15.
> 
> On 2014-02-10 8:18, Johan Sjöberg wrote:
> 
> > A while ago, we upgraded to 4.3.15. It seems like the alert repeat
> > setting isn't working, only the first alert is sent. We have an
> > on-call person that receives the first alert via SMS after 7 minutes.
> > It should then repeat every 15 minutes. The rest of the team gets
> > their first alert after 22 minutes.
> 
> [snip config]
> 
> > From the notification log:
> >
> > Mon Feb 10 05:43:15 2014 web01.apache2 (123.123.123.123)
> > alarms at domain.tld 1392007395 0
> >
> > Mon Feb 10 05:51:15 2014 web01.apache2 (123.123.123.123) 111111
> > 1392007875 0
> >
> > Mon Feb 10 06:05:17 2014 web01.apache2 (123.123.123.123) 222222
> > 1392008717 0
> >
> > Mon Feb 10 06:05:17 2014 web01.apache2 (123.123.123.123) 333333
> > 1392008717 0
> >
> > Mon Feb 10 06:05:17 2014 web01.apache2 (123.123.123.123) 444444
> > 1392008717 0
> >
> > Strangely though, it seems like it was working on Feb 5, which was
> > also after the upgrade. The only change done since then is the patch
> > for xymonnet, and I don't see how this could affect the alerts?
> 
> There are no changes to how alerts work in either 4.3.15 or 4.3.16.
> 
> I copied your configuration into a 4.3.16 system, and REPEAT is working fine
> here:
> 
> $ tail -f notifications.log
> Mon Feb 10 09:39:58 2014 webmail.hswn.dk.conn (0.0.0.0) root[3]
> 1392021598 500
> Mon Feb 10 09:46:16 2014 webmail.hswn.dk.conn (0.0.0.0) root-1[4]
> 1392021976 500
> Mon Feb 10 10:01:57 2014 webmail.hswn.dk.conn (0.0.0.0) root-1[4]
> 1392022917 500
> Mon Feb 10 10:01:57 2014 webmail.hswn.dk.conn (0.0.0.0) root-2[5]
> 1392022917 500
> Mon Feb 10 10:01:57 2014 webmail.hswn.dk.conn (0.0.0.0) root-3[6]
> 1392022917 500
> Mon Feb 10 10:01:57 2014 webmail.hswn.dk.conn (0.0.0.0) root-4[7]
> 1392022917 500
> Mon Feb 10 10:17:06 2014 webmail.hswn.dk.conn (0.0.0.0) root-1[4]
> 1392023826 500
> Mon Feb 10 10:17:06 2014 webmail.hswn.dk.conn (0.0.0.0) root-2[5]
> 1392023826 500
> Mon Feb 10 10:17:06 2014 webmail.hswn.dk.conn (0.0.0.0) root-3[6]
> 1392023826 500
> Mon Feb 10 10:17:06 2014 webmail.hswn.dk.conn (0.0.0.0) root-4[7]
> 1392023826 500
> 
> (my "root" recipient is your first recipient, the "root-X" are your "111111",
> "222222" etc. recipients).
> 
> You didn't list the history log for the web01.apache2 service. Are you sure
> that it was red all of the time? Any green status will reset the REPEAT
> interval, this could explain why you don't see it.
> 
> Running xymond_alert with the "--debug" option will log a lot of data about
> how alert messages are handled. It would be nice to have this if the problem
> re-occurs.
> 
> 
> Regards,
> Henrik
> 

If it hadn't been red the whole time, the recipients with the 22-minute delay wouldn't have received any alerts. It also happened for two different alerts during the night. I will check whether I can reproduce it by forcing a red alert. Should I add the debug flag to the xymond_alert command in tasks.cfg to enable it?
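
To see at a glance whether REPEAT fired for a given recipient, the gaps between consecutive entries in the notification log can be computed per host.service and recipient. This is only a minimal sketch: it assumes the field layout visible in the excerpts quoted above (five date tokens, host.service, the IP in parentheses, the recipient, an epoch timestamp, and one trailing number), and the function names parse_line and repeat_gaps are made up for illustration, not part of Xymon.

```python
from collections import defaultdict


def parse_line(line):
    """Return (host_service, recipient, epoch), or None if the line does not fit.

    Assumed layout (inferred from the log excerpts in this thread):
    <5 date tokens> <host.service> (<ip>) <recipient> <epoch> <trailing number>
    """
    fields = line.split()
    if len(fields) < 10 or not fields[-2].isdigit():
        return None
    host_service = fields[5]
    # The recipient is everything between the "(ip)" token and the epoch.
    recipient = " ".join(fields[7:-2])
    return host_service, recipient, int(fields[-2])


def repeat_gaps(lines):
    """Map (host_service, recipient) to the gaps in seconds between notifications."""
    times = defaultdict(list)
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            host_service, recipient, epoch = parsed
            times[(host_service, recipient)].append(epoch)
    gaps = {}
    for key, stamps in times.items():
        stamps.sort()
        gaps[key] = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
    return gaps


if __name__ == "__main__":
    # Three entries for one recipient, taken from the working 4.3.16 log above.
    sample = [
        "Mon Feb 10 09:46:16 2014 webmail.hswn.dk.conn (0.0.0.0) root-1[4] 1392021976 500",
        "Mon Feb 10 10:01:57 2014 webmail.hswn.dk.conn (0.0.0.0) root-1[4] 1392022917 500",
        "Mon Feb 10 10:17:06 2014 webmail.hswn.dk.conn (0.0.0.0) root-1[4] 1392023826 500",
    ]
    for key, seconds in repeat_gaps(sample).items():
        print(key, seconds)  # gaps of 941 and 909 seconds, roughly 15 minutes each
```

With a 15-minute REPEAT, a recipient that keeps being notified should show gaps of roughly 900 seconds. A recipient with only one entry gets an empty list, which is what my log shows for the on-call number.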

Regards,
Johan

