[Xymon] Problem with TIME qualifier to PORT test

Jeremy Laidman jlaidman at rebel-it.com.au
Wed Jan 25 00:16:49 CET 2012


Oh wise people of the Xymon clan

I occasionally have logrotate spawn a post-rotate process that hangs
and needs to be manually killed.  This is easily detected because the
parent process also persists as "/bin/sh /etc/cron.daily/logrotate".
I want to detect when logrotate has been running for more than an
hour, and as it starts at 1am, I added the following line into my
hosts' analysis.cfg file:

        PROC "%^/bin/sh /etc/cron.daily/logrotate" 0 0 "TEXT=/etc/cron.daily/logrotate" TIME=*:0200:0100

My understanding is that this makes the rule apply only from 2am until
1am the following day, or in other words, forces "green" between 1am
and 2am.
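To make that reading concrete, here is a minimal sketch of how a window
whose start is later than its end could be evaluated.  It only models my
understanding of the timespec; it is not Xymon's code, and the function
name and the choice of an exclusive end time are assumptions.

```python
def rule_active(now, start, end):
    """Return True if the rule applies at time `now`.

    All arguments are HHMM integers (e.g. 130 means 01:30).
    Assumption: the end time is exclusive, and a window whose start is
    later than its end wraps around midnight.
    """
    if start <= end:
        # Ordinary window within a single day.
        return start <= now < end
    # Wrapping window: active from `start` to midnight, then until `end`.
    return now >= start or now < end


# With TIME=*:0200:0100 the rule would be skipped only from 01:00 to 01:59.
print(rule_active(130, 200, 100))   # 01:30, inside the exempt hour
print(rule_active(59, 200, 100))    # 00:59, rule applies
print(rule_active(200, 200, 100))   # 02:00, rule applies again
```

Under those assumptions the rule is inactive for exactly one hour a day,
which is what I was aiming for.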

I added this line yesterday.

Alas, 2 out of 6 hosts showed a red "procs" alert at around 10 seconds
after 1am (when logrotate runs), for 5 minutes.  So it failed for me.

Curiously, two servers that have an ongoing problem with a hung
logrotate process (which inspired the check I'm trying to implement)
showed green for the hour from 1am to 2am, then went back to red.  This
suggests that the time qualifier is being handled correctly for those
servers.

I can't figure out why it went red for the other servers that don't
have a hung logrotate process.  Any ideas?

Perhaps this is due to time lag between the client data delivery and
the analysis.  I wonder if I should change the timespec to
"TIME=*:0200:0055"?
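A rough model of that hypothesis, purely as a sketch: assume the rule is
evaluated against a clock reading that can differ by a few seconds from
the moment the process list was captured.  The function names, the
15-second offset and the date are made up for illustration, and whether
Xymon actually behaves this way is exactly what I don't know.

```python
from datetime import datetime, timedelta


def rule_active(t, start, end):
    """True if the rule applies at datetime `t`; start/end are HHMM ints.

    Assumption: exclusive end, and start > end wraps around midnight.
    """
    now = t.hour * 100 + t.minute
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def evaluated(sample_time, skew_seconds, start, end):
    """Evaluate the rule at the sample time shifted by a hypothetical skew."""
    return rule_active(sample_time + timedelta(seconds=skew_seconds), start, end)


# logrotate seen running about 10 seconds after 1am (the date is arbitrary).
sample = datetime(2012, 1, 24, 1, 0, 10)

print(evaluated(sample, 0, 200, 100))     # no skew: rule skipped
print(evaluated(sample, -15, 200, 100))   # evaluated as 00:59:55: rule applies
print(evaluated(sample, -15, 200, 55))    # 0055 end time absorbs the skew
```

If something like this is going on, moving the end time to 0055 would
give a five-minute margin before logrotate starts.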

Cheers
Jeremy
