[Xymon] duration of MSG red status
Jeremy Laidman
jlaidman at rebel-it.com.au
Tue Oct 28 01:05:03 CET 2014
On 28 October 2014 09:58, Bill Arlofski <waa-hobbitml at revpol.com> wrote:
> Other ideas? Can I somehow hammer this square peg into a round hole?
>
You can create a dynamic file based on the logfile, and alert on that. For
example, in client-local.cfg, something like this:
log:`LOG=/tmp/zlic.status; M=$(date +%M); [ $(expr $M % 10) -ge 5 ] && rm -f $LOG; grep "ArchivingAccountsLimit exceeded" /var/log/messages >> $LOG; [ -s $LOG ] && echo "$LOG"`:4096
I'm assuming that /var/log/messages is rotated daily. What happens here is
that zlic.status will get the log entries from your current messages file
(updated every 5 minutes) appended to it. If there are no log entries,
then the filename is not echoed, so Xymon ignores the file and no alert is
raised.
The trick here is that the zlic.status file is emptied only every second
run (every 10 minutes) prior to appending the log entries. By shrinking the
file size, logfetch thinks the file has been rotated, zeroes its status,
and starts looking at the file from the beginning.
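The every-second-run test can be tried on its own. Below is a minimal sketch, assuming a 5-minute client cycle; the function name should_reset is made up for illustration. It sticks with expr, as in the scriptlet, because bash arithmetic such as $((08 % 10)) rejects 08 and 09 as invalid octal numbers.

```shell
#!/bin/sh
# should_reset MINUTE: succeed when the status file should be removed.
# True for minutes ending in 5-9, which is every second run on a 5-minute cycle.
should_reset() {
    [ "$(expr "$1" % 10)" -ge 5 ]
}

# Simulate four consecutive runs, at minutes 03, 08, 13 and 18 past the hour.
for m in 03 08 13 18; do
    if should_reset "$m"; then
        echo "$m reset"
    else
        echo "$m keep"
    fi
done
```

The runs alternate between keeping and resetting the file, which is what makes logfetch see a "rotation" on every other pass.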
Note that if you get a log entry in your messages file just prior to
rotation, then you'll only get an alert between the time the message is
detected and the messages file is rotated, which could be only a few
minutes, or even not at all if the timing isn't favourable. In other
words, the alert persists only until the next rotation of messages, so it
covers entries from the last 0-24 hours. If you want to go for longer
than that, you could perhaps grep from the current and previous messages
file, so you're alerting on any messages in the last 24-48 hours.
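Extending the window to the previous file could look something like the sketch below. It assumes the rotated file is called /var/log/messages.1 and is not compressed; both are guesses about the local rotation setup, and recent_matches is a made-up helper name.

```shell
#!/bin/sh
# recent_matches PATTERN FILE...: print matching lines from the given files,
# in the order the files are listed (oldest first here).
# -h drops the filename prefix, -s hides errors about missing files.
recent_matches() {
    pattern=$1
    shift
    grep -hs -- "$pattern" "$@"
}

# Assumed file names; adjust to the local rotation scheme.
recent_matches "ArchivingAccountsLimit exceeded" \
    /var/log/messages.1 /var/log/messages || echo "no matching log entries"
```

In a scriptlet the same grep with two file arguments would simply replace the single-file grep.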
Another way to do this is to use a "file:" definition, similarly creating a
status file and then alarming on the file's size (non-zero indicating an
alertable log entry). For example:
file:`LOG=/tmp/zlic.status; grep "ArchivingAccountsLimit exceeded" /var/log/messages > $LOG; echo $LOG`
Then in analysis.cfg, create a matching entry and alert on size>0. A
down-side to this approach is that you get a particularly unhelpful message
along the lines of "FILE /tmp/zlic.status red size >0".
A third and similar way to do this is to create a file that exists only if
the licencing log is not detected. Like so:
file:`LOG=/tmp/zlic.OK; grep "ArchivingAccountsLimit exceeded" /var/log/messages >/dev/null && rm -f $LOG || touch $LOG; echo $LOG`
Then in analysis.cfg, create a matching entry and alert if the file does
not exist, which is what a FILE rule does by default.
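The marker-file logic can be tried outside Xymon first. This is only a sketch: update_marker is a hypothetical helper, and the paths in the last line are the ones from the example above.

```shell
#!/bin/sh
# update_marker LOGFILE MARKER: keep MARKER present only while LOGFILE
# contains no licensing error, then print the marker's path.
# Note that an unreadable LOGFILE is treated the same as "no error found".
update_marker() {
    if grep -q "ArchivingAccountsLimit exceeded" "$1" 2>/dev/null; then
        rm -f "$2"
    else
        touch "$2"
    fi
    echo "$2"
}

update_marker /var/log/messages /tmp/zlic.OK
```

Written as an if/else rather than a && b || c, the two branches cannot both run, which makes the behaviour a little easier to reason about when pasting it into a test shell.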
Yet another way to do this is to use a pseudo-file to generate a status
message. For example:
file:`COL=green; MSG="licencing OK"; LOGS=$(grep "ArchivingAccountsLimit exceeded" /var/log/messages); [ "$LOGS" ] && { COL=red; MSG="licencing error"; }; echo "status ${MACHINE}.zlic $COL $(date) $MSG" | $XYMON $XYMSRV @`
There is no output from this pseudo-file, so Xymon will not take any "file"
connotations from it and will simply ignore it, except for the side-effects
from the $XYMON command that's also run here. This is tantamount to having
a client-side ext script, and you may simply prefer to do that. But this
can be deployed centrally.
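For comparison, a client-side ext script doing the same job might look roughly like this. It is a sketch under the same assumptions as the scriptlet: XYMON, XYMSRV and MACHINE are expected to come from the Xymon client environment, and zlic is the column name used above. The zlic_status function name and the "unknown" fallback host are made up for illustration.

```shell
#!/bin/sh
# zlic_status LOGFILE: print a Xymon status message for the "zlic" column.
zlic_status() {
    col=green
    msg="licencing OK"
    if grep -q "ArchivingAccountsLimit exceeded" "$1" 2>/dev/null; then
        col=red
        msg="licencing error"
    fi
    # MACHINE is set by the Xymon client; "unknown" is only a dry-run fallback.
    echo "status ${MACHINE:-unknown}.zlic $col $(date) $msg"
}

if [ -n "$XYMON" ]; then
    # Inside the Xymon client environment: send the message to the server.
    zlic_status /var/log/messages | $XYMON $XYMSRV @
else
    # Outside it (for example when testing by hand): show what would be sent.
    zlic_status /var/log/messages
fi
```

Run by hand without the Xymon environment, it just prints the status line, which is a convenient way to check the colour and message before deploying it.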
A few notes:
1) None of these specific examples have been tested, and may contain syntax
errors, but scriptlets like these have been used on production systems.
2) I deliberately avoided using colons and backticks inside the scriptlets,
because they are interpreted by the logfetch binary and would break them.
3) These scriptlets take up to 15 minutes to start reporting after being
added to client-local.cfg. When I'm testing this sort of thing, I like
to bring up a xymoncmd shell, paste in the bits between the backticks,
and look for errors or unexpected output.
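As a quick pre-check for note 2 before pasting a scriptlet into client-local.cfg, something like the following could be used. The helper name scriptlet_ok is made up, and it only looks for the two characters mentioned above, nothing more.

```shell
#!/bin/sh
# scriptlet_ok TEXT: succeed if TEXT contains neither a colon nor a backtick.
scriptlet_ok() {
    case $1 in
        *:*|*'`'*) return 1 ;;
        *) return 0 ;;
    esac
}

# Single quotes keep $LOG literal so the text is checked as it would be pasted.
if scriptlet_ok 'LOG=/tmp/zlic.OK; touch $LOG; echo $LOG'; then
    echo "scriptlet looks safe"
else
    echo "scriptlet contains a colon or backtick"
fi
```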
J