[Xymon] suppress log contents from 'msgs' column ?

Jeremy Laidman jlaidman at rebel-it.com.au
Thu May 14 04:37:35 CEST 2015


On 14 May 2015 at 06:32, John Thurston <john.thurston at alaska.gov> wrote:

> To me, xymon/hobbit/BB are alerting tools. Their purpose is to tell me "A
> threshold you defined has been exceeded. You'd better go figure out if
> there is a problem brewing!" When Xymon has done this, its job is done. I
> don't expect it to do much more.
>

Personally, I find Xymon is much more than an alerting tool.  It's also
critical for forensics.  When a fault has been detected, the graphs and
snapshot reports are extremely valuable for working out what historical
factors may be relevant to that fault.

Two ways I use Xymon for forensics:

1) If an event has a history, there might be a pattern that sheds light on
the cause (eg disk space problems at the start of every month) or a
coincident event (eg packet loss concurrent with a spike in disk I/O).

2) If a threshold measure has a short-term spike or a long-term slow
increase, then identifying when the metric started its incline can help pin
down the change or event that caused it.
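
To illustrate point 2, here is a rough sketch of finding where a metric
started its incline.  It is only a heuristic I'm making up for
illustration, not anything Xymon provides: it assumes you have already
exported the samples as (timestamp, value) pairs, and the function name,
the baseline window and the 10% tolerance are all arbitrary choices.

```python
from statistics import median

def incline_start(samples, baseline_n=5, tolerance=0.10):
    """Return the timestamp from which the metric stays above baseline.

    samples    -- list of (timestamp, value) pairs, oldest first
    baseline_n -- how many leading samples define "normal" (assumed value)
    tolerance  -- fraction above baseline that counts as elevated (assumed)
    """
    values = [v for _, v in samples]
    if len(values) <= baseline_n:
        return None  # not enough history to establish a baseline
    limit = median(values[:baseline_n]) * (1 + tolerance)
    start = None
    for ts, value in samples:
        if value > limit:
            if start is None:
                start = ts   # candidate start of the incline
        else:
            start = None     # dropped back to normal; discard candidate
    return start
```

A short spike that falls back to normal is ignored; only the last
unbroken run above the limit counts.  The timestamp it returns is the
point at which to start looking for a change or event.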

"Go fix it" helps with the immediate problem and its purpose is tactical,
for the short term.  But looking to the past can help prevent recurrence in
the future.

In the specific case of a CPU load fault, it can be valuable to know what
processes are new - in other words, what wasn't running 5 minutes before
the event, that was running after the event.  In some cases a new process
lifetime can be gleaned from the STIME column in the output of "ps -ef".
In other cases, it might be a process that is run from cron or inetd, or in
a while loop, and doesn't have a very long lifetime.  Or there might be a
situation where you have a clean-up process that has crashed, and you might
want to know what was running that no longer is.  In reality, these are
somewhat contrived scenarios, and I have no concrete examples to prove that
it can happen.  But in your own words, it's "silly to think [we] can
predict all the information [we'll] need", and so in my opinion (and
experience) the more, the better.

If security is the problem, then secure the data.  Suppressing the data is
only one way to secure the data, and doing so can have down-sides.

In my deployment, I limit what unauthenticated users can reach through
Apache, so those who don't need to see my log files and process listings
don't get to see them, but those who might benefit can.

Cheers
Jeremy