[Xymon] server-side message pre-processor?
cleaver at terabithia.org
Tue Nov 3 19:22:48 CET 2015
On 11/3/2015 9:20 AM, John Thurston wrote:
> On 11/3/2015 6:11 AM, Henrik Størner wrote:
>> On 03-11-2015 01:59, John Thurston wrote:
>>> Is there a mechanism through which I can dump certain host+test
>>> combinations to /dev/null ?
>> Remove the host from your hosts.cfg, then Xymon will ignore updates from
>> it. Yes, it will still receive messages from it, but dropping them
>> immediately when the hostname is not found doesn't cost much processing.
>> Too bad if you do want to monitor *some* elements, but dropping the host
>> entirely from Xymon at least gives you some bargaining power towards the
>> people who mis-manage their monitored server.
> Therein lies the rub. Xymon accepts all messages from a client, or it
> accepts none.
> A host which has one business-critical test to report may also report
> an infinite number of bits of garbage. All of that garbage fills my
> logs and pushes everything else useful out of the "X events received
> in the past Y minutes" portion of the non-green screen. I beat,
> cajole, beg, but I can find no means by which to make an uncaring-host
> owner clean up their act.
I guess one question would be whether these are being caused by poorly
configured local settings, or by people coming up with new tests they're
reporting in that you don't want to hear about?
Are you running in local config mode? If so, I might suggest using this
as justification for migrating to central configuration. Put the
thresholding control back into your hands unless the users are willing
to accept responsibility for the types of things they're sending in.
> re: Spam mail alerts back to the offending host owner
> That will just add load to my alert-handler and mailer. The recipient
> will create a mail-server-side rule to flush my noise to /dev/null.
> This is a very ineffective whip unless the target email audience is
> very large.
> It would be cool if there were a per-host "accept" tag. I could stick
> it in a .default. line in hosts.cfg, "accept=disk,cpu,conn,http". Any
> other test reported for the host would be dropped.
At first, I was thinking that this would be a great use for an extension
to xymonproxy, allowing it to function as a full-fledged filter of sorts
-- expose it on your normal submission port and let it forward on a
filtered view to xymond. The problem is that the more inspection
xymonproxy does, the slower it's going to run. And this becomes more
pronounced with higher volumes and when dealing with extcombo or
compressed messages where we'd have to unpack arbitrary messages first
and reconstruct things before sending them on.
A whitelisted accept/deny at the xymond level (passed as an --option on
the command line) wouldn't be too difficult to implement, but dealing
with host level configurations means more text scanning for each message
coming in. On the plus side, xymond has to do that anyway since it's the
final destination for all message types, but it's unfortunate that the
message couldn't be blocked before then (network/tcp/cpu, etc).
Is the problem more along the lines of "I don't want to receive test
'xyz' on any host", or more "Here's a list of 38 different tests I want
to reject on 18 / 400 servers"? If it's the latter, a hosts.cfg text
value (checked before an internal testname record is generated, since
there is nothing yet to attach accept/deny values to) might be the best
option, performance concerns notwithstanding.
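
To make the hosts.cfg idea concrete, here is a rough sketch (in Python rather than xymond's C, purely for illustration) of what a per-host check could look like. The accept= tag is the hypothetical one John proposed above and does not exist in Xymon today; the function names are made up as well. It only looks at plain "status" messages and assumes the usual "status HOST.TEST COLOR ..." first line, with dots in the hostname sent as commas.

```python
def parse_accept_tags(hosts_cfg_text):
    """Map hostname -> set of accepted tests, read from a hypothetical accept= tag."""
    accept = {}
    for line in hosts_cfg_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Host entries look like: IP HOSTNAME # tag tag ...
        fields, _, tags = line.partition("#")
        parts = fields.split()
        if len(parts) < 2:
            continue
        hostname = parts[1]
        for tag in tags.split():
            if tag.startswith("accept="):
                accept[hostname] = set(tag[len("accept="):].split(","))
    return accept


def status_target(message):
    """Return (hostname, testname) for a status message, or None for anything else."""
    first_line = message.split("\n", 1)[0]
    words = first_line.split()
    if len(words) < 2 or not words[0].startswith("status"):
        return None
    # The test name follows the last dot; hostname dots arrive as commas.
    host, sep, test = words[1].rpartition(".")
    if not sep:
        return None
    return host.replace(",", "."), test


def should_accept(message, accept):
    """Drop a status only if its host has an accept list that omits the test."""
    target = status_target(message)
    if target is None:
        return True  # this sketch filters status messages only
    host, test = target
    allowed = accept.get(host)
    return allowed is None or test in allowed
```

Hosts without the tag keep today's accept-everything behaviour, so only the 18 problem servers would need touching. Combo, extcombo and compressed messages would have to be unpacked before a check like this could see the individual statuses, which is where the cost mentioned above comes in.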
> The underlying problem is Xymon assumes a friendly environment. Some
> of my host-owners are not being friendly :(
Basically, yeah. There's a limit to how much can be done here without
subverting the flexibility that's at the heart of how useful xymon can
be. As a result, Xymon has a hosts.cfg, but no tests.cfg per se.
And even further-off features like SSL signing of messages (with a local
cert controlled by appropriate unix privs) wouldn't alter the current
assumption that someone with the authority to send one status message
about a host has the authority to send any status message about that host.
As an initial step, I think adding hard-coded ignore-test records at
xymond startup (by command option or by xymonserver.cfg) would probably
be a pretty simple stopgap to create in the next rev.
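
Roughly what I have in mind for the stopgap, again sketched in Python for illustration only. The --ignore-tests option name is an assumption (nothing like it exists in xymond yet), and the helper names are invented. The list is read once at startup into a set, so the per-message cost is a single lookup on the test name.

```python
import argparse


def build_parser():
    """Command-line parser with a hypothetical --ignore-tests option."""
    parser = argparse.ArgumentParser(prog="xymond-sketch")
    parser.add_argument(
        "--ignore-tests",
        default="",
        help="comma-separated test names to drop for every host",
    )
    return parser


def load_ignored_tests(argv):
    """Parse the option once at startup and return an immutable set of names."""
    args = build_parser().parse_args(argv)
    return frozenset(name for name in args.ignore_tests.split(",") if name)


def is_ignored(testname, ignored):
    """Per-message check: a single set lookup, independent of the host."""
    return testname in ignored
```

For example, starting with --ignore-tests=xyz,foo would drop "xyz" and "foo" statuses from every host while leaving everything else alone. It does nothing for the per-host case, which is why it is only a first step.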