[Xymon] server-side message pre-processor?

Japheth Cleaver cleaver at terabithia.org
Tue Nov 3 19:22:48 CET 2015


On 11/3/2015 9:20 AM, John Thurston wrote:
> On 11/3/2015 6:11 AM, Henrik Størner wrote:
>> Den 03-11-2015 01:59, John Thurston skrev:
>>> Is there a mechanism through which I can dump certain host+test
>>> combinations to /dev/null ?
>>
>> Remove the host from your hosts.cfg, then Xymon will ignore updates from
>> it. Yes, it will still receive messages from it, but dropping them
>> immediately when the hostname is not found doesn't cost much processing.
>>
>> Too bad if you do want to monitor *some* elements, but dropping the host
>> entirely from Xymon at least gives you some bargaining power towards the
>> people who mis-manage their monitored server.
>
> Therein lies the rub. Xymon accepts all messages from a client, or it 
> accepts none.
>
> A host which has one business-critical test to report may also report 
> an infinite number of bits of garbage. All of that garbage fills my 
> logs and pushes everything else useful out of the "X events received 
> in the past Y minutes" portion of the non-green screen. I beat, 
> cajole, beg, but I can find no means by which to make an uncaring-host 
> owner clean up their act.

I guess one question is whether this noise is caused by poorly 
configured local settings, or by people inventing new tests and 
reporting them in when you don't want to hear about them.

Are you running in local config mode? If so, I might suggest using this 
as justification for migrating to central configuration. Put the 
thresholding control back into your hands unless the users are willing 
to accept responsibility for the types of things they're sending in.

> re: Spam mail alerts back to the offending host owner
> That will just add load to my alert-handler and mailer. The recipient 
> will create a mail-server-side rule to flush my noise to /dev/null. 
> This is a very ineffective whip unless the target email audience is 
> very large.
>
> It would be cool if there were a per-host "accept" tag. I could stick 
> it in a .default. line in hosts.cfg, "accept=disk,cpu,conn,http". Any 
> other test reported for the host would be dropped.

At first, I was thinking that this would be a great use for an extension 
to xymonproxy, allowing it to function as a full-fledged filter of sorts 
-- expose it on your normal submission port and let it forward on a 
filtered view to xymond. The problem is that the more inspection 
xymonproxy does, the slower it's going to run. And this becomes more 
pronounced with higher volumes and when dealing with extcombo or 
compressed messages where we'd have to unpack arbitrary messages first 
and reconstruct things before sending them on.


A global accept/deny list at the xymond level (passed as an --option on 
the command line) wouldn't be too difficult to implement, but dealing 
with host-level configurations means more text scanning for each message 
coming in. On the plus side, xymond has to do that scanning anyway, since 
it's the final destination for all message types. It's unfortunate that 
the message can't be blocked before it has already cost network, TCP, 
and CPU resources.
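
A minimal Python sketch of what such a global list could look like. The 
"--ignore-tests" option is hypothetical (xymond has no such flag today, 
and xymond itself is C), and the parsing assumes the usual 
"status HOST.TEST COLOR text" first line:

```python
import argparse


def parse_args(argv):
    parser = argparse.ArgumentParser()
    # Hypothetical option name, used here only for illustration.
    parser.add_argument("--ignore-tests", default="",
                        help="comma-separated test names to drop for all hosts")
    return parser.parse_args(argv)


def build_ignore_set(option_value):
    """Turn 'junk, noise' into a set that is cheap to consult per message."""
    return frozenset(n.strip() for n in option_value.split(",") if n.strip())


def should_drop(message, ignored):
    """True if this is a status message for a globally ignored test."""
    if not message.startswith("status"):
        return False
    fields = message.split(None, 2)
    if len(fields) < 2:
        return False
    # HOST.TEST: the test name follows the last dot.
    testname = fields[1].rpartition(".")[2]
    return testname in ignored
```

Because the set is built once at startup, the per-message work is one 
split and one set lookup, with no per-host configuration to consult.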

Is the problem more along the lines of "I don't want to receive test 
'xyz' on any host", or more "Here's a list of 38 different tests I want 
to reject on 18 / 400 servers"? If it's the latter, a hosts.cfg text 
value (checked before an internal testname record exists to which 
accept/deny values could be attached) might be the best option, 
performance concerns notwithstanding.
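
For the per-host case, a rough Python sketch of reading John's proposed 
"accept=" tag out of hosts.cfg. The tag does not exist in Xymon; the 
sketch assumes plain "IP hostname # tags" lines, ignores quoted tags and 
include directives, and treats a ".default." entry as the fallback for 
hosts listed after it. All function names are made up:

```python
import re

# Matches 'IP hostname' with an optional '# tag tag ...' part.
HOST_LINE = re.compile(r"^\s*(\d{1,3}(?:\.\d{1,3}){3})\s+(\S+)\s*(?:#\s*(.*))?$")


def load_accept_lists(hosts_cfg_text):
    """Map hostname -> frozenset of accepted tests, or None for 'accept all'."""
    accept = {}
    default = None
    for line in hosts_cfg_text.splitlines():
        m = HOST_LINE.match(line)
        if not m:
            continue  # comments, page/group directives, blank lines
        hostname = m.group(2)
        tags = (m.group(3) or "").split()
        allowed = None
        for tag in tags:
            if tag.startswith("accept="):  # hypothetical tag
                names = tag[len("accept="):].split(",")
                allowed = frozenset(n for n in names if n)
        if hostname == ".default.":
            default = allowed
        else:
            accept[hostname] = allowed if allowed is not None else default
    return accept


def is_accepted(accept, hostname, testname):
    """Hosts without an accept list keep today's accept-everything behaviour."""
    allowed = accept.get(hostname)
    return allowed is None or testname in allowed
```

The lookup runs for every incoming status message, so this is where the 
extra text scanning and the performance concern come from.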

> The underlying problem is Xymon assumes a friendly environment. Some 
> of my host-owners are not being friendly :(

Basically, yeah. There's a limit to how much can be done here without 
subverting the flexibility that's at the heart of how useful Xymon can 
be. That flexibility is why Xymon has a hosts.cfg, but no tests.cfg per se.

And even further-off features like SSL signing of messages (with a local 
cert controlled by appropriate Unix permissions) wouldn't alter the 
current assumption that someone with the authority to send one status 
message about a host has the authority to send any status message about 
that host.

As an initial step, I think adding hard-coded ignore-test records at 
xymond startup (by command option or by xymonserver.cfg) would probably 
be a pretty simple stopgap to create in the next rev.


HTH,
-jc


