[Xymon] localmode, got over-size message, truncating
Christoph Zechner
zechner at vrvis.at
Fri Mar 11 06:25:14 CET 2022
On 10/03/2022 23:56, Jeremy Laidman wrote:
> Honestly, I can't work out how this happened. A review of the code - in
> as much as I can understand it, not being a C programmer - shows that
> there's only one place the MAXMSG_CLIENT parameter is used, and that's
> in xymond. In particular, it's not used in the xymon client (which is
> the only process that logs to xymonclient.log).
I also dug through the source code trying to find answers, and since
I'm using local mode on my clients (thus utilising the xymond_client
binary), I think it makes sense (more or less).
>
> I can understand how it could have come about that xymond was loaded
> using xymonclient.cfg for its environment, thus applying the smaller
> size limit to incoming messages. But if this were the case, I can't work
> out how you would have seen MAXMSG_CLIENT=2048 in the running xymond
> process's environment.
The MAXMSG_CLIENT=2048 values I reported were always server-side (thanks
to your env command line showing me the currently used options). I never
even saw that variable on my client, because it never got set. Only after
I manually added it to xymonclient.cfg did it start working as expected.
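For anyone who wants to repeat that check on the client side, here is a minimal sketch. It assumes Linux with /proc available; show_maxmsg_env is just a helper name I made up for illustration, and the commented pgrep call assumes the local-mode binary runs under the name xymond_client.

```shell
# Print every MAXMSG_* variable found in a NUL-separated environment dump,
# such as /proc/<pid>/environ on Linux.
show_maxmsg_env() {
    tr '\0' '\n' < "$1" | grep '^MAXMSG_' | sort
}

# Hypothetical usage (process name is an assumption, adjust to your setup):
# show_maxmsg_env "/proc/$(pgrep -o xymond_client)/environ"
```

If the function prints nothing, the process has no MAXMSG_ variables set at all, which is what I saw on my client before adding the setting to xymonclient.cfg.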
I think it qualifies as a bug, but xymon's local mode is somewhat
undocumented (the binary for it is missing from the Debian package as
well, for example...), and in my opinion this should be documented somewhere.
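On the earlier suggestion to check which part of the client data message pushes it over the limit: a rough sketch of how the per-section sizes could be counted. It assumes you have a saved copy of the client message in a file and that sections start with a line of the form "[name]"; section_sizes is a made-up helper name, not part of xymon.

```shell
# Print the byte count of each [section] in a saved client data message,
# largest first. Lines before the first section are counted as "(header)".
section_sizes() {
    LC_ALL=C awk '
        /^\[[^]]+\]$/ { name = $0 }                 # a new section starts here
        {
            key = (name == "" ? "(header)" : name)
            size[key] += length($0) + 1             # +1 for the newline
        }
        END { for (s in size) printf "%d %s\n", size[s], s }
    ' "$1" | sort -rn
}
```

The "max: 524288" in the log is 512 * 1024 bytes, so the sizes printed here have to add up to less than that unless the limit is raised.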
Christoph
>
> So, I'm glad you worked out a solution. But I don't think we quite
> understand the cause.
>
> On Thu, 10 Mar 2022 at 22:41, Jeremy Laidman <jeremy at laidman.org> wrote:
>
> Great work Christoph.
>
> Sorry, it appears that I led you down the wrong path, asserting that
> it was a server-only setting in xymond. It would appear to be a
> client-side setting. This seems to be undocumented in the man page
> for xymonclient.cfg.
>
> J
>
> On Thu, 10 Mar 2022 at 21:18, Christoph Zechner <zechner at vrvis.at> wrote:
>
> I solved it!
>
> I had to add "MAXMSG_CLIENT=1024" to /etc/xymon/xymonclient.cfg and
> restart xymon-client, and all the errors were gone.
>
> Thanks again for your help!
>
> Cheers
> Christoph
>
>
> On 09/03/2022 06:42, Christoph Zechner wrote:
> > On 09/03/2022 00:04, Jeremy Laidman wrote:
> >> On Tue, 8 Mar 2022 at 18:52, Christoph Zechner <zechner at vrvis.at> wrote:
> >>
> >> It seems I celebrated prematurely, the errors are back in exactly the
> >> same way :-/
> >>
> >> 2022-03-08 08:47:19.321457 Got over-size message, truncating at 528383 bytes (max: 524288)
> >> 2022-03-08 08:47:19.339786 Dropping (more) garbled data
> >>
> >> I don't understand where this limit of 512 comes from, everything on
> >> the server checks out (2048 before, tried 4096 as well, no change).
> >>
> >>
> >> I'm at a loss. If the xymond process is proven to have this set at
> >> 2048, then I see no reason why it would give that error message with
> >> that number.
> >>
> >> Unless it's referring to another message type and hence a different
> >> maximum setting? Perhaps take a look at xymond's environment again,
> >> but search for all MAXMSG_ variables. See which one is set to 512, and
> >> that might be the culprit. The defaults for these max values are all
> >> different, with only two of them defaulting to 512: MAXMSG_CLIENT,
> >> MAXMSG_CLICHG (reference: lib/xymond_buffer.c). But it's possible one
> >> of them has been set to 512.
> >
> > Thanks, I tried that, but unfortunately, this did not help, since all
> > the values were set correctly, according to my config.
> >
> >>
> >> The only other thing I can think of is that you have two copies of
> >> xymond running, somehow with different values of MAXMSG_CLIENT. But I
> >> can't think how this could come about. And you've already killed off
> >> any rogue processes.
> >
> > Right, that's not it either. :-/
> >
> >>
> >> Maybe run xymond in debug mode for one round of updates, until you get
> >> the "Got over-size message" and review the debug logs. This might
> >> provide enough additional detail to find out what's going on.
> >>
> >> Another approach to solve the problem (truncated client data message)
> >> is to modify the client script (eg xymonclient-linux.sh) to truncate
> >> the ps command output, so that the total message size is less, and
> >> hopefully fits within the max message size. This will mean that PROC
> >> checks might not work anymore (which is likely the case now). But the
> >> current state is that monitoring of the sections that come after [ps]
> >> is likely broken now. On Linux this is notably the [top] and [vmstat]
> >> sections of the client data message, which are used for the "cpu"
> >> status and several metrics for graphing. Maybe something like adding
> >> "head -1000" will cut it down to a reasonable size:
> >>
> >> echo "[ps]"
> >> ps -Aww -o pid,ppid,user,start,state,pri,pcpu,time,pmem,rsz,vsz,cmd | head -1000
> >
> > That's actually a great idea and I modified the [ports] section, because
> > I know this is the culprit (running a proxy there, and all the active
> > client connections were too much for xymon to handle).
> >
> > I'm not interested in client connections anyway, I just want to monitor
> > my running programs and ports on that server, so I replaced the original
> >
> > netstat -antuW 2>/dev/null
> > netstat -antuT 2>/dev/null
> >
> > with
> >
> > netstat -tulpenW 2>/dev/null
> >
> > (adding your "| head -1000" suggestion did not work, because it cut off
> > the list before it could reach the IPv6 interfaces and thus the ports
> > check was always red).
> >
> > Now xymon works again, although this is just a workaround, because the
> > underlying problem of where exactly my messages got truncated is still
> > to be found, but I can live with this solution.
> >
> > Anyway, I very much appreciate your time and efforts, thank you very much!
> >
> > Cheers
> > Christoph
> >
> >>
> >> Also, review the client data message before the [ps] section to see if
> >> there's actually something else pushing it over the limit, and [ps]
> >> just happens to be where the truncation happens.
> >>
> >> J
> >>
> > _______________________________________________
> > Xymon mailing list
> > Xymon at xymon.com <mailto:Xymon at xymon.com>
> > http://lists.xymon.com/mailman/listinfo/xymon
> <http://lists.xymon.com/mailman/listinfo/xymon>
>