[Xymon] localmode, got over-size message, truncating

Christoph Zechner zechner at vrvis.at
Thu Mar 10 11:18:13 CET 2022


I solved it!

I had to add "MAXMSG_CLIENT=1024" to /etc/xymon/xymonclient.cfg and 
restart xymon-client; after that, all the errors were gone.
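
For anyone finding this thread later, here is a small sketch of why 1024 
is enough. It assumes the MAXMSG_* values are given in kB, which matches 
the log below (512 * 1024 = 524288, the "max" in the error). The byte 
counts are copied from that log; kb_to_bytes and fits are hypothetical 
helper names for illustration, not part of Xymon.

```shell
#!/bin/sh
# Sketch only: relate MAXMSG_CLIENT (assumed to be in kB) to the byte
# counts reported in the "Got over-size message" log line.

msg_bytes=528383      # message size from the log line in this thread
old_limit_kb=512      # default for MAXMSG_CLIENT mentioned in the thread
new_limit_kb=1024     # value set in /etc/xymon/xymonclient.cfg

# kb_to_bytes (hypothetical helper): convert a kB setting to bytes.
kb_to_bytes() {
    echo $(( $1 * 1024 ))
}

# fits (hypothetical helper): print "yes" if a message of $1 bytes
# fits within a limit of $2 kB, otherwise "no".
fits() {
    if [ "$1" -le "$(kb_to_bytes "$2")" ]; then
        echo yes
    else
        echo no
    fi
}

echo "old limit: $(kb_to_bytes "$old_limit_kb") bytes, fits: $(fits "$msg_bytes" "$old_limit_kb")"
echo "new limit: $(kb_to_bytes "$new_limit_kb") bytes, fits: $(fits "$msg_bytes" "$new_limit_kb")"
```

Under that assumption the message was only 4095 bytes over the old limit, 
so doubling the setting leaves plenty of headroom.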

Thanks again for your help!

Cheers
Christoph


On 09/03/2022 06:42, Christoph Zechner wrote:
> On 09/03/2022 00:04, Jeremy Laidman wrote:
>> On Tue, 8 Mar 2022 at 18:52, Christoph Zechner <zechner at vrvis.at> wrote:
>>
>>     It seems I celebrated prematurely, the errors are back in exactly the
>>     same way :-/
>>
>>     2022-03-08 08:47:19.321457 Got over-size message, truncating at 528383 bytes (max: 524288)
>>     2022-03-08 08:47:19.339786 Dropping (more) garbled data
>>
>>     I don't understand where this limit of 512 comes from; everything on
>>     the server checks out (2048 before, tried 4096 as well, no change).
>>
>>
>> I'm at a loss. If the xymond process is proven to have this set at 
>> 2048, then I see no reason why it would give that error message with 
>> that number.
>>
>> Unless it's referring to another message type and hence a different 
>> maximum setting? Perhaps take a look at xymond's environment again, 
>> but search for all MAXMSG_ variables. See which one is set to 512, and 
>> that might be the culprit. The defaults for these max values are all 
>> different, with only two of them defaulting to 512: MAXMSG_CLIENT, 
>> MAXMSG_CLICHG (reference: lib/xymond_buffer.c). But it's possible one 
>> of them has been set to 512.
> 
> Thanks, I tried that, but unfortunately, this did not help, since all 
> the values were set correctly, according to my config.
> 
>>
>> The only other thing I can think of is that you have two copies of 
>> xymond running, somehow with different values of MAXMSG_CLIENT. But I 
>> can't think how this could come about. And you've already killed off 
>> any rogue processes.
> 
> Right, that's not it either. :-/
> 
>>
>> Maybe run xymond in debug mode for one round of updates, until you get 
>> the "Got over-size message" and review the debug logs. This might 
>> provide enough additional detail to find out what's going on.
>>
>> Another approach to solve the problem (truncated client data message) 
>> is to modify the client script (eg xymonclient-linux.sh) to truncate 
>> the ps command output, so that the total message size is less, and 
>> hopefully fits within the max message size. This will mean that PROC 
>> checks might not work anymore (which is likely the case now). But the 
>> current state is that monitoring of the sections that come after [ps] 
>> is likely broken now. On Linux these are notably the [top] and [vmstat] 
>> sections of the client data message, that are used for the "cpu" 
>> status and several metrics for graphing. Maybe something like adding 
>> "head -1000" will cut it down to a reasonable size:
>>
>> echo "[ps]"
>> ps -Aww -o pid,ppid,user,start,state,pri,pcpu,time,pmem,rsz,vsz,cmd | head -1000
> 
> That's actually a great idea and I modified the [ports] section, because 
> I know this is the culprit (I'm running a proxy there, and all the active 
> client connections were too much for xymon to handle).
> 
> I'm not interested in client connections anyway, I just want to monitor 
> my running programs and ports on that server, so I replaced the original
> 
> netstat -antuW 2>/dev/null
> netstat -antuT 2>/dev/null
> 
> with
> 
> netstat -tulpenW 2>/dev/null
> 
> (adding your "| head -1000" suggestion did not work, because it cut off 
> the list before it could reach the IPv6 interfaces and thus the ports 
> check was always red).
> 
> Now xymon works again, although this is just a workaround: the 
> underlying question of where exactly my messages got truncated is still 
> unanswered, but I can live with this solution.
> 
> Anyway, I very much appreciate your time and efforts, thank you very much!
> 
> Cheers
> Christoph
> 
>>
>> Also, review the client data message before the [ps] section to see if 
>> there's actually something else pushing it over the limit, and [ps] 
>> just happens to be where the truncation happens.
>>
>> J
>>
> _______________________________________________
> Xymon mailing list
> Xymon at xymon.com
> http://lists.xymon.com/mailman/listinfo/xymon

