[Xymon] Possible Memory Leak (?!) in Version Xymon 4.3.27-1.el6.terabithia

Peter Welter peter.welter at gmail.com
Wed Sep 28 10:27:55 CEST 2016


Hi Henrik, J.C.,

Thanks for your response.

It seems that valgrind is available for RHEL (see below), so now I want
to ask J.C.: what would you like me to do?

If I can keep using the prebuilt packages, and YES, that would be preferable,
can you supply me with a pre-compiled binary for xymond_rrd that is built
with all the options Henrik mentioned, so that I can replace the currently
installed binary with it?

Or should I build a package myself to debug this issue?

Regards, Peter


[root@uhu-a xymon]# yum search valgrind
Loaded plugins: product-id, search-disabled-repos, security, subscription-manager
============================== N/S Matched: valgrind ==============================
devtoolset-1.1-valgrind-devel.i686 : Development files for valgrind
devtoolset-1.1-valgrind-devel.x86_64 : Development files for valgrind
devtoolset-1.1-valgrind-openmpi.i686 : OpenMPI support for valgrind
devtoolset-1.1-valgrind-openmpi.x86_64 : OpenMPI support for valgrind
devtoolset-2-eclipse-valgrind.noarch : Valgrind Tools Integration for Eclipse
devtoolset-2-valgrind-devel.i686 : Development files for valgrind
devtoolset-2-valgrind-devel.x86_64 : Development files for valgrind
devtoolset-2-valgrind-openmpi.i686 : OpenMPI support for valgrind
devtoolset-2-valgrind-openmpi.x86_64 : OpenMPI support for valgrind
eclipse-valgrind.x86_64 : Valgrind Tools Integration for Eclipse
perl-Test-Valgrind.noarch : Generate suppressions, analyze and test any command with valgrind
valgrind-devel.i686 : Development files for valgrind
valgrind-devel.x86_64 : Development files for valgrind
valgrind-openmpi.x86_64 : OpenMPI support for valgrind
devtoolset-1.1-valgrind.i686 : Tool for finding memory management bugs in programs
devtoolset-1.1-valgrind.x86_64 : Tool for finding memory management bugs in programs
devtoolset-2-valgrind.i686 : Tool for finding memory management bugs in programs
devtoolset-2-valgrind.x86_64 : Tool for finding memory management bugs in programs
valgrind.i686 : Tool for finding memory management bugs in programs
valgrind.x86_64 : Tool for finding memory management bugs in programs
valkyrie.x86_64 : Graphical User Interface for Valgrind Suite

  Name and summary matches only, use "search all" for everything.
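
Once xymond_rrd has run under valgrind with --log-file=/tmp/valgrind-rrd.%p,
as Henrik suggests in the quoted message below, the per-PID logs could be
summarised along these lines. This is only a rough sketch: the function
names summarize_leaks and summarize_logs are made up for illustration, and it
assumes the usual Memcheck "LEAK SUMMARY" lines ("definitely lost: N bytes in
M blocks"). Also note that valgrind only follows child processes when it is
started with --trace-children=yes, so it is worth checking that a log really
exists for the xymond_rrd PID.

```python
import glob
import re

# Matches Memcheck leak-summary lines such as:
#   ==12345==    definitely lost: 1,024 bytes in 2 blocks
# It deliberately does not match the per-record lines
# ("... are definitely lost in loss record 1 of 5").
LEAK_RE = re.compile(
    r"==\d+==\s+(definitely|indirectly|possibly) lost: "
    r"([\d,]+) bytes in ([\d,]+) blocks"
)


def summarize_leaks(text):
    """Return {kind: (bytes, blocks)} parsed from one valgrind log."""
    summary = {}
    for kind, nbytes, nblocks in LEAK_RE.findall(text):
        summary[kind] = (
            int(nbytes.replace(",", "")),
            int(nblocks.replace(",", "")),
        )
    return summary


def summarize_logs(pattern="/tmp/valgrind-rrd.*"):
    """Parse every log matching the glob pattern (path taken from the CMD line)."""
    results = {}
    for path in sorted(glob.glob(pattern)):
        with open(path, errors="replace") as fh:
            results[path] = summarize_leaks(fh.read())
    return results
```

Calling summarize_logs() and printing the result gives one line of totals
per log file. The leak summary is normally only written when the process
exits, so the logs stay incomplete until Xymon is stopped.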

2016-09-24 14:18 GMT+02:00 Henrik Størner <henrik at hswn.dk>:

> Hi,
>
> memory leaks are the worst to troubleshoot.
>
> If possible, then running xymond_rrd via the "valgrind" tool is the best
> way to do it. valgrind comes with some distributions, not sure about RHEL
> though. There might be some CentOS packages that will work.
>
> An important point is that the binaries must be compiled with debugging
> info intact; i.e. "-g" as a compile-time option, preferably only -O
> optimisation, and not stripped. I guess Japheth can help you with that, if
> necessary.
>
> Then you change the tasks.cfg to run xymond_rrd via valgrind: The CMD
> setting must then be
>
> CMD valgrind --log-file=/tmp/valgrind-rrd.%p --leak-check=full \
>     xymond_channel --channel=status --log=$XYMONSERVERLOGS/rrd-status.log \
>     xymond_rrd --rrddir=$XYMONVAR/rrd
>
> Then run Xymon normally for some time, until hopefully it starts logging
> memory leaks.
>
>
> This checking does have a significant performance impact, so running it on
> a 4000-server system is probably not possible.
>
>
> Regards,
> Henrik
>
>
>
> On 23-09-2016 at 13:38, Peter Welter wrote:
>
> Hi Japheth,
>
> Probably one process (xymond_rrd) is very hungry for memory:
>
> [xymon]# ps aux | egrep 'xymon|MEM'
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> xymon    16889  0.0  0.0   4176   604 ?        S    13:26   0:00 /bin/dash
> xymon    16892  0.0  0.0   6272   660 ?        S    13:26   0:00 vmstat 300 2
> xymon    16986  0.0  0.0   4176   600 ?        S    13:28   0:00 /bin/dash
> xymon    16989  0.0  0.0   6272   664 ?        S    13:28   0:00 vmstat 300 2
> xymon    17060  0.0  0.0   4176   604 ?        S    13:30   0:00 /bin/dash
> xymon    17063  0.0  0.0   6272   664 ?        S    13:30   0:00 vmstat 300 2
> xymon    17107  0.5  0.1 140340 10324 ?        S    13:31   0:00 /usr/bin/perl -w -I/home/bbtest/server/ext /etc/xymon/ext/netapp/netapp.pl
> xymon    17110  0.2  0.1 142236 11108 ?        S    13:31   0:00 /usr/bin/perl -w -I/home/bbtest/server/ext /etc/xymon/ext/netapp/netapp.pl
> xymon    17160  0.0  0.0 106120  1248 ?        S    13:31   0:00 sh -c /usr/bin/ssh -x -l xymon xxx.xxx.xxx.xxx "environment status" 2>&1
> xymon    17161  0.0  0.0  60060  3440 ?        S    13:31   0:00 /usr/bin/ssh -x -l xymon 10.10.1.30 environment status
> root     17163  0.0  0.0 103324   852 pts/1    S+   13:31   0:00 egrep xymon|MEM
> xymon    27932  0.0  0.0  12648   592 ?        Ss   Sep20   0:05 /usr/sbin/xymonlaunch --log=/var/log/xymon/xymonlaunch.log
> xymon    27992  0.0  0.1 25212804 8160 ?       S    Sep20   1:57 xymond --restart=/var/lib/xymon/tmp/xymond.chk --checkpoint-file=/var/lib/xymon/tmp/xymond.chk --checkpoint-interval=600 --admin-senders=127.0.0.1,132.229.61.140 --store-clientlogs=!msgs
> xymon    27996  0.0  0.0 12624444 1452 ?       S    Sep20   0:00 xymond_channel --channel=stachg xymond_history
> xymon    27997  0.0  0.0 12624444 1244 ?       S    Sep20   0:00 xymond_channel --channel=page xymond_alert --checkpoint-file=/var/lib/xymon/tmp/alert.chk --checkpoint-interval=600
> xymon    27998  0.0  0.0 12624444 1340 ?       S    Sep20   0:00 xymond_channel --channel=client xymond_client
> xymon    27999  0.0  0.0 12624860 4328 ?       S    Sep20   0:02 xymond_channel --channel=status xymond_rrd --rrddir=/var/lib/xymon/rrd
> xymon    28000  0.0  0.0 12625628 4712 ?       S    Sep20   0:00 xymond_channel --channel=data xymond_rrd --rrddir=/var/lib/xymon/rrd
> xymon    28001  0.0  0.0 12624444 1320 ?       S    Sep20   0:00 xymond_channel --channel=clichg xymond_hostdata
> xymon    28007  0.0  0.0  41788  1168 ?        S    Sep20   0:00 xymond_channel --channel=user --log=/var/log/xymon/vmware-monitord.log vmware-monitord
> xymon    28008  0.0  0.0 10527268 1688 ?       S    Sep20   0:00 xymond_history
> xymon    28009  0.0  1.5 12624884 122508 ?     S    Sep20   0:00 xymond_client
> xymon    28010  0.0  0.0 106848  2176 ?        S    Sep20   0:00 /bin/gawk -f /usr/libexec/xymon/vmware-monitord
> xymon    28011  0.0  0.0 10527252 1212 ?       S    Sep20   0:00 xymond_hostdata
> *xymon    28012  0.0  9.4 12680832 765216 ?     S    Sep20   0:08 xymond_rrd --rrddir=/var/lib/xymon/rrd*
> *xymon    28013  0.0 12.1 12689484 975908 ?     S    Sep20   0:12 xymond_rrd --rrddir=/var/lib/xymon/rrd*
> xymon    28014  0.0  0.1 10527512 9980 ?       S    Sep20   0:00 xymond_alert --checkpoint-file=/var/lib/xymon/tmp/alert.chk --checkpoint-interval=600
>
> I did one test migration, in which all hosts (about 4000) ran on this
> system, so the directory /var/lib/xymon/rrd is quite large. However,
> currently there is only one host (the Xymon server itself) running, and it
> is testing one NetApp filer. So perhaps xymond_rrd's memory use and this
> large directory are somehow related. I will try this on the Accept
> environment, which I have installed by now. There are just a few files in
> /var/lib/xymon/rrd on this Accept system, and I will check next Monday how
> each system behaves.
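
To compare the two systems, a ps listing like the one above could be ranked
by resident memory with something like the following. It is a minimal
sketch, assuming the standard 11-column "ps aux" layout (RSS in KiB in the
sixth column); the helper name top_rss is made up for illustration.

```python
def top_rss(ps_output, count=5):
    """Return (rss_kib, pid, command) tuples for the largest processes."""
    rows = []
    for line in ps_output.splitlines():
        # Split into at most 11 fields so that COMMAND keeps its spaces.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # skip the header line and anything malformed
        rows.append((int(fields[5]), int(fields[1]), fields[10]))
    # Largest resident set size first.
    return sorted(rows, reverse=True)[:count]
```

Feeding it the text of "ps aux" (for example captured with
subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout) on both
the Development and the Accept system at regular intervals should show
whether xymond_rrd keeps growing on each of them.
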
>
> <That is the update so far; to be continued next week.>
>
>
> 2016-09-21 13:18 GMT+02:00 Peter Welter <peter.welter at gmail.com>:
>
>> Hi Japheth,
>>
>> Thanks for your response. I'm looking into this and will be back a.s.a.p.
>> (a few days or so, since I just restarted Xymon ;-)
>>
>> Peter
>>
>> 2016-09-20 19:07 GMT+02:00 Japheth Cleaver <cleaver at terabithia.org>:
>>
>>> On 9/20/2016 8:37 AM, Peter Welter wrote:
>>>
>>>> Hi J.C.,
>>>>
>>>> First of all: Thanks for your work for Xymon!
>>>>
>>>> Second: I have a question about the repository from terabithia. I want
>>>> to set up Development, Test, Accept and Production environments using
>>>> this repository. I have installed the first and am working on the next phase.
>>>>
>>>> Over time however, I see that my Xymon-server seems to eat all the
>>>> memory available and starts swapping until all memory is consumed?!?
>>>>
>>>> This is for Development only, and there are hardly any tests, just a very
>>>> small hosts.cfg. So why does Xymon become this hungry for memory over time?
>>>>
>>>> Tue Sep 20 17:29:46 CEST 2016 - Memory CRITICAL
>>>>
>>>>        Memory              Used      Total  Percentage
>>>> green  Real/Physical      7737M      7872M         98%
>>>> yellow Actual/Virtual     7539M      7872M         95%
>>>> red    Swap/Page          3886M      4095M         94%
>>>>
>>>> After a Xymon restart, all the swap is freed?
>>>>
>>>> I'm using Red Hat Enterprise Linux Server release 6.8 (Santiago)
>>>>
>>>> Any suggestions what to do next? Thanks in advance for any help!
>>>>
>>>> Peter
>>>>
>>>
>>> Hi Peter,
>>>
>>> I'm not aware of any memory leaks present in 4.3.27 itself that would
>>> cause growth like that. Can you provide the ps output for the system's
>>> various xymon tools? Which process seems to be running out of control?
>>>
>>> -jc
>>>
>>
>>
>
>
> _______________________________________________
> Xymon mailing list
> Xymon at xymon.com
> http://lists.xymon.com/mailman/listinfo/xymon
>
>