[Xymon] Possible Memory Leak (?!) in Version Xymon 4.3.27-1.el6.terabithia

Japheth Cleaver cleaver at terabithia.org
Wed Sep 28 18:58:18 CEST 2016


Hi,

There's no need to rebuild the packages to enable this type of testing. 
Just make sure the xymon-debuginfo RPM is installed (it's in the same 
repo), as that contains all of the symbol information on RH-type systems.
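
Since it's in the same repo, something like

    yum install xymon-debuginfo

should pull it in.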

As far as valgrind goes, all you really need is the base 'valgrind' 
package. Simply modify tasks.cfg as shown below and you should be set. I 
also typically add "--track-origins=yes".

In terms of the overall problem, xymond_rrd will use a larg(ish) amount 
of RAM as it spools up its cache of data points before sending them out 
to rrdtool itself for writing. In theory, this should hit a constant 
level once it's been running for an hour or two (depending on your 
datapoints and hosts) and shouldn't grow beyond that. The overall memory 
usage will scale linearly with hosts x RRAs.
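
A rough way to gauge that scale is to count the RRD files the cache has to 
cover (using the --rrddir path that shows up in the ps listing below):

    find /var/lib/xymon/rrd -type f -name '*.rrd' | wc -l
    du -sh /var/lib/xymon/rrd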

I know it has been a source of leaks before, so it's possible something 
is still in there. Are you adding and removing lots of hosts at once by 
any chance? It's possible that previously cached data isn't being cleaned 
up correctly, but I'd thought those issues had been resolved.

HTH,
-jc

On 9/28/2016 1:27 AM, Peter Welter wrote:
>
> Hi Henrik, J.C.,
>
> Thanks for your response.
>
> It seems that valgrind is available for RHEL (see below), so now I 
> would like to ask J.C. the following: "What do you want me to do?"
>
> If I want to use the prebuilt packages, and YES that would be 
> preferable, then can you supply me with a pre-compiled binary for 
> xymond_rrd that has all the options Henrik talked about, so I can 
> replace the currently installed binary with it?
>
> Or should I build a package myself to debug this issue?
>
> Regards, Peter
>
>
> [root@uhu-a xymon]# yum search valgrind
>
> Loaded plugins: product-id, search-disabled-repos, security, subscription-manager
>
> ============================ N/S Matched: valgrind ============================
> devtoolset-1.1-valgrind-devel.i686 : Development files for valgrind
> devtoolset-1.1-valgrind-devel.x86_64 : Development files for valgrind
> devtoolset-1.1-valgrind-openmpi.i686 : OpenMPI support for valgrind
> devtoolset-1.1-valgrind-openmpi.x86_64 : OpenMPI support for valgrind
> devtoolset-2-eclipse-valgrind.noarch : Valgrind Tools Integration for Eclipse
> devtoolset-2-valgrind-devel.i686 : Development files for valgrind
> devtoolset-2-valgrind-devel.x86_64 : Development files for valgrind
> devtoolset-2-valgrind-openmpi.i686 : OpenMPI support for valgrind
> devtoolset-2-valgrind-openmpi.x86_64 : OpenMPI support for valgrind
> eclipse-valgrind.x86_64 : Valgrind Tools Integration for Eclipse
> perl-Test-Valgrind.noarch : Generate suppressions, analyze and test any command with valgrind
> valgrind-devel.i686 : Development files for valgrind
> valgrind-devel.x86_64 : Development files for valgrind
> valgrind-openmpi.x86_64 : OpenMPI support for valgrind
> devtoolset-1.1-valgrind.i686 : Tool for finding memory management bugs in programs
> devtoolset-1.1-valgrind.x86_64 : Tool for finding memory management bugs in programs
> devtoolset-2-valgrind.i686 : Tool for finding memory management bugs in programs
> devtoolset-2-valgrind.x86_64 : Tool for finding memory management bugs in programs
> valgrind.i686 : Tool for finding memory management bugs in programs
> valgrind.x86_64 : Tool for finding memory management bugs in programs
> valkyrie.x86_64 : Graphical User Interface for Valgrind Suite
>
>   Name and summary matches only, use "search all" for everything.
>
>
> 2016-09-24 14:18 GMT+02:00 Henrik Størner <henrik at hswn.dk>:
>
>     Hi,
>
>     memory leaks are the worst to troubleshoot.
>
>     If possible, running xymond_rrd via the "valgrind" tool is the
>     best way to do it. valgrind comes with some distributions; not
>     sure about RHEL, though. There might be some CentOS packages that
>     will work.
>
>     An important point is that the binaries must be compiled with
>     debugging info intact; i.e. "-g" as a compile-time option,
>     preferably only -O optimisation, and not stripped. I guess Japheth
>     can help you with that, if necessary.
>
>     Then you change the tasks.cfg to run xymond_rrd via valgrind: The
>     CMD setting must then be
>
>     CMD valgrind --log-file=/tmp/valgrind-rrd.%p --leak-check=full \
>         xymond_channel --channel=status --log=$XYMONSERVERLOGS/rrd-status.log \
>         xymond_rrd --rrddir=$XYMONVAR/rrd
>
>     Then run Xymon normally for some time, until hopefully it starts
>     logging memory leaks.
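>
>     Note that the leak summary from --leak-check=full is only written
>     when the process exits, so after it has run for a while, stop the
>     task and then look through the log files, e.g.:
>
>     grep -A2 "LEAK SUMMARY" /tmp/valgrind-rrd.*
>     grep "definitely lost" /tmp/valgrind-rrd.*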
>
>
>     This checking does have a significant performance impact, so
>     running it on a 4000-server system is probably not possible.
>
>
>     Regards,
>     Henrik
>
>
>
>     On 23-09-2016 at 13:38, Peter Welter wrote:
>>     Hi Japheth,
>>
>>     Probably one process (xymond_rrd) is very hungry for memory:
>>
>>     [xymon]# ps aux | egrep 'xymon|MEM'
>>
>>     USER       PID %CPU %MEM    VSZ   RSS TTY STAT START   TIME COMMAND
>>
>>     xymon   16889  0.0  0.0   4176   604 ?        S   13:26   0:00
>>     /bin/dash
>>
>>     xymon   16892  0.0  0.0   6272   660 ?        S   13:26   0:00
>>     vmstat 300 2
>>
>>     xymon   16986  0.0  0.0   4176   600 ?        S   13:28   0:00
>>     /bin/dash
>>
>>     xymon   16989  0.0  0.0   6272   664 ?        S   13:28   0:00
>>     vmstat 300 2
>>
>>     xymon   17060  0.0  0.0   4176   604 ?        S   13:30   0:00
>>     /bin/dash
>>
>>     xymon   17063  0.0  0.0   6272   664 ?        S   13:30   0:00
>>     vmstat 300 2
>>
>>     xymon   17107  0.5  0.1 140340 10324 ?        S    13:31   0:00
>>     /usr/bin/perl -w -I/home/bbtest/server/ext
>>     /etc/xymon/ext/netapp/netapp.pl
>>
>>     xymon   17110  0.2  0.1 142236 11108 ?        S   13:31   0:00
>>     /usr/bin/perl -w -I/home/bbtest/server/ext
>>     /etc/xymon/ext/netapp/netapp.pl
>>
>>     xymon   17160  0.0  0.0 106120  1248 ?        S   13:31   0:00 sh
>>     -c /usr/bin/ssh -x -l xymon xxx.xxx.xxx.xxx "environment status" 2>&1
>>
>>     xymon   17161  0.0  0.0  60060  3440 ?        S   13:31   0:00
>>     /usr/bin/ssh -x -l xymon 10.10.1.30 environment status
>>
>>     root     17163  0.0  0.0 103324   852 pts/1 S+   13:31   0:00
>>     egrep xymon|MEM
>>
>>     xymon   27932  0.0  0.0  12648   592 ?        Ss   Sep20   0:05
>>     /usr/sbin/xymonlaunch --log=/var/log/xymon/xymonlaunch.log
>>
>>     xymon   27992  0.0  0.1 25212804 8160 ?       S   Sep20   1:57
>>     xymond --restart=/var/lib/xymon/tmp/xymond.chk
>>     --checkpoint-file=/var/lib/xymon/tmp/xymond.chk
>>     --checkpoint-interval=600
>>     --admin-senders=127.0.0.1,132.229.61.140 --store-clientlogs=!msgs
>>
>>     xymon   27996  0.0  0.0 12624444 1452 ?       S   Sep20   0:00
>>     xymond_channel --channel=stachg xymond_history
>>
>>     xymon   27997  0.0  0.0 12624444 1244 ?       S   Sep20   0:00
>>     xymond_channel --channel=page xymond_alert
>>     --checkpoint-file=/var/lib/xymon/tmp/alert.chk
>>     --checkpoint-interval=600
>>
>>     xymon   27998  0.0  0.0 12624444 1340 ?       S   Sep20   0:00
>>     xymond_channel --channel=client xymond_client
>>
>>     xymon   27999  0.0  0.0 12624860 4328 ?       S   Sep20   0:02
>>     xymond_channel --channel=status xymond_rrd
>>     --rrddir=/var/lib/xymon/rrd
>>
>>     xymon   28000  0.0  0.0 12625628 4712 ?       S   Sep20   0:00
>>     xymond_channel --channel=data xymond_rrd --rrddir=/var/lib/xymon/rrd
>>
>>     xymon   28001  0.0  0.0 12624444 1320 ?       S   Sep20   0:00
>>     xymond_channel --channel=clichg xymond_hostdata
>>
>>     xymon   28007  0.0  0.0  41788  1168 ?        S   Sep20   0:00
>>     xymond_channel --channel=user
>>     --log=/var/log/xymon/vmware-monitord.log vmware-monitord
>>
>>     xymon   28008  0.0  0.0 10527268 1688 ?       S   Sep20   0:00
>>     xymond_history
>>
>>     xymon   28009  0.0  1.5 12624884 122508 ?     S   Sep20   0:00
>>     xymond_client
>>
>>     xymon   28010  0.0  0.0 106848  2176 ?        S   Sep20   0:00
>>     /bin/gawk -f /usr/libexec/xymon/vmware-monitord
>>
>>     xymon   28011  0.0  0.0 10527252 1212 ?       S   Sep20   0:00
>>     xymond_hostdata
>>
>>     *xymon   28012  0.0  9.4 12680832 765216 ? S    Sep20   0:08
>>     xymond_rrd --rrddir=/var/lib/xymon/rrd*
>>
>>     *xymon   28013  0.0 12.1 12689484 975908 ? S    Sep20   0:12
>>     xymond_rrd --rrddir=/var/lib/xymon/rrd*
>>
>>     xymon 28014  0.0  0.1 10527512 9980 ?       S Sep20   0:00
>>     xymond_alert --checkpoint-file=/var/lib/xymon/tmp/alert.chk
>>     --checkpoint-interval=600
>>
>>     I did one test migration, where all hosts (about 4000 hosts) ran
>>     on this system, so the directory /var/lib/xymon/rrd is quite
>>     huge. However, currently there is only one host (the xymon server
>>     itself) running, and it is testing one NetApp filer. So perhaps
>>     xymond_rrd and this large directory are somehow related. I will
>>     give it a try on the Accept environment, which I have installed
>>     by now. There are just a few files in /var/lib/xymon/rrd on that
>>     Accept system, and I will check next Monday how each system behaves.
>>
>>     <So far an update; to be continued next week.>
>>
>>
>>     2016-09-21 13:18 GMT+02:00 Peter Welter <peter.welter at gmail.com>:
>>
>>         Hi Japheth,
>>
>>         Thanks for your response. I'm looking into this and will be
>>         back a.s.a.p. (a few days or so, since I just restarted Xymon ;-)
>>
>>         Peter
>>
>>         2016-09-20 19:07 GMT+02:00 Japheth Cleaver <cleaver at terabithia.org>:
>>
>>             On 9/20/2016 8:37 AM, Peter Welter wrote:
>>
>>                 Hi J.C.,
>>
>>                 First of all: Thanks for your work for Xymon!
>>
>>                 Second: I have a question about the repository from
>>                 terabithia. I want to set up Development, Test,
>>                 Accept, and Production environments using this
>>                 repository. I have installed the first one and am
>>                 working on the next phase.
>>
>>                 Over time, however, I see that my Xymon server seems
>>                 to eat all the available memory and starts swapping
>>                 until all memory is consumed?!?
>>
>>                 This is for Development only, and there are not really
>>                 any tests running; just a very small hosts.cfg. So why
>>                 does Xymon get this hungry for memory over time?
>>
>>                 Tue Sep 20 17:29:46 CEST 2016 - Memory CRITICAL
>>
>>                         Memory          Used     Total  Percentage
>>                 green   Real/Physical   7737M    7872M     98%
>>                 yellow  Actual/Virtual  7539M    7872M     95%
>>                 red     Swap/Page       3886M    4095M     94%
>>
>>                 After a Xymon restart, all the swap is freed?
>>
>>                 I'm using Red Hat Enterprise Linux Server release 6.8
>>                 (Santiago)
>>
>>                 Any suggestions what to do next? Thanks in advance
>>                 for any help!
>>
>>                 Peter
>>
>>
>>             Hi Peter,
>>
>>             I'm not aware of any memory leaks present in 4.3.27
>>             itself that would cause growth like that. Can you provide
>>             the ps output for the system's various xymon tools? Which
>>             process seems to be running out of control?
>>
>>             -jc
>>
>>
>>
>>
>>
>
> _______________________________________________
> Xymon mailing list
> Xymon at xymon.com
> http://lists.xymon.com/mailman/listinfo/xymon
