[Xymon] Possible Memory Leak (?!) in Version Xymon 4.3.27-1.el6.terabithia
Peter Welter
peter.welter at gmail.com
Wed Sep 28 10:27:55 CEST 2016
Hi Henrik, J.C.,
Thanks for your response.
It seems that valgrind is available for RHEL (see below), so now my question
for J.C. is: what would you like me to do?
If I can use the prebuilt packages (and yes, that would be preferable), could
you supply me with a pre-compiled xymond_rrd binary that has all the options
Henrik mentioned, so that I can replace the currently installed one?
Or should I build a package myself to debug this issue?
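If it helps, my guess at where Henrik's valgrind wrapper would go is the RRD
status task in tasks.cfg. This is only a sketch; the [rrdstatus] task name and
the ENVFILE path are assumptions about the terabithia package layout:

```
[rrdstatus]
        ENVFILE /etc/xymon/xymonserver.cfg
        NEEDS xymond
        CMD valgrind --log-file=/tmp/valgrind-rrd.%p --leak-check=full \
            xymond_channel --channel=status --log=$XYMONSERVERLOGS/rrd-status.log \
            xymond_rrd --rrddir=$XYMONVAR/rrd
```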
Regards, Peter
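P.S. In the meantime, a cheap way to confirm which daemon is growing is to
sample its resident memory periodically. A rough sketch (the process name,
sample count, and interval below are just examples):

```shell
# Sample the memory use of a process by name a few times.
# Usage: sample_mem xymond_rrd [samples] [interval_seconds]
sample_mem() {
    name=$1
    samples=${2:-12}
    interval=${3:-300}
    i=0
    while [ "$i" -lt "$samples" ]; do
        date
        # ps exits non-zero when nothing matches; ignore that.
        ps -o pid=,rss=,vsz=,args= -C "$name" || true
        sleep "$interval"
        i=$((i + 1))
    done
}
```

For example, `sample_mem xymond_rrd 12 300 >> /tmp/rrd-mem.log` gives an hour
of five-minute samples; a steadily climbing RSS column points at the leaker.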
[root at uhu-a xymon]# yum search valgrind
Loaded plugins: product-id, search-disabled-repos, security, subscription-manager
============================ N/S Matched: valgrind ============================
devtoolset-1.1-valgrind-devel.i686 : Development files for valgrind
devtoolset-1.1-valgrind-devel.x86_64 : Development files for valgrind
devtoolset-1.1-valgrind-openmpi.i686 : OpenMPI support for valgrind
devtoolset-1.1-valgrind-openmpi.x86_64 : OpenMPI support for valgrind
devtoolset-2-eclipse-valgrind.noarch : Valgrind Tools Integration for Eclipse
devtoolset-2-valgrind-devel.i686 : Development files for valgrind
devtoolset-2-valgrind-devel.x86_64 : Development files for valgrind
devtoolset-2-valgrind-openmpi.i686 : OpenMPI support for valgrind
devtoolset-2-valgrind-openmpi.x86_64 : OpenMPI support for valgrind
eclipse-valgrind.x86_64 : Valgrind Tools Integration for Eclipse
perl-Test-Valgrind.noarch : Generate suppressions, analyze and test any command with valgrind
valgrind-devel.i686 : Development files for valgrind
valgrind-devel.x86_64 : Development files for valgrind
valgrind-openmpi.x86_64 : OpenMPI support for valgrind
devtoolset-1.1-valgrind.i686 : Tool for finding memory management bugs in programs
devtoolset-1.1-valgrind.x86_64 : Tool for finding memory management bugs in programs
devtoolset-2-valgrind.i686 : Tool for finding memory management bugs in programs
devtoolset-2-valgrind.x86_64 : Tool for finding memory management bugs in programs
valgrind.i686 : Tool for finding memory management bugs in programs
valgrind.x86_64 : Tool for finding memory management bugs in programs
valkyrie.x86_64 : Graphical User Interface for Valgrind Suite
  Name and summary matches only, use "search all" for everything.
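Once valgrind has been running for a while, I assume the interesting lines can
be pulled out of its logs like this. The /tmp/valgrind-rrd.* path simply
follows the --log-file setting Henrik suggests below, and "definitely lost" is
the strongest indicator of a real leak:

```shell
# Pull the leak-summary lines out of one or more valgrind logs.
summarize_leaks() {
    grep -hE "definitely lost|indirectly lost|possibly lost|still reachable" "$@"
}
```

For example, `summarize_leaks /tmp/valgrind-rrd.*` after letting xymond_rrd
run for a day or so.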
2016-09-24 14:18 GMT+02:00 Henrik Størner <henrik at hswn.dk>:
> Hi,
>
> memory leaks are the worst to troubleshoot.
>
> If possible, then running xymond_rrd via the "valgrind" tool is the best
> way to do it. valgrind comes with some distributions, not sure about RHEL
> though. There might be some CentOS packages that will work.
>
> An important point is that the binaries must be compiled with debugging
> info intact; i.e. "-g" as a compile-time option, preferably only -O
> optimisation, and not stripped. I guess Japheth can help you with that, if
> necessary.
>
> Then you change the tasks.cfg to run xymond_rrd via valgrind: The CMD
> setting must then be
>
> CMD valgrind --log-file=/tmp/valgrind-rrd.%p --leak-check=full \
>     xymond_channel --channel=status --log=$XYMONSERVERLOGS/rrd-status.log \
>     xymond_rrd --rrddir=$XYMONVAR/rrd
> Then run Xymon normally for some time, until hopefully it starts logging
> memory leaks.
>
>
> This checking does have a significant performance impact, so running it on
> a 4000-server system is probably not possible.
>
>
> Regards,
> Henrik
>
>
>
> Den 23-09-2016 kl. 13:38 skrev Peter Welter:
>
> Hi Japheth,
>
> Probably one process (xymond_rrd) seems very hungry for memory:
>
> [xymon]# ps aux | egrep 'xymon|MEM'
> USER     PID %CPU %MEM      VSZ    RSS TTY   STAT START TIME COMMAND
> xymon  16889  0.0  0.0     4176    604 ?     S    13:26 0:00 /bin/dash
> xymon  16892  0.0  0.0     6272    660 ?     S    13:26 0:00 vmstat 300 2
> xymon  16986  0.0  0.0     4176    600 ?     S    13:28 0:00 /bin/dash
> xymon  16989  0.0  0.0     6272    664 ?     S    13:28 0:00 vmstat 300 2
> xymon  17060  0.0  0.0     4176    604 ?     S    13:30 0:00 /bin/dash
> xymon  17063  0.0  0.0     6272    664 ?     S    13:30 0:00 vmstat 300 2
> xymon  17107  0.5  0.1   140340  10324 ?     S    13:31 0:00 /usr/bin/perl -w -I/home/bbtest/server/ext /etc/xymon/ext/netapp/netapp.pl
> xymon  17110  0.2  0.1   142236  11108 ?     S    13:31 0:00 /usr/bin/perl -w -I/home/bbtest/server/ext /etc/xymon/ext/netapp/netapp.pl
> xymon  17160  0.0  0.0   106120   1248 ?     S    13:31 0:00 sh -c /usr/bin/ssh -x -l xymon xxx.xxx.xxx.xxx "environment status" 2>&1
> xymon  17161  0.0  0.0    60060   3440 ?     S    13:31 0:00 /usr/bin/ssh -x -l xymon 10.10.1.30 environment status
> root   17163  0.0  0.0   103324    852 pts/1 S+   13:31 0:00 egrep xymon|MEM
> xymon  27932  0.0  0.0    12648    592 ?     Ss   Sep20 0:05 /usr/sbin/xymonlaunch --log=/var/log/xymon/xymonlaunch.log
> xymon  27992  0.0  0.1 25212804   8160 ?     S    Sep20 1:57 xymond --restart=/var/lib/xymon/tmp/xymond.chk --checkpoint-file=/var/lib/xymon/tmp/xymond.chk --checkpoint-interval=600 --admin-senders=127.0.0.1,132.229.61.140 --store-clientlogs=!msgs
> xymon  27996  0.0  0.0 12624444   1452 ?     S    Sep20 0:00 xymond_channel --channel=stachg xymond_history
> xymon  27997  0.0  0.0 12624444   1244 ?     S    Sep20 0:00 xymond_channel --channel=page xymond_alert --checkpoint-file=/var/lib/xymon/tmp/alert.chk --checkpoint-interval=600
> xymon  27998  0.0  0.0 12624444   1340 ?     S    Sep20 0:00 xymond_channel --channel=client xymond_client
> xymon  27999  0.0  0.0 12624860   4328 ?     S    Sep20 0:02 xymond_channel --channel=status xymond_rrd --rrddir=/var/lib/xymon/rrd
> xymon  28000  0.0  0.0 12625628   4712 ?     S    Sep20 0:00 xymond_channel --channel=data xymond_rrd --rrddir=/var/lib/xymon/rrd
> xymon  28001  0.0  0.0 12624444   1320 ?     S    Sep20 0:00 xymond_channel --channel=clichg xymond_hostdata
> xymon  28007  0.0  0.0    41788   1168 ?     S    Sep20 0:00 xymond_channel --channel=user --log=/var/log/xymon/vmware-monitord.log vmware-monitord
> xymon  28008  0.0  0.0 10527268   1688 ?     S    Sep20 0:00 xymond_history
> xymon  28009  0.0  1.5 12624884 122508 ?     S    Sep20 0:00 xymond_client
> xymon  28010  0.0  0.0   106848   2176 ?     S    Sep20 0:00 /bin/gawk -f /usr/libexec/xymon/vmware-monitord
> xymon  28011  0.0  0.0 10527252   1212 ?     S    Sep20 0:00 xymond_hostdata
> xymon  28012  0.0  9.4 12680832 765216 ?     S    Sep20 0:08 xymond_rrd --rrddir=/var/lib/xymon/rrd
> xymon  28013  0.0 12.1 12689484 975908 ?     S    Sep20 0:12 xymond_rrd --rrddir=/var/lib/xymon/rrd
> xymon  28014  0.0  0.1 10527512   9980 ?     S    Sep20 0:00 xymond_alert --checkpoint-file=/var/lib/xymon/tmp/alert.chk --checkpoint-interval=600
> I did one test migration, where all hosts (about 4000 of them) ran on this
> system, so the directory /var/lib/xymon/rrd is quite large. However,
> currently only one host (the Xymon server itself) is running, and it is
> testing one NetApp filer. So perhaps xymond_rrd and this large directory
> are somehow related. I will try this on the Accept environment that I have
> now installed; there are just a few files in /var/lib/xymon/rrd on that
> Accept system, and I will check next Monday how each system behaves.
>
> <So far an update; to be continued next week.>
>
>
> 2016-09-21 13:18 GMT+02:00 Peter Welter <peter.welter at gmail.com>:
>
>> Hi Japheth,
>>
>> Thanks for your response. I'm looking into this and will be back a.s.a.p.
>> (a few days or so, since I just restarted Xymon ;-)
>>
>> Peter
>>
>> 2016-09-20 19:07 GMT+02:00 Japheth Cleaver <cleaver at terabithia.org>:
>>
>>> On 9/20/2016 8:37 AM, Peter Welter wrote:
>>>
>>>> Hi J.C.,
>>>>
>>>> First of all: Thanks for your work for Xymon!
>>>>
>>>> Second: I have a question about the terabithia repository. I want to
>>>> set up Development, Test, Accept and Production environments using this
>>>> repository. I have installed the first one and am working on the next
>>>> phase.
>>>>
>>>> Over time, however, I see that my Xymon server seems to eat all the
>>>> available memory and starts swapping until all memory is consumed?!?
>>>>
>>>> This is for Development only, and there are not really any tests yet;
>>>> just a very small hosts.cfg. So why does Xymon get this hungry for
>>>> memory over time?
>>>>
>>>> Tue Sep 20 17:29:46 CEST 2016 - Memory CRITICAL
>>>>
>>>> Memory Used Total Percentage
>>>> green Real/Physical 7737M 7872M 98%
>>>> yellow Actual/Virtual 7539M 7872M 95%
>>>> red Swap/Page 3886M 4095M 94%
>>>>
>>>> After a Xymon restart, all the swap is freed?
>>>>
>>>> I'm using Red Hat Enterprise Linux Server release 6.8 (Santiago)
>>>>
>>>> Any suggestions what to do next? Thanks in advance for any help!
>>>>
>>>> Peter
>>>>
>>>
>>> Hi Peter,
>>>
>>> I'm not aware of any memory leaks present in 4.3.27 itself that would
>>> cause growth like that. Can you provide the ps output for the system's
>>> various xymon tools? Which process seems to be running out of control?
>>>
>>> -jc
>>>
>>
>>
>
>
> _______________________________________________
> Xymon mailing list
> Xymon at xymon.com
> http://lists.xymon.com/mailman/listinfo/xymon
>
>
>