[Xymon] Could not fork checkpoint child:Cannot allocate memory

Raul GN ragonlan at gmail.com
Tue Jul 22 10:09:49 CEST 2014


Yes, I can provide all the information you need. Xymon was upgraded on 15
July, and this is the memory graph:

[image: Inline image 1]

Xymon is installed on Debian 6.0.9.

This is our ulimit configuration. We didn't change it, so these should be
the default values:
-----------------------------------------------------------------------
root: #ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 63975
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 63975
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

xymon: $ ulimit -a
time(seconds)        unlimited
file(blocks)         unlimited
data(kbytes)         unlimited
stack(kbytes)        8192
coredump(blocks)     0
memory(kbytes)       unlimited
locked memory(kbytes) 64
process              63975
nofiles              1024
vmemory(kbytes)      unlimited
locks                unlimited
 ------------------------------------------------------------
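
The shell output above shows the limits of an interactive login. To confirm what a running process actually gets, the same limits can be read from inside a process. This is a minimal sketch using Python's standard `resource` module; the names `LIMITS`, `fmt` and `read_limits` are illustrative only, and note that `getrlimit` reports memory limits in bytes while `ulimit` prints kbytes.

```python
import resource

# Limits most relevant to a fork() that fails with "Cannot allocate memory".
# Labels on the left mirror the `ulimit -a` output above.
LIMITS = {
    "data seg size": "RLIMIT_DATA",
    "stack size": "RLIMIT_STACK",
    "max locked memory": "RLIMIT_MEMLOCK",
    "open files": "RLIMIT_NOFILE",
    "max user processes": "RLIMIT_NPROC",
    "virtual memory": "RLIMIT_AS",
}


def fmt(value):
    """Render a raw limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def read_limits():
    """Return {label: (soft, hard)} for the limits this platform defines."""
    result = {}
    for label, name in LIMITS.items():
        rlimit = getattr(resource, name, None)
        if rlimit is None:  # not every platform defines every limit
            continue
        soft, hard = resource.getrlimit(rlimit)
        result[label] = (fmt(soft), fmt(hard))
    return result


if __name__ == "__main__":
    for label, (soft, hard) in read_limits().items():
        print(f"{label:20s} soft={soft:>12s} hard={hard:>12s}")
```

Run as the xymon user, this should agree with the second `ulimit -a` listing above (after converting bytes to kbytes).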

We monitor 1200 hosts and don't add or remove many of them, maybe 20 hosts a
week:


[image: Inline image 2]

The xymond test output is:
----------------------------------------------------------------
Statistics for Xymon daemon
Version: 4.3.17
Up since 21-Jul-2014 11:37:37 (0 days, 22:00:00)

Incoming messages      :    4335475
- status               :    2251363
- combo                :      24050
- extcombo             :     104869
- page                 :        160
- summary              :          0
- data                 :    1354532
- client               :      92341
- notes                :          0
- enable               :          0
- disable              :          0
- ack                  :          2
- config               :       5968
- query                :       4471
- xymondboard          :     146651
- xymondlog            :     341645
- drop                 :          3
- rename               :          0
- dummy                :        328
- ping                 :          0
- notify               :          0
- schedule             :        298
- download             :          0
- Bogus/Timeouts       :       8794
Incoming messages/sec  :         52 (average last 300 seconds)

status channel messages:    2244782 (1 readers)
stachg channel messages:      10913 (1 readers)
page   channel messages:      49720 (1 readers)
data   channel messages:    1349077 (1 readers)
notes  channel messages:          0 (0 readers)
enadis channel messages:          0 (0 readers)
client channel messages:      90465 (1 readers)
clichg channel messages:        360 (1 readers)
user   channel messages:          0 (0 readers)
backfeed messages      :          0

Ghost reports:
  10.6.71.66      reported host
  10.6.42.103     reported host 10.6.42.10
  10.6.42.103     reported host 10.6.42.11
  10.6.42.103     reported host 10.6.42.12
  10.6.42.103     reported host 10.6.42.13
  10.6.42.103     reported host 10.6.42.14
  10.6.42.103     reported host 10.6.42.15
  10.6.42.103     reported host 10.6.42.4
  10.6.42.103     reported host 10.6.42.5
  10.1.0.194      reported host ePagess_app_3
  10.1.0.86       reported host IIS6_6_1
  10.1.0.87       reported host IIS6_6_2
  10.1.0.194      reported host main0404


Multi-source statuses
  admin01:conn              reported by 10.6.42.103 and 10.0.0.29
----------------------------------------------------------------------------------------------------
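
As a quick cross-check of the workload, the counters in the report above can be turned into an average rate. This sketch uses only numbers copied from the statistics; the variable and function names are illustrative. Over the 22 hours of uptime the average comes out at roughly 55 messages per second, close to the 52 reported for the last 300 seconds, so the load looks steady rather than bursty.

```python
# Figures copied from the xymond statistics above.
TOTAL_MESSAGES = 4335475
UPTIME_SECONDS = 22 * 3600  # "0 days, 22:00:00"

# Non-zero per-type counters from the same report.
COUNTS = {
    "status": 2251363, "combo": 24050, "extcombo": 104869, "page": 160,
    "data": 1354532, "client": 92341, "ack": 2, "config": 5968,
    "query": 4471, "xymondboard": 146651, "xymondlog": 341645,
    "drop": 3, "dummy": 328, "schedule": 298, "bogus_timeouts": 8794,
}


def average_rate(total, seconds):
    """Average messages per second over the whole uptime."""
    return total / seconds


def share(name):
    """Fraction of all incoming messages that are of the given type."""
    return COUNTS[name] / TOTAL_MESSAGES


if __name__ == "__main__":
    print(f"counters add up: {sum(COUNTS.values()) == TOTAL_MESSAGES}")
    rate = average_rate(TOTAL_MESSAGES, UPTIME_SECONDS)
    print(f"average rate: {rate:.1f} msg/s")
    for name in ("status", "data", "xymondlog"):
        print(f"{name:10s} {share(name):6.1%}")
```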


And the xymond section in tasks.cfg:

[xymond]
    ENVFILE /usr/lib/xymon/server/etc/xymonserver.cfg
    CMD xymond --pidfile=$XYMONSERVERLOGS/xymond.pid \
        --restart=$XYMONTMP/xymond.chk \
        --checkpoint-file=$XYMONTMP/xymond.chk \
        --checkpoint-interval=600 \
        --log=$XYMONSERVERLOGS/xymond.log \
        --admin-senders=127.0.0.1,$XYMONSERVERIP \
        --store-clientlogs=!msgs \
        --maint-senders=127.0.0.1,$XYMONSERVERIP \
        --www-senders=127.0.0.1,10.0.0.0/24,10.6.42.103 \
        --flap-count=10 \
        --flap-seconds=900
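
The `--checkpoint-interval=600` setting above lines up with the "Could not fork checkpoint child" entries in the log quoted further down: once the failures start, each one follows the previous by about ten minutes, which suggests every checkpoint attempt fails from then on. A small sketch that checks this using the first four failure timestamps from that log (the helper name `intervals` is illustrative):

```python
from datetime import datetime

CHECKPOINT_INTERVAL = 600  # seconds, from --checkpoint-interval=600

# First four failure timestamps from the xymond log quoted in this thread.
FAILURES = [
    "2014-07-18 23:40:53",
    "2014-07-18 23:50:54",
    "2014-07-19 00:00:55",
    "2014-07-19 00:10:56",
]


def intervals(stamps):
    """Seconds between consecutive timestamps."""
    times = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S") for s in stamps]
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    for gap in intervals(FAILURES):
        print(f"gap {gap}s, drift {gap - CHECKPOINT_INTERVAL}s")
```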


We also changed some MAX variables in xymonserver.cfg:
MAXLINE="32768"
MAXMSG_DATA="5242880"
MAXMSG_CLIENT="5242880"
MAXMSG_STATUS="5242880"
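
As a sanity check on these sizes: 5242880 is exactly 5 x 1024 x 1024. What that amounts to depends on the unit xymond applies to the MAXMSG_* settings, which is worth checking against the xymonserver.cfg(5) man page for the installed version. The sketch below makes no claim about which unit is correct; it only shows both readings (bytes versus kB) side by side. All names in it are illustrative.

```python
MAXMSG_VALUE = 5242880  # set for MAXMSG_DATA, MAXMSG_CLIENT and MAXMSG_STATUS

KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def human(nbytes):
    """Format a byte count using binary units."""
    if nbytes >= GIB:
        return f"{nbytes / GIB:g} GiB"
    if nbytes >= MIB:
        return f"{nbytes / MIB:g} MiB"
    return f"{nbytes / KIB:g} KiB"


# Assumption A: the setting is a plain byte count.
as_bytes = MAXMSG_VALUE
# Assumption B: the setting is a count of kB (1024-byte units).
as_kb = MAXMSG_VALUE * KIB

if __name__ == "__main__":
    print(f"read as bytes: {human(as_bytes)}")
    print(f"read as kB:    {human(as_kb)}")
```

If the kB reading applies, each of the three settings asks for a very large buffer, which would be worth ruling out alongside a possible leak.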

Thank you for your help.


On Mon, Jul 21, 2014 at 7:12 PM, J.C. Cleaver <cleaver at terabithia.org>
wrote:

>
>
> On Mon, July 21, 2014 3:24 am, Raul GN wrote:
> > After upgrading from Xymon version 4.3.12 to 4.3.17, the xymond daemon's
> > memory grows without limit. After 2 or 3 days these messages appear in the logs:
> >
> > 2014-07-17 16:27:46 Setup complete
> > 2014-07-18 04:16:49 Flapping detected for web.int:http - 10 changes in 868 seconds
> > 2014-07-18 04:16:49 Flapping detected for web.int:tomcat - 10 changes in 868 seconds
> > 2014-07-18 04:18:23 Flapping detected for web.int:http - 10 changes in 892 seconds
> > 2014-07-18 04:18:23 Flapping detected for web.int:tomcat - 10 changes in 892 seconds
> > 2014-07-18 18:25:44 Flapping detected for web.int:http - 10 changes in 808 seconds
> > 2014-07-18 18:25:44 Flapping detected for web.int:tomcat - 10 changes in 808 seconds
> > 2014-07-18 23:40:53 Could not fork checkpoint child:Cannot allocate memory
> > 2014-07-18 23:50:54 Could not fork checkpoint child:Cannot allocate memory
> > 2014-07-19 00:00:55 Could not fork checkpoint child:Cannot allocate memory
> > 2014-07-19 00:10:56 Could not fork checkpoint child:Cannot allocate memory
> > 2014-07-19 00:20:57 Could not fork checkpoint child:Cannot allocate memory
> > 2014-07-19 00:30:58 Could not fork checkpoint child:Cannot allocate memory
> > 2014-07-19 00:40:59 Could not fork checkpoint child:Cannot allocate memory
> > 2014-07-19 00:51:00 Could not fork checkpoint child:Cannot allocate memory
> > 2014-07-19 01:01:01 Could not fork checkpoint child:Cannot allocate memory
> > 2014-07-19 01:11:02 Could not fork checkpoint child:Cannot allocate memory
> > 2014-07-19 01:21:03 Could not fork checkpoint child:Cannot allocate memory
> > 2014-07-19 01:31:04 Could not fork checkpoint child:Cannot allocate memory
> > 2014-07-19 01:41:05 Could not fork checkpoint child:Cannot allocate memory
> > 2014-07-19 01:51:06 Could not fork checkpoint child:Cannot allocate memory
> > 2014-07-19 02:01:07 Could not fork checkpoint child:Cannot allocate memory
> > 2014-07-19 02:11:08 Could not fork checkpoint child:Cannot allocate memory
> >
> > Does anybody know how to avoid this problem? If I don't restart Xymon, it
> > crashes.
> > _______________________________________________
>
>
> Hmm. If it's growing truly without limit there's something unusual going
> on; I'd take the memory allocation error later on at face value.
>
> Can you provide any additional details? Do you have an unusual workload or
> ulimits on the xymon user? Or a large number of host inserts/removals?
> What OS are you running?
>
>
> Regards,
>
> -jc
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Monitored items.png
Type: image/png
Size: 16138 bytes
Desc: not available
URL: <http://lists.xymon.com/pipermail/xymon/attachments/20140722/b6156054/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Memory.png
Type: image/png
Size: 34905 bytes
Desc: not available
URL: <http://lists.xymon.com/pipermail/xymon/attachments/20140722/b6156054/attachment-0001.png>

