[Xymon] xymongen hanging
David Logan
David.Logan at nt.gov.au
Mon Oct 17 06:57:25 CEST 2022
Hi Folks,
Just wondering if anybody has any experience with xymongen hanging. A large number of xymongen processes were kicked off sometime over the weekend. Unfortunately they are owned by apache and have a PPID of 1, so I can't tell how they were started. I'm presuming xymoncmd, but I can't see anything in the crontab for xymon or in tasks.cfg that would kick off the snapshot and reporting processes.
These then sit for a very long time (more than 24 hours) while trying to read a data file from a specific server.
apache 14749 1 44 Oct16 ? 10:28:39 /xymon/server/server/bin/xymongen --snapshot=2222867979 XYMONGENSNAPOPTS /xymon/server/server/www/snap/14748-1665896723
apache 14867 1 43 Oct16 ? 10:26:32 /xymon/server/server/bin/xymongen --snapshot=2222867979 XYMONGENSNAPOPTS /xymon/server/server/www/snap/14866-1665896747
apache 15107 1 43 Oct16 ? 10:26:05 /xymon/server/server/bin/xymongen --snapshot=2222867979 XYMONGENSNAPOPTS /xymon/server/server/www/snap/15106-1665896768
apache 15118 1 43 Oct16 ? 10:25:58 /xymon/server/server/bin/xymongen --snapshot=2222867979 XYMONGENSNAPOPTS /xymon/server/server/www/snap/15117-1665896774
apache 15125 1 43 Oct16 ? 10:25:12 /xymon/server/server/bin/xymongen --snapshot=2222867979 XYMONGENSNAPOPTS /xymon/server/server/www/snap/15124-1665896783
apache 15238 1 43 Oct16 ? 10:23:26 /xymon/server/server/bin/xymongen --reportopts=2222871640:2222958039:1: /xymon/server/server/www/rep/15237-1665896797
apache 15269 1 43 Oct16 ? 10:25:31 /xymon/server/server/bin/xymongen --snapshot=2222867979 XYMONGENSNAPOPTS /xymon/server/server/www/snap/15268-1665896804
apache 15349 1 43 Oct16 ? 10:22:20 /xymon/server/server/bin/xymongen --snapshot=2222867979 XYMONGENSNAPOPTS /xymon/server/server/www/snap/15348-1665896807
apache 15382 1 43 Oct16 ? 10:23:40 /xymon/server/server/bin/xymongen --reportopts=2222871640:2222958039:1: /xymon/server/server/www/rep/15381-1665896828
apache 15398 1 43 Oct16 ? 10:25:13 /xymon/server/server/bin/xymongen --snapshot=2222867979 XYMONGENSNAPOPTS /xymon/server/server/www/snap/15397-1665896834
apache 15400 1 43 Oct16 ? 10:22:59 /xymon/server/server/bin/xymongen --reportopts=2222871640:2222958039:1: /xymon/server/server/www/rep/15399-1665896837
apache 15757 1 43 Oct16 ? 10:24:48 /xymon/server/server/bin/xymongen --reportopts=2222871640:2222958039:1: /xymon/server/server/www/rep/15756-1665896864
apache 15842 1 43 Oct16 ? 10:22:32 /xymon/server/server/bin/xymongen --reportopts=2222871640:2222958039:1: /xymon/server/server/www/rep/15841-1665896873
apache 15964 1 43 Oct16 ? 10:24:21 /xymon/server/server/bin/xymongen --reportopts=2222871640:2222958039:1: /xymon/server/server/www/rep/15963-1665896897
apache 15996 1 43 Oct16 ? 10:22:25 /xymon/server/server/bin/xymongen --reportopts=2222871640:2222958039:1: /xymon/server/server/www/rep/15995-1665896912
apache 16133 1 43 Oct16 ? 10:22:07 /xymon/server/server/bin/xymongen --reportopts=2222871640:2222958039:1: /xymon/server/server/www/rep/16132-1665896933
apache 16149 1 43 Oct16 ? 10:23:37 /xymon/server/server/bin/xymongen --reportopts=2222871640:2222958039:1: /xymon/server/server/www/rep/16148-1665896954
apache 16215 1 43 Oct16 ? 10:23:45 /xymon/server/server/bin/xymongen --reportopts=2222871640:2222958039:1: /xymon/server/server/www/rep/16214-1665896972
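A minimal sketch for pulling the interesting numbers out of a listing like the one above. It rests on two assumptions that the listing suggests but does not confirm: that the last path component is "<number>-<epoch seconds>" recording when the job was launched, and that the values given to --snapshot and --reportopts are Unix timestamps. The helper names (utc, parse_ps_line, SAMPLE) are made up for illustration. Read that way, the output directory of the first process points at a launch time of 05:05:23 UTC on 16 October 2022, while the requested snapshot time decodes to a date in 2040.

```python
from datetime import datetime, timezone

# First line of the process listing in the post.
SAMPLE = (
    "apache 14749 1 44 Oct16 ? 10:28:39 /xymon/server/server/bin/xymongen "
    "--snapshot=2222867979 XYMONGENSNAPOPTS "
    "/xymon/server/server/www/snap/14748-1665896723"
)


def utc(epoch_seconds):
    """Convert Unix epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


def parse_ps_line(line):
    """Extract PID, PPID, mode and the two timestamps from one `ps -ef` line."""
    fields = line.split()
    args = fields[7:]  # command and arguments follow UID PID PPID C STIME TTY TIME
    option = next(a for a in args if a.startswith("--"))
    name, _, value = option.partition("=")
    # Assumption: the first number of the option value is a Unix timestamp.
    requested = int(value.split(":")[0])
    # Assumption: the output directory is named "<number>-<epoch seconds>".
    tag = args[-1].rsplit("/", 1)[-1]
    launched = int(tag.split("-")[1])
    return {
        "pid": int(fields[1]),
        "ppid": int(fields[2]),
        "mode": name.lstrip("-"),
        "launched": utc(launched),
        "requested": utc(requested),
    }


info = parse_ps_line(SAMPLE)
print(info["pid"], info["mode"], info["launched"].isoformat(), info["requested"].year)
```

If those assumptions hold, running every line of the listing through parse_ps_line would show how closely spaced the launches were and whether every request asked for the same far-future time window.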
An strace of the first PID follows (the others all look the same). Note the repeated reads on file descriptor 3:
[root@dcslmonitor 15238]# strace -f -p 14749
Process 14749 attached
read(3, "", 4096) = 0
read(3, "", 4096) = 0
read(3, "", 4096) = 0
read(3, "", 4096) = 0
read(3, "", 4096) = 0
read(3, "", 4096) = 0
read(3, "", 4096) = 0
read(3, "", 4096) = 0
read(3, "", 4096) = 0
read(3, "", 4096) = 0
read(3, "", 4096) = 0
read(3, "", 4096) = 0
read(3, "", 4096) = 0
read(3, "", 4096) = 0
read(3, "", 4096) = 0
read(3, "", 4096) = 0
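For reference when reading the trace: on a regular file, a read() that returns 0 means end-of-file. It returns immediately instead of waiting, so a stream of such calls looks more like a loop retrying at EOF than a process blocked on a lock. The sketch below only demonstrates that behaviour on a throwaway temporary file. The 524-byte size is a stand-in, and nothing here touches Xymon or claims to reproduce what xymongen does internally.

```python
import os
import tempfile


def reads_at_eof(path, attempts=3, size=4096):
    """Read a file to its end, then return the results of further reads."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Consume the file; os.read returns b"" once EOF is reached.
        while os.read(fd, size):
            pass
        # Every further read at EOF returns b"" again without blocking.
        return [os.read(fd, size) for _ in range(attempts)]
    finally:
        os.close(fd)


# Stand-in data file, created only for this demonstration.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 524)
sample = tmp.name

result = reads_at_eof(sample)
os.unlink(sample)
print(result)
```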
fd 3 is the last entry in the open-file listing for that PID:
xymongen 14749 apache cwd DIR 253,0 6 134320195 /xymon/server/data/acks
xymongen 14749 apache rtd DIR 8,2 269 64 /
xymongen 14749 apache txt REG 253,0 1106256 135222190 /xymon/server/server/bin/xymongen
xymongen 14749 apache mem REG 8,6 155784 4448319 /usr/lib64/libselinux.so.1
xymongen 14749 apache mem REG 8,6 109976 4873245 /usr/lib64/libresolv-2.17.so
xymongen 14749 apache mem REG 8,6 15688 4259351 /usr/lib64/libkeyutils.so.1.5
xymongen 14749 apache mem REG 8,6 67104 4471490 /usr/lib64/libkrb5support.so.0.1
xymongen 14749 apache mem REG 8,6 142144 4873243 /usr/lib64/libpthread-2.17.so
xymongen 14749 apache mem REG 8,6 90632 4195838 /usr/lib64/libz.so.1.2.7
xymongen 14749 apache mem REG 8,6 19248 4358022 /usr/lib64/libdl-2.17.so
xymongen 14749 apache mem REG 8,6 210824 4471445 /usr/lib64/libk5crypto.so.3.1
xymongen 14749 apache mem REG 8,6 15920 4939663 /usr/lib64/libcom_err.so.2.1
xymongen 14749 apache mem REG 8,6 967840 4259800 /usr/lib64/libkrb5.so.3.3
xymongen 14749 apache mem REG 8,6 320400 4256684 /usr/lib64/libgssapi_krb5.so.2.2
xymongen 14749 apache mem REG 8,6 2156272 4262067 /usr/lib64/libc-2.17.so
xymongen 14749 apache mem REG 8,6 402384 4259730 /usr/lib64/libpcre.so.1.2.0
xymongen 14749 apache mem REG 8,6 2521008 4256674 /usr/lib64/libcrypto.so.1.0.2k
xymongen 14749 apache mem REG 8,6 470360 4195836 /usr/lib64/libssl.so.1.0.2k
xymongen 14749 apache mem REG 8,6 163312 4448246 /usr/lib64/ld-2.17.so
xymongen 14749 apache 0r FIFO 0,8 0t0 404824379 pipe
xymongen 14749 apache 1w FIFO 0,8 0t0 404824380 pipe
xymongen 14749 apache 2w FIFO 0,8 0t0 404824381 pipe
xymongen 14749 apache 3r REG 253,0 524 67195718 /xymon/server/data/hist/accessntg.sslcert
Every process in the list above has the same file open as fd 3. Are they locking each other out or, more to the point, should they be?
Any ideas on where to look or what to do next?
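One way to confirm the "same file on fd 3" observation for all of the processes at once is to resolve /proc/<pid>/fd/3 for each PID. The sketch below assumes a Linux /proc filesystem and enough privilege to read other users' fd links. The function name and the proc_root parameter are illustrative only. As a general point, several processes holding a regular file open read-only do not block one another on Linux unless they take explicit locks.

```python
import os
from collections import defaultdict


def group_by_fd_target(pids, fd=3, proc_root="/proc"):
    """Map each path open on descriptor `fd` to the PIDs that hold it."""
    groups = defaultdict(list)
    for pid in pids:
        link = os.path.join(proc_root, str(pid), "fd", str(fd))
        try:
            target = os.readlink(link)
        except OSError:
            # Process has exited, the fd is closed, or access was denied.
            continue
        groups[target].append(pid)
    return dict(groups)


# PIDs taken from the process listing; on another machine this prints {}.
print(group_by_fd_target([14749, 14867, 15107]))
```

A single key in the result would mean every listed process is reading the same history file, which would point at the content of that one file as the next thing to inspect.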
Thanks
David Logan
Senior Systems Administrator
Data Centre Services
Department of Corporate and Digital Development | Northern Territory Government
GPO Box 2391, Darwin, NT 0801, Australia