[Xymon] xymond not accepting connections

Mills,David (HHSC Contractor) David.Mills at hhsc.state.tx.us
Fri Nov 18 17:53:53 CET 2016


OK... This is new. More details from the xymonlaunch.log file:

                                ...
           2016-11-18 07:13:34 Loading saved state
           2016-11-18 07:13:59 Setting up network listener on 0.0.0.0:1984
           2016-11-18 07:13:59 Setting up signal handlers
           2016-11-18 07:13:59 Setting up xymond channels
           2016-11-18 07:13:59 FATAL: xymond sees clientcount 2, should be 0
           Check for hanging xymond_channel processes or stale semaphores
           2016-11-18 07:13:59 Cannot setup data channel
           2016-11-18 07:13:59 Task xymond terminated, status 1
           2016-11-18 07:13:59 Task xymongen terminated by signal 15
           2016-11-18 07:13:59 Task xymonnet terminated by signal 15
           2016-11-18 07:13:59 Loading hostnames
           2016-11-18 07:14:39 xgetenv: Cannot find value for variable HOME
           2016-11-18 07:17:41 xgetenv: Cannot find value for variable HOME
           2016-11-18 07:20:40 xgetenv: Cannot find value for variable HOME
           2016-11-18 07:23:30 Task xymonnetagain terminated, status 208
           ...
The above lines pretty much cycle endlessly.
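
For anyone following along, the check that the FATAL hint asks for (hanging xymond_channel processes or stale semaphores) can be sketched roughly as below. This is only a sketch under assumptions: it expects Solaris-style "ipcs -s" rows (type, ID, key, mode, owner, group), the helper name sem_ids_for_user is made up for illustration, and it needs an awk that supports -v (nawk or /usr/xpg4/bin/awk on Solaris).

```shell
# List semaphore IDs owned by one user, reading `ipcs -s` output on stdin.
# Assumption: Solaris-style rows of the form
#   s <ID> <KEY> <MODE> <OWNER> <GROUP>
# Header lines do not start with a lone "s", so they are skipped.
sem_ids_for_user() {
    awk -v owner="$1" '$1 == "s" && $5 == owner { print $2 }'
}

# Typical use (left commented out so nothing is removed by accident):
#   ps -u hobbit -f | grep '[x]ymond_channel'   # leftover channel workers?
#   ipcs -s | sem_ids_for_user hobbit           # candidate stale semaphores
#   ipcrm -s <ID>                               # remove one, with Xymon fully stopped
```

The removal step is deliberately manual: semaphores owned by the same user may belong to something other than Xymon, so each ID should be checked before it is removed.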

From: Xymon [mailto:xymon-bounces at xymon.com] On Behalf Of Mills,David (HHSC Contractor)
Sent: Thursday, November 17, 2016 5:17 PM
To: 'xymon at xymon.com'
Subject: [Xymon] xymond not accepting connections

Hi, all!

We have a rather murky situation. A colleague recently removed the entire Xymon (4.3.3 / Solaris) server home directory by accident. It was restored from backups, but the server has not been fully functional since. (I don't know whether our symptoms are related to the home directory "zap" or to something else.)

We periodically run the ghostlist.cgi app from cron, and these instances now sometimes don't exit. When I run truss on them, I see they are almost continuously calling brk(), allocating anonymous memory for that instance's heap. It got so bad that this server's resources were completely depleted, and we have had to turn off the cron jobs.

The xymond daemon is no longer accepting connections, despite the fact that this server has been stable for years.

            The system was rebooted last night and seemed to function through the night, but it stopped updating around 7:30 AM.

            Confirmed xymond is no longer accepting connections via:

                                                17:03:49 pwsu020:/var/log/xymon> telnet 127.0.0.1 1984
                                                Trying 127.0.0.1...
                                                telnet: Unable to connect to remote host: Connection refused

                                                17:02:53 pwsu020:/var/log/xymon> ps -u hobbit -f
                                                                UID   PID  PPID   C    STIME TTY         TIME CMD
                                                  hobbit  4132  4131   1 17:02:36 ?           0:26 xymond --pidfile=/var/log/xymon/xymond.pid --restart=/export/xymon/server/tmp/x
                                                  hobbit  4288  4279   0 17:03:01 ?           0:01 /usr/bin/perl -w /usr/local/devmon/devmon
                                                  hobbit 12895 12867   0 10:54:01 pts/5       0:00 -bash
                                                  hobbit  4278  1908   0 17:03:00 ?           0:00 sh -c /usr/local/devmon/bin/restart.devmon> /dev/null 2>&1
                                                  hobbit 12491  5466   0 03:10:04 ?           0:00 /usr/local/apache2/bin/httpd -k start
                                                  hobbit 12490  5466   0 03:10:04 ?           0:00 /usr/local/apache2/bin/httpd -k start
                                                  hobbit 12487  5466   0 03:10:04 ?           0:00 /usr/local/apache2/bin/httpd -k start
                                                  hobbit 15958  5466   0 11:20:04 ?           0:00 /usr/local/apache2/bin/httpd -k start
                                                  hobbit 15612  5466   0 11:17:01 ?           0:00 /usr/local/apache2/bin/httpd -k start
                                                  hobbit 17158  5466   0 03:45:03 ?           0:00 /usr/local/apache2/bin/httpd -k start
                                                  hobbit 12488  5466   0 03:10:04 ?           0:00 /usr/local/apache2/bin/httpd -k start
                                                  hobbit 12489  5466   0 03:10:04 ?           0:00 /usr/local/apache2/bin/httpd -k start
                                                  hobbit  4290  4289   0 17:03:01 ?           0:01 /usr/bin/perl -w /usr/local/devmon/devmon
                                                  hobbit  4257  4143   0 17:02:46 ?           0:00 /home/hobbit/xymon/client/bin/xymon 0.0.0.0 @
                                                  hobbit 15776  5466   0 11:18:51 ?           0:00 /usr/local/apache2/bin/httpd -k start
                                                  hobbit  4148  4141   1 17:02:41 ?           0:21 /export/xymon/server/bin/xymonnet --ping --checkresponse --timeout=10 --dns-tim
                                                  hobbit  4279  4278   0 17:03:01 ?           0:00 /bin/ksh /usr/local/devmon/bin/restart.devmon
                                                  hobbit  4135  4131   0 17:02:41 ?           0:00 xymond_channel --channel=client --log=/var/log/xymon/clientdata.log xymond_clie
                                                  hobbit  4140  4131   1 17:02:41 ?           0:21 xymonnet --report --ping --checkresponse --timeout=10 --dns-timeout=2 --dnslog=
                                                  hobbit  4141  4131   0 17:02:41 ?           0:00 /bin/sh /export/xymon/server/ext/xymonnet-again.sh
                                                  hobbit  4137  4131   0 17:02:41 ?           0:00 xymond_channel --channel=data --log=/var/log/xymon/rrd-data.log xymond_rrd --rr
                                                  hobbit  4144  4131   0 17:02:41 ?           0:00 xymond_channel --channel=data --log=/var/log/xymon/data.log xymond_filestore --
                                                  hobbit  4131     1   0 17:02:36 ?           0:00 /export/xymon-4.3.3/server/bin/xymonlaunch --config=/export/xymon-4.3.3/server/
                                                  hobbit  4136  4131   0 17:02:41 ?           0:00 xymond_channel --channel=status --log=/var/log/xymon/rrd-status.log xymond_rrd
                                                  hobbit  4133  4131   0 17:02:41 ?           0:00 xymond_channel --channel=stachg --log=/var/log/xymon/history.log xymond_history
                                                  hobbit 12912 12885   0 10:54:09 pts/7       0:00 -bash
                                                  hobbit  4143  4131   0 17:02:41 ?           0:00 /bin/sh /export/xymon-4.3.3/client/bin/xymonclient.sh
                                                  hobbit  4289  4288   0 17:03:01 ?           0:00 /usr/bin/perl -w /usr/local/devmon/devmon
                                                  hobbit  4139  4131   1 17:02:41 ?           0:21 xymongen --recentgifs --subpagecolumns=2 --ignorecolumns=files --tooltips=never
                                                  hobbit  4134  4131   0 17:02:41 ?           0:00 xymond_channel --channel=page --log=/var/log/xymon/alert.log xymond_alert --che
                                                  hobbit  4138  4131   0 17:02:41 ?           0:00 xymond_channel --channel=clichg --log=/var/log/xymon/hostdata.log xymond_hostda

The only other clue I've been able to find is this note in the xymonlaunch.log file:

                                                15:29:24 pwsu020:/var/log/xymon> tail -50f xymonlaunch.log
                                                ...
                                                2016-11-17 13:54:36 xymonlaunch starting
                                                2016-11-17 13:54:36 Loading tasklist configuration from /export/xymon-4.3.3/server/etc/tasks.cfg
                                                2016-11-17 13:54:36 Loading hostnames
                                                2016-11-17 13:54:41 xgetenv: Cannot find value for variable HOME
                                                2016-11-17 13:57:44 xgetenv: Cannot find value for variable HOME
                                                2016-11-17 14:00:46 xgetenv: Cannot find value for variable HOME

            Yet when I run the check below, and when I grep the truss output of xymonlaunch for HOME, I see valid home directory values:

                                                13:58:35 pwsu020:~> echo 'echo HOME=$HOME XYMSRV=$XYMSRV XYMSERVERS=$XYMSERVERS XYMONDPORT=$XYMONDPORT' | /home/hobbit/xymon/client/bin/xymoncmd
                                                2016-11-17 13:58:38 Using default environment file /export/xymon-4.3.3/client/etc/xymonclient.cfg
                                                HOME=/home/hobbit XYMSRV=0.0.0.0 XYMSERVERS=10.235.57.11 10.235.157.56 XYMONDPORT=1984
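
One caveat on this check: it runs from a login shell, which already has HOME set, so it may not reflect the environment that xymonlaunch itself was started with. A possible way to look at the running process instead is sketched below. It assumes Solaris "pargs -e" output in the "envp[N]: NAME=value" form and an awk that supports -v and sub() (nawk or /usr/xpg4/bin/awk on Solaris). The helper name env_value_from_pargs is hypothetical.

```shell
# Print the value of one variable from `pargs -e <pid>` output on stdin.
# Assumption: environment lines look like "envp[3]: HOME=/home/hobbit".
# Returns non-zero if the variable is not present at all.
env_value_from_pargs() {
    awk -v name="$1" '
        /^envp\[[0-9]+\]: / {
            sub(/^envp\[[0-9]+\]: /, "")        # strip the "envp[N]: " prefix
            eq = index($0, "=")                 # split NAME=value at first "="
            if (eq > 0 && substr($0, 1, eq - 1) == name) {
                print substr($0, eq + 1)
                found = 1
            }
        }
        END { exit (found ? 0 : 1) }
    '
}

# Example (4131 is the xymonlaunch PID in the ps listing above):
#   pargs -e 4131 | env_value_from_pargs HOME || echo "HOME not set for xymonlaunch"
```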

Help!

david

~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~
David Mills
Systems Administrator
Northrop Grumman
(512) 595-1238 (mobile)
