<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">This is probably a sign that you have
stale SysV IPC semaphores left over from the previous crash.<br>
<br>
The fix is to stop all xymon/hobbit processes and then manually remove
the IPC objects owned by the hobbit user. On Linux, you'd run ipcs -a to
find any shared memory segments (and also message queues and semaphore
arrays) owned by the xymon/hobbit user, then use ipcrm to remove them.<br>
<br>
ipcs output on a running system will look something like this:<br>
<pre style="color: rgb(0, 0, 0); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; word-wrap: break-word; white-space: pre-wrap;">------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x010045d6 0          xymon      600        262144     4
0x020045d6 32769      xymon      600        262144     2
0x030045d6 65538      xymon      600        262144     2
0x040045d6 98307      xymon      600        262144     3
0x050045d6 131076     xymon      600        262144     2
0x060045d6 163845     xymon      600        32768      2
0x070045d6 196614     xymon      600        26214400   3
0x080045d6 229383     xymon      600        26214400   2
0x090045d6 262152     xymon      600        131072     1
</pre>
I'm not sure whether the commands are the same on Solaris (I don't have a
Sun box handy at the moment), but once those objects are gone, things
should start back up.<br>
<br>
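The cleanup described above could be sketched as a small shell helper. This is only a sketch under assumptions: it expects the Linux (util-linux) ipcs -a layout shown above, the function name list_ipcrm_cmds is made up for illustration, and the owner argument must be the account your Xymon processes actually run as. It only prints ipcrm commands, so they can be reviewed before anything is removed.<br>
<br>
```shell
#!/bin/sh
# Print (do not run) an ipcrm command for every SysV IPC object owned by one user.
# Reads Linux-style `ipcs -a` output on stdin; the Solaris layout differs (assumption).
# list_ipcrm_cmds is a hypothetical helper name, not part of Xymon.
list_ipcrm_cmds() {
    owner="$1"
    awk -v owner="$owner" '
        # Section headers decide which ipcrm flag applies to the rows that follow.
        /Shared Memory/ { flag = "-m"; next }
        /Semaphore/     { flag = "-s"; next }
        /Message Queue/ { flag = "-q"; next }
        # Data rows look like: key id owner perms ...
        # Column header rows are skipped because their id field is not numeric.
        flag != "" && $3 == owner && $2 ~ /^[0-9]+$/ { print "ipcrm", flag, $2 }
    '
}

# Usage, after stopping every xymon/hobbit process:
#   ipcs -a | list_ipcrm_cmds xymon        # review the generated commands
#   ipcs -a | list_ipcrm_cmds xymon | sh   # run them once they look right
```
<br>
Printing the commands first is deliberate: removing an IPC object that a live process still uses would cause new problems, so it seems safer to inspect the list before piping it to sh.<br>
<br>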
In 4.x this is reduced to an error instead of an abort, but the
behavior is still undefined: with pre-existing semaphores set, xymond
can easily deadlock while it waits for a message to be picked up by a
process that may not exist.<br>
<br>
<br>
HTH,<br>
-jc<br>
<br>
<br>
On 11/18/2016 8:53 AM, Mills,David (HHSC Contractor) wrote:<br>
</div>
<blockquote
cite="mid:SN1PR05MB23044738D6360B19F39874979EB00@SN1PR05MB2304.namprd05.prod.outlook.com"
type="cite">
<style><!--
/* Font Definitions */
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
{mso-style-priority:34;
margin-top:0in;
margin-right:0in;
margin-bottom:0in;
margin-left:.5in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri","sans-serif";}
span.EmailStyle18
{mso-style-type:personal;
font-family:"Calibri","sans-serif";
color:windowtext;}
span.EmailStyle19
{mso-style-type:personal-reply;
font-family:"Calibri","sans-serif";
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style>
<div class="WordSection1">
<p class="MsoNormal"><span style="color:#1F497D">OK… This is
new: more details from xymonlaunch.log file:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">
</span>
<span style="font-family:'Courier New';color:#1F497D">…<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:'Courier New';color:#1F497D"> 2016-11-18 07:13:34
Loading saved state<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:'Courier New';color:#1F497D"> 2016-11-18 07:13:59
Setting up network listener on 0.0.0.0:1984<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:'Courier New';color:#1F497D"> 2016-11-18 07:13:59
Setting up signal handlers<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:'Courier New';color:#1F497D"> 2016-11-18 07:13:59
Setting up xymond channels<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:'Courier New';color:#1F497D"> 2016-11-18 07:13:59
FATAL: xymond sees clientcount 2, should be 0<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:'Courier New';color:#1F497D"> Check for hanging
xymond_channel processes or stale semaphores<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:'Courier New';color:#1F497D"> 2016-11-18 07:13:59
Cannot setup data channel<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:'Courier New';color:#1F497D"> 2016-11-18 07:13:59 Task
xymond terminated, status 1<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:'Courier New';color:#1F497D"> 2016-11-18 07:13:59 Task
xymongen terminated by signal 15<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:'Courier New';color:#1F497D"> 2016-11-18 07:13:59 Task
xymonnet terminated by signal 15<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:'Courier New';color:#1F497D"> 2016-11-18 07:13:59
Loading hostnames<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:'Courier New';color:#1F497D"> 2016-11-18 07:14:39
xgetenv: Cannot find value for variable HOME<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:'Courier New';color:#1F497D"> 2016-11-18 07:17:41
xgetenv: Cannot find value for variable HOME<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:'Courier New';color:#1F497D"> 2016-11-18 07:20:40
xgetenv: Cannot find value for variable HOME<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:'Courier New';color:#1F497D"> 2016-11-18 07:23:30 Task
xymonnetagain terminated, status 208<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:'Courier New';color:#1F497D"> …<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">The above lines
pretty much cycle endlessly.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<div>
<div style="border:none;border-top:solid #B5C4DF
1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span
style="font-size:10.0pt;font-family:'Tahoma','sans-serif'">From:</span></b><span
style="font-size:10.0pt;font-family:'Tahoma','sans-serif'">
Xymon [<a class="moz-txt-link-freetext" href="mailto:xymon-bounces@xymon.com">mailto:xymon-bounces@xymon.com</a>]
<b>On Behalf Of </b>Mills,David (HHSC Contractor)<br>
<b>Sent:</b> Thursday, November 17, 2016 5:17 PM<br>
<b>To:</b> '<a class="moz-txt-link-abbreviated" href="mailto:xymon@xymon.com">xymon@xymon.com</a>'<br>
<b>Subject:</b> [Xymon] xymond not accepting connections<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Hi, all!<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">We have a rather murky situation. A
colleague recently removed the entire Xymon (4.3.3 /
Solaris) server home directory by accident. It was restored from
backups, but since then the server has not been fully
functional. (I don’t know whether our symptoms are related to the
home dir “zap” or to something else…)<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">We periodically run the ghostlist.cgi app
from cron, and these instances now sometimes don’t exit. When I
run truss on them, I see that they are almost continuously calling
brk(), allocating anonymous memory for that instance’s heap.
It has gotten so bad that this server’s resources were
completely depleted, and we have had to turn off the cron jobs.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">The xymond daemon is no longer accepting
connections, despite the fact that this server has been stable
for years.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"> The system was rebooted last
night and seemed to be functioning throughout the night, but it
stopped updating around 7:30 AM.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"> Confirmed xymond is no longer
accepting connections via:<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">
17:03:49 pwsu020:/var/log/xymon> telnet 127.0.0.1 1984<o:p></o:p></p>
<p class="MsoNormal">
Trying 127.0.0.1...<o:p></o:p></p>
<p class="MsoNormal">
telnet: Unable to connect to remote host: Connection refused<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:1.0in"><o:p> </o:p></p>
<p class="MsoNormal">
17:02:53 pwsu020:/var/log/xymon> ps -u hobbit -f<o:p></o:p></p>
<p class="MsoNormal">
UID PID PPID C STIME TTY TIME CMD<o:p></o:p></p>
<p class="MsoNormal">
hobbit 4132 4131 1 17:02:36 ? 0:26 xymond
--pidfile=/var/log/xymon/xymond.pid
--restart=/export/xymon/server/tmp/x<o:p></o:p></p>
<p class="MsoNormal">
hobbit 4288 4279 0 17:03:01 ? 0:01
/usr/bin/perl -w /usr/local/devmon/devmon<o:p></o:p></p>
<p class="MsoNormal">
hobbit 12895 12867 0 10:54:01 pts/5 0:00 -bash<o:p></o:p></p>
<p class="MsoNormal">
hobbit 4278 1908 0 17:03:00 ? 0:00 sh -c
/usr/local/devmon/bin/restart.devmon> /dev/null 2>&1<o:p></o:p></p>
<p class="MsoNormal">
hobbit 12491 5466 0 03:10:04 ? 0:00
/usr/local/apache2/bin/httpd -k start<o:p></o:p></p>
<p class="MsoNormal">
hobbit 12490 5466 0 03:10:04 ? 0:00
/usr/local/apache2/bin/httpd -k start<o:p></o:p></p>
<p class="MsoNormal">
hobbit 12487 5466 0 03:10:04 ? 0:00
/usr/local/apache2/bin/httpd -k start<o:p></o:p></p>
<p class="MsoNormal">
hobbit 15958 5466 0 11:20:04 ? 0:00
/usr/local/apache2/bin/httpd -k start<o:p></o:p></p>
<p class="MsoNormal">
hobbit 15612 5466 0 11:17:01 ? 0:00
/usr/local/apache2/bin/httpd -k start<o:p></o:p></p>
<p class="MsoNormal">
hobbit 17158 5466 0 03:45:03 ? 0:00
/usr/local/apache2/bin/httpd -k start<o:p></o:p></p>
<p class="MsoNormal">
hobbit 12488 5466 0 03:10:04 ? 0:00
/usr/local/apache2/bin/httpd -k start<o:p></o:p></p>
<p class="MsoNormal">
hobbit 12489 5466 0 03:10:04 ? 0:00
/usr/local/apache2/bin/httpd -k start<o:p></o:p></p>
<p class="MsoNormal">
hobbit 4290 4289 0 17:03:01 ? 0:01
/usr/bin/perl -w /usr/local/devmon/devmon<o:p></o:p></p>
<p class="MsoNormal">
hobbit 4257 4143 0 17:02:46 ? 0:00
/home/hobbit/xymon/client/bin/xymon 0.0.0.0 @<o:p></o:p></p>
<p class="MsoNormal">
hobbit 15776 5466 0 11:18:51 ? 0:00
/usr/local/apache2/bin/httpd -k start<o:p></o:p></p>
<p class="MsoNormal">
hobbit 4148 4141 1 17:02:41 ? 0:21
/export/xymon/server/bin/xymonnet --ping --checkresponse
--timeout=10 --dns-tim<o:p></o:p></p>
<p class="MsoNormal">
hobbit 4279 4278 0 17:03:01 ? 0:00 /bin/ksh
/usr/local/devmon/bin/restart.devmon<o:p></o:p></p>
<p class="MsoNormal">
hobbit 4135 4131 0 17:02:41 ? 0:00
xymond_channel --channel=client
--log=/var/log/xymon/clientdata.log xymond_clie<o:p></o:p></p>
<p class="MsoNormal">
hobbit 4140 4131 1 17:02:41 ? 0:21 xymonnet
--report --ping --checkresponse --timeout=10 --dns-timeout=2
--dnslog=<o:p></o:p></p>
<p class="MsoNormal">
hobbit 4141 4131 0 17:02:41 ? 0:00 /bin/sh
/export/xymon/server/ext/xymonnet-again.sh<o:p></o:p></p>
<p class="MsoNormal">
hobbit 4137 4131 0 17:02:41 ? 0:00
xymond_channel --channel=data
--log=/var/log/xymon/rrd-data.log xymond_rrd --rr<o:p></o:p></p>
<p class="MsoNormal">
hobbit 4144 4131 0 17:02:41 ? 0:00
xymond_channel --channel=data --log=/var/log/xymon/data.log
xymond_filestore --<o:p></o:p></p>
<p class="MsoNormal">
hobbit 4131 1 0 17:02:36 ? 0:00
/export/xymon-4.3.3/server/bin/xymonlaunch
--config=/export/xymon-4.3.3/server/<o:p></o:p></p>
<p class="MsoNormal">
hobbit 4136 4131 0 17:02:41 ? 0:00
xymond_channel --channel=status
--log=/var/log/xymon/rrd-status.log xymond_rrd<o:p></o:p></p>
<p class="MsoNormal">
hobbit 4133 4131 0 17:02:41 ? 0:00
xymond_channel --channel=stachg
--log=/var/log/xymon/history.log xymond_history<o:p></o:p></p>
<p class="MsoNormal">
hobbit 12912 12885 0 10:54:09 pts/7 0:00 -bash<o:p></o:p></p>
<p class="MsoNormal">
hobbit 4143 4131 0 17:02:41 ? 0:00 /bin/sh
/export/xymon-4.3.3/client/bin/xymonclient.sh<o:p></o:p></p>
<p class="MsoNormal">
hobbit 4289 4288 0 17:03:01 ? 0:00
/usr/bin/perl -w /usr/local/devmon/devmon<o:p></o:p></p>
<p class="MsoNormal">
hobbit 4139 4131 1 17:02:41 ? 0:21 xymongen
--recentgifs --subpagecolumns=2 --ignorecolumns=files
--tooltips=never<o:p></o:p></p>
<p class="MsoNormal">
hobbit 4134 4131 0 17:02:41 ? 0:00
xymond_channel --channel=page --log=/var/log/xymon/alert.log
xymond_alert --che<o:p></o:p></p>
<p class="MsoNormal">
hobbit 4138 4131 0 17:02:41 ? 0:00
xymond_channel --channel=clichg
--log=/var/log/xymon/hostdata.log xymond_hostda<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">The only other clue I’ve been able to find
is this note in the xymonlaunch.log file:<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">
15:29:24 pwsu020:/var/log/xymon> tail -50f xymonlaunch.log<o:p></o:p></p>
<p class="MsoNormal">
...<o:p></o:p></p>
<p class="MsoNormal">
2016-11-17 13:54:36 xymonlaunch starting<o:p></o:p></p>
<p class="MsoNormal">
2016-11-17 13:54:36 Loading tasklist configuration from
/export/xymon-4.3.3/server/etc/tasks.cfg<o:p></o:p></p>
<p class="MsoNormal">
2016-11-17 13:54:36 Loading hostnames<o:p></o:p></p>
<p class="MsoNormal">
2016-11-17 13:54:41 xgetenv: Cannot find value for variable
HOME<o:p></o:p></p>
<p class="MsoNormal">
2016-11-17 13:57:44 xgetenv: Cannot find value for variable
HOME<o:p></o:p></p>
<p class="MsoNormal">
2016-11-17 14:00:46 xgetenv: Cannot find value for variable
HOME<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:1.0in"><o:p> </o:p></p>
<p class="MsoNormal"> Yet, when I tried this, as well
as grep’ing through xymonlaunch “truss” output for HOME, I see
valid home directory values:<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:1.0in"><o:p> </o:p></p>
<p class="MsoNormal">
13:58:35 pwsu020:~> echo 'echo HOME=$HOME XYMSRV=$XYMSRV
XYMSERVERS=$XYMSERVERS XYMONDPORT=$XYMONDPORT' |
/home/hobbit/xymon/client/bin/xymoncmd<o:p></o:p></p>
<p class="MsoNormal">
2016-11-17 13:58:38 Using default environment file
/export/xymon-4.3.3/client/etc/xymonclient.cfg<o:p></o:p></p>
<p class="MsoNormal">
HOME=/home/hobbit XYMSRV=0.0.0.0 XYMSERVERS=10.235.57.11
10.235.157.56 XYMONDPORT=1984<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Help! <o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">david<o:p></o:p></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:'Courier New'"><o:p> </o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:'Courier New'">~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:'Times New Roman','serif'">David Mills<br>
Systems Administrator</span><br>
<b><i><span style="font-family:'Times New Roman','serif';color:#1F33ED">Northrop
Grumman</span></i></b><span
style="font-family:'Times New Roman','serif'"><br>
(512) 595-1238 (mobile)<o:p></o:p></span></p>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<br>
<br>
<pre wrap="">_______________________________________________
Xymon mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Xymon@xymon.com">Xymon@xymon.com</a>
<a class="moz-txt-link-freetext" href="http://lists.xymon.com/mailman/listinfo/xymon">http://lists.xymon.com/mailman/listinfo/xymon</a>
</pre>
</blockquote>
<p><br>
</p>
</body>
</html>