<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:st1="urn:schemas-microsoft-com:office:smarttags" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ascii">
<meta name=Generator content="Microsoft Word 11 (filtered medium)">
<!--[if !mso]>
<style>
v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style>
<![endif]--><o:SmartTagType
namespaceuri="urn:schemas-microsoft-com:office:smarttags" name="PostalCode"/>
<o:SmartTagType namespaceuri="urn:schemas-microsoft-com:office:smarttags"
name="State"/>
<o:SmartTagType namespaceuri="urn:schemas-microsoft-com:office:smarttags"
name="City"/>
<o:SmartTagType namespaceuri="urn:schemas-microsoft-com:office:smarttags"
name="Street"/>
<o:SmartTagType namespaceuri="urn:schemas-microsoft-com:office:smarttags"
name="address"/>
<o:SmartTagType namespaceuri="urn:schemas-microsoft-com:office:smarttags"
name="place"/>
<o:SmartTagType namespaceuri="urn:schemas-microsoft-com:office:smarttags"
name="PersonName"/>
<!--[if !mso]>
<style>
st1\:*{behavior:url(#default#ieooui) }
</style>
<![endif]-->
<style>
<!--
/* Font Definitions */
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman";}
a:link, span.MsoHyperlink
{color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{color:blue;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal-reply;
font-family:Arial;
color:navy;}
@page Section1
{size:8.5in 11.0in;
margin:1.0in 1.25in 1.0in 1.25in;}
div.Section1
{page:Section1;}
-->
</style>
</head>
<body lang=EN-US link=blue vlink=blue>
<div class=Section1>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>I had this problem and then made the
adjustment. Since then, I get a 5-minute hole in the load average and a couple
of other trends, even though on the Solaris systems I have no problems using
the multi-cpu and zone process. Most of the time when the hole shows up, I get
another missing 5-minute stat exactly one hour after the first one, and this
repeats two or three times. I have tried disabling the caching, but it made no
difference. The 4.3.0-2 beta seems to be very broken and no one knows why.
Right now, I am trying to determine whether I am better off with another
product, since these issues do not seem to be a priority for
anyone.<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>Tom<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'><o:p> </o:p></span></font></p>
<div>
<div class=MsoNormal align=center style='text-align:center'><font size=3
face="Times New Roman"><span style='font-size:12.0pt'>
<hr size=2 width="100%" align=center tabindex=-1>
</span></font></div>
<p class=MsoNormal><b><font size=2 face=Tahoma><span style='font-size:10.0pt;
font-family:Tahoma;font-weight:bold'>From:</span></font></b><font size=2
face=Tahoma><span style='font-size:10.0pt;font-family:Tahoma'> Chris Naude
[mailto:chris.naude.0@gmail.com] <br>
<b><span style='font-weight:bold'>Sent:</span></b> Monday, January 18, 2010
6:47 PM<br>
<b><span style='font-weight:bold'>To:</span></b> <st1:PersonName w:st="on">hobbit@hswn.dk</st1:PersonName><br>
<b><span style='font-weight:bold'>Subject:</span></b> Re: [hobbit] False
Process Down Alerts</span></font><o:p></o:p></p>
</div>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'><o:p> </o:p></span></font></p>
<div>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'>I had never received any alerts about messages being truncated. Only
after disabling the non-prod clients did I start receiving alerts about the
messages being truncated. I adjusted these values as specified below and they
are good now. Tomorrow I'll enable the non-prod servers again and see whether
this was the original culprit. Thanks!<o:p></o:p></span></font></p>
</div>
<div>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'><o:p> </o:p></span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-bottom:12.0pt'><font size=3
face="Times New Roman"><span style='font-size:12.0pt'><o:p> </o:p></span></font></p>
<div>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'>On Mon, Jan 18, 2010 at 12:41 PM, Williams, Doug (Consultant-RIC) <<a
href="mailto:Doug.Williams@rhd.com">Doug.Williams@rhd.com</a>> wrote:<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'>It seems to me your clients' data is being truncated. Try modifying
these settings in<br>
your hobbitserver.cfg. You may want to set them to a size appropriate for<br>
your xymon server. I have xymon running on pretty beefy servers, so I<br>
set these incredibly high; they may even exceed what xymon<br>
actually allows, but it is not hurting me. Restart the hobbit server after<br>
changing hobbitserver.cfg.<br>
<br>
<br>
<br>
MAXMSG_STATUS=30000000<br>
MAXMSG_CLIENT=30000000<br>
MAXMSG_DATA=30000000<o:p></o:p></span></font></p>
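<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'>One rough way to judge whether a client message is large enough to be
cut off is to compare its size with the configured limit. The sketch below is
only an illustration, not part of Xymon: the helper names and host names are
hypothetical, and it assumes the MAXMSG_* values are expressed in kB, which is
worth verifying against the hobbitserver.cfg man page for your
version.<o:p></o:p></span></font></p>
<pre>
```python
# Hypothetical helpers, not part of Xymon.
# Assumption: the MAXMSG_* limits are expressed in kB.


def exceeds_limit(message: bytes, limit_kb: int) -> bool:
    """Return True if the message is larger than limit_kb kilobytes."""
    return len(message) > limit_kb * 1024


def report(messages: dict, limit_kb: int) -> list:
    """Return the sorted host names whose captured message exceeds the limit."""
    return sorted(
        host for host, msg in messages.items() if exceeds_limit(msg, limit_kb)
    )


if __name__ == "__main__":
    # Made-up sample data standing in for captured client messages.
    samples = {
        "hostA.example.com": b"x" * (600 * 1024),
        "hostB.example.com": b"x" * (100 * 1024),
    }
    # Lists the hosts whose message would not fit in a 512 kB limit.
    print(report(samples, 512))
```
</pre>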
<div>
<p class=MsoNormal style='margin-bottom:12.0pt'><font size=3
face="Times New Roman"><span style='font-size:12.0pt'><br>
<br>
-----Original Message-----<br>
From: Chris Naude [mailto:<a href="mailto:chris.naude.0@gmail.com">chris.naude.0@gmail.com</a>]<br>
Sent: Monday, January 18, 2010 2:21 PM<br>
To: <a href="mailto:hobbit@hswn.dk">hobbit@hswn.dk</a><br>
Subject: Re: [hobbit] False Process Down Alerts<br>
<br>
I've managed to stop the flood of false alerts. I removed all of my<br>
non-prod clients from the bb-hosts and shut off their client processes.<br>
The problem seems to be somehow related to the amount of data the Xymon<br>
server is trying to process.<br>
<br>
<br>
On Sun, Jan 17, 2010 at 5:08 PM, Chris Naude <<a
href="mailto:chris.naude.0@gmail.com">chris.naude.0@gmail.com</a>><br>
wrote:<br>
<br>
<br>
I have 7 clients running. Each client has a
different name. They<br>
are all sending data to the primary Xymon server. The alerts are reading<br>
missing processes, full file systems, and msgs errors. Here is another<br>
sample of an unusual error. You can see the process list has a funky<br>
break in it.<br>
<br>
<br>
Sun Jan 17 15:40:18 MST 2010 - Processes NOT ok<o:p></o:p></span></font></p>
</div>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'> yellow<<a
href="http://unixadmin.bestwestern.com/xymon/gifs/yellow.gif" target="_blank">http://unixadmin.bestwestern.com/xymon/gifs/yellow.gif</a>><o:p></o:p></span></font></p>
<div>
<div>
<p class=MsoNormal style='margin-bottom:12.0pt'><font size=3
face="Times New Roman"><span style='font-size:12.0pt'>Expected string COMMAND
not found in ps output header<br>
<br>
PID PPID USER<br>
STIM] S PRI
%CPU TIME VSZ COMMAND<br>
0 0 root
Dec 14 S 127 0.16 00:40:00 0<br>
swapper<br>
1 0 root
Dec 14 R 152 0.09 00:01:21 2064 init<br>
48 0 root
Dec 14 S 152 0.00 00:00:00 0<br>
net_str_cached<br>
45 0 root
Dec 14 S 152 0.00 00:00:00 0<br>
net_str_cached<br>
42 0 root
Dec 14 S 152 0.00 00:00:00 0<br>
net_str_cached<br>
31 0 root
Dec 14 S 152 0.00 00:00:00 0<br>
net_str_cached<br>
30 0 root
Dec 14 S 152 0.00 00:00:00 0<br>
net_str_cached<br>
29 0 root
Dec 14 S 152 0.00 00:00:00 0<br>
net_str_cached<br>
28 0 root
Dec 14 S 152 0.00 00:00:00 0<br>
net_str_cached<br>
26 0 root
Dec 14 S 152 0.00 00:00:00 0<br>
net_str_cached<br>
5 0 root
Dec 14 R 152 0.00 00:00:02 0<br>
signald<br>
6 0 root
Dec 14 R 152 0.00 00:00:03 0<br>
kmemdaemon<br>
17 0 root
Dec 14 S 152 0.00 00:00:00 0<br>
net_str_cached<br>
16 0 root
Dec 14 S 152 0.00 00:00:00 0<br>
net_str_cached<br>
15 0 root
Dec 14 S 152 0.00 00:00:00 0<br>
net_str_cached<br>
14 0 root
Dec 14 S 152 0.00 00:00:00 0<br>
net_str_cached<br>
13 0 root
Dec 14 S 152 0.00 00:00:00 0<br>
net_str_cached<br>
12 0 root
Dec 14 S 152 0.00 00:00:00 0<br>
usbhubd<br>
11 0 root
Dec 14 R 152 0.00 00:01:11 0<br>
escsid<br>
10 0 root
Dec 14 S -32 0.00 00:00:00 0 ttisr<br>
9 0 root
Dec 14 R 152 0.00 00:01:27 0<br>
ksyncer_daemon<br>
<br>
7 0]root Dec 14
R 152<br>
0.00 00:]0:00 0 kai_daemon<br>
50 0 root
Dec 14 S 152 0.00 00:00:00 0<br>
net_str_cached<br>
47 0 root
Dec 14 S 152 0.00 00:00:00 0<br>
net_str_cached<br>
44 0 root
Dec 14 S 152 0.00 00:00:00 0<br>
net_str_cached<br>
41 0 root
Dec 14 S 152 0.00 00:00:00 0<br>
net_str_cached<br>
<br>
On Sun, Jan 17, 2010 at 4:21 PM, Josh Luthman<br>
<<a href="mailto:josh@imaginenetworksllc.com">josh@imaginenetworksllc.com</a>>
wrote:<br>
<br>
<br>
Is there only one client
sending data as this name? I<br>
don't think you answered Lars' email.<br>
<br>
What does the alert read
and what does the data say?<br>
Missing process? Too high of a load?<br>
<br>
Josh Luthman<br>
Office: 937-552-2340<br>
Direct: 937-552-2343<br>
<st1:Street w:st="on"><st1:address
w:st="on">1100 Wayne St</st1:address></st1:Street><br>
<st1:address w:st="on"><st1:Street
w:st="on">Suite</st1:Street> 1337</st1:address><br>
<st1:place w:st="on"><st1:City
w:st="on">Troy</st1:City>, <st1:State w:st="on">OH</st1:State> <st1:PostalCode
w:st="on">45373</st1:PostalCode></st1:place><br>
<br>
"The secret to
creativity is knowing how to hide your<br>
sources."<br>
--- Albert Einstein<br>
<br>
<br>
<br>
On Sun, Jan 17, 2010 at
6:11 PM, Chris Naude<br>
<<a href="mailto:chris.naude.0@gmail.com">chris.naude.0@gmail.com</a>>
wrote:<br>
<br>
<br>
The problem has suddenly become much much worse.<br>
I verified with tcpdump that the data coming from the client is 100%<br>
correct. It seems something on the Xymon server side is not handling the<br>
client data correctly. Anyone have any other ideas?<o:p></o:p></span></font></p>
</div>
</div>
<div>
<div>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'>
red 89% /testdb3 (37771472% used) has<br>
reached the PANIC level (95%)<br>
<br>
Filesystem 1024-blocks
Used<br>
Available Capacity Mounted on<br>
/dev/vgtestdb1/lvol1 107844344 70901816<br>
36942528 66% /testdb1<br>
/dev/vgtestdb2/lvol1 35962064 25453128<br>
10508936 71% /testdb2<br>
/dev/vgtestdb4/lvol1 970909400 825006344<br>
145903056 85% /testdb4<br>
/dev/vgtestdb3/lv<br>
l1 ] 338788224 301016752 37771472 89%<br>
/testdb3<br>
/dev/vgtestdb5/lvol1 179789048 150553912<br>
29235136 84% /testdb5<br>
/dev/vg00/lvol8 24580711 74501 24506210<br>
1% /home<br>
/dev/vg00/lvol4 10226680 6339283 3887397<br>
62% /opt<br>
<br>
<br>
On Sat, Jan 16, 2010 at 10:44 AM, Chris Naude<br>
<<a href="mailto:chris.naude.0@gmail.com">chris.naude.0@gmail.com</a>>
wrote:<br>
<br>
<br>
That makes a lot of sense. I did have<br>
some issues with the startup scripts on HP-UX. I'll check it out later<br>
tonight. Hopefully I can get it fixed before it goes live tonight.<br>
Thanks!<br>
<br>
<br>
On Sat, Jan 16, 2010 at 7:56 AM, Lars<br>
Ebeling <<a href="mailto:lars.ebeling@leopg9.no-ip.org">lars.ebeling@leopg9.no-ip.org</a>>
wrote:<br>
<br>
<br>
It looks like two
instances of<br>
the client are writing to the file at the same time or almost ;)<br>
<br>
<br>
Lars<br>
<br>
----- Original Message<br>
-----<br>
From: Chris Naude<o:p></o:p></span></font></p>
</div>
</div>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'><mailto:<a href="mailto:chris.naude.0@gmail.com">chris.naude.0@gmail.com</a>><o:p></o:p></span></font></p>
<div>
<div>
<p class=MsoNormal style='margin-bottom:12.0pt'><font size=3
face="Times New Roman"><span style='font-size:12.0pt'>
To: <a
href="mailto:hobbit@hswn.dk">hobbit@hswn.dk</a><br>
Sent: Saturday, January<br>
16, 2010 4:59 AM<br>
Subject: [hobbit] False<br>
Process Down Alerts<br>
<br>
I've run into a strange<br>
problem with my Xymon server. I noticed today that I'm receiving random<br>
false alerts for processes being down. When I look at the process list<br>
output in the alert it looks as if the data coming from the clients<br>
isn't correct. Here is an example. Has anyone seen anything like this?<br>
<br>
9613 1944 root<br>
Jan 11 S 154 0.00 00:00:00 6128 cmclconfd -c<br>
10389 1944 root<br>
Jan 11 S 154 0.00 00:00:00 6128 cmclconfd -c<br>
9794 1 oracle<br>
10:55:57 S 154 0.00 00:00:0<br>
217600]oracleTEST<br>
(LOCAL=NO)<br>
1592 1 oracle<br>
Jan 11 S 154 0.00 00:00:11 217136 ora_mman_TEST<br>
12751 1944 root<br>
Jan 11 S 154 0.00 00:00:00 6128 cmclconfd -c<br>
8965 1944 root<br>
Jan 11 S 154 0.00 00:00:00 6128 cmclconfd -c<br>
<br>
11819 1 oracle<br>
Jan 12 S 154 0.00 00:00:07 217280 ora_j015_TEST<br>
2711 1 roo<br>
]ec 4 S 120<br>
0.04 00:02:16 868 /usr/sbin/xntpd<br>
3547 1 xymon<br>
Dec 4 S 168 0.00 00:00:43 268 /opt/xymon/client/bin/hobbitlaunch<br>
--config=/opt/xymon/client/etc/clientlaunch.cfg<br>
--log=/opt/xymon/client/logs/clientlaunch.log<br>
--pidfile=/opt/xymon/client/logs/clientlaunch.101.example.com.pid<br>
3728 1 root<br>
Dec 4 R 152 0.00 00:00:37 4208<br>
/usr/sbin/stm/uut/bin/tools/monitor/WbemWrapperMonitor<br>
<br>
<br>
Xymon version:<br>
4.3.0-0.beta2<br>
Xymon server: CentOS 5.4<br>
32 bit<br>
<br>
Client: HP-UX 11.31<br>
Itanium<br>
<br>
--<br>
Chris Naude<br>
<o:p></o:p></span></font></p>
</div>
</div>
</div>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'><br>
<br clear=all>
<br>
-- <br>
Chris Naude<o:p></o:p></span></font></p>
</div>
</div>
</body>
</html>