[Xymon] Bug in xymonping reporting wrong data when pinging multiple hosts
Michael Beatty
Michael.Beatty at sherwin.com
Tue Jan 8 19:49:01 CET 2013
Using Xymon 4.3.7
OS Linux SuSE
I've been struggling to understand why certain hosts almost always
report the exact same ping response time. I've determined that
xymonping reports incorrect data for about half of the hosts tested
when several hosts are pinged in a single run.
I start by pinging 6 hosts, one at a time. Everything is correct:
[xymon at mxbscs tmp]$ /home/xymon/server/bin/xymonping X.X.X.22
X.X.X.22 is alive (0.06 ms)
[xymon at mxbscs tmp]$ /home/xymon/server/bin/xymonping X.X.X.70
X.X.X.70 is alive (0.56 ms)
[xymon at mxbscs tmp]$ /home/xymon/server/bin/xymonping X.X.X.138
X.X.X.138 is alive (826 ms)
[xymon at mxbscs tmp]$ /home/xymon/server/bin/xymonping X.X.X.137
X.X.X.137 is alive (980 ms)
[xymon at mxbscs tmp]$ /home/xymon/server/bin/xymonping X.X.X.201
X.X.X.201 is alive (0.75 ms)
[xymon at mxbscs tmp]$ /home/xymon/server/bin/xymonping X.X.X.202
X.X.X.202 is alive (0.66 ms)
Then I put them in the same command. The first, second, and fifth values
are wrong:
[xymon at mxbscs tmp]$ /home/xymon/server/bin/xymonping X.X.X.70 X.X.X.22
X.X.X.138 X.X.X.137 X.X.X.201 X.X.X.202
X.X.X.70 is alive (40 ms)
X.X.X.22 is alive (20 ms)
X.X.X.138 is alive (1307 ms)
X.X.X.137 is alive (1738 ms)
X.X.X.201 is alive (20 ms)
X.X.X.202 is alive (0.64 ms)
Switch the order of the pings: the first, second, and fifth values are
exactly the same as the first time, and still wrong.
[xymon at mxbscs tmp]$ /home/xymon/server/bin/xymonping X.X.X.201 X.X.X.202
X.X.X.137 X.X.X.138 X.X.X.70 X.X.X.22
X.X.X.201 is alive (40 ms)
X.X.X.202 is alive (20 ms)
X.X.X.137 is alive (1598 ms)
X.X.X.138 is alive (2069 ms)
X.X.X.70 is alive (20 ms)
X.X.X.22 is alive (0.04 ms)
[xymon at mxbscs tmp]$
Switch the order again, and now the third, fourth, and fifth values are wrong:
[xymon at mxbscs tmp]$ /home/xymon/server/bin/xymonping X.X.X.137
X.X.X.138 X.X.X.201 X.X.X.202 X.X.X.70 X.X.X.22
X.X.X.137 is alive (1537 ms)
X.X.X.138 is alive (2016 ms)
X.X.X.201 is alive (40 ms)
X.X.X.202 is alive (20 ms)
X.X.X.70 is alive (20 ms)
X.X.X.22 is alive (0.06 ms)
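To make the comparison repeatable, here is a minimal sketch that parses xymonping output lines and flags hosts whose time in a combined run is far above their time when pinged alone. The output lines are copied from the one-at-a-time pings and the first combined run above. The helper names and the factor of 10 are my own arbitrary choices for illustration, not anything from Xymon.

```python
import re

# Output lines copied from the transcripts in this post.
SINGLE = """\
X.X.X.22 is alive (0.06 ms)
X.X.X.70 is alive (0.56 ms)
X.X.X.138 is alive (826 ms)
X.X.X.137 is alive (980 ms)
X.X.X.201 is alive (0.75 ms)
X.X.X.202 is alive (0.66 ms)
"""

BATCH = """\
X.X.X.70 is alive (40 ms)
X.X.X.22 is alive (20 ms)
X.X.X.138 is alive (1307 ms)
X.X.X.137 is alive (1738 ms)
X.X.X.201 is alive (20 ms)
X.X.X.202 is alive (0.64 ms)
"""

LINE_RE = re.compile(r"^(\S+) is alive \(([\d.]+) ms\)$")


def parse(text):
    """Map host -> reported round-trip time in milliseconds."""
    result = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            result[m.group(1)] = float(m.group(2))
    return result


def suspicious(single, batch, factor=10.0):
    """Hosts whose combined-run time is at least `factor` times the
    single-host time. The factor is a hypothetical threshold."""
    return sorted(
        h for h in batch
        if h in single and batch[h] >= factor * single[h]
    )


single = parse(SINGLE)
batch = parse(BATCH)
flagged = suspicious(single, batch)
print(flagged)
```

With the data above this flags X.X.X.70, X.X.X.22 and X.X.X.201, which are the first, second, and fifth hosts of that combined run.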
Another thing I have noticed is that by altering the max-pps value, you
get completely different results.
[xymon at mxbscs tmp]$ /home/xymon/server/bin/xymonping X.X.X.137 X.X.X.138
X.X.X.201 X.X.X.202 X.X.X.70 X.X.X.22 --max-pps=1
X.X.X.137 is alive (2000 ms)
X.X.X.138 is alive (1000 ms)
X.X.X.201 is alive (2000 ms)
X.X.X.202 is alive (1000 ms)
X.X.X.70 is alive (1000 ms)
X.X.X.22 is alive (0.06 ms)
[xymon at mxbscs tmp]$ /home/xymon/server/bin/xymonping X.X.X.137 X.X.X.138
X.X.X.201 X.X.X.202 X.X.X.70 X.X.X.22 --max-pps=5
X.X.X.137 is alive (1500 ms)
X.X.X.138 is alive (1479 ms)
X.X.X.201 is alive (400 ms)
X.X.X.202 is alive (200 ms)
X.X.X.70 is alive (200 ms)
X.X.X.22 is alive (0.06 ms)
[xymon at mxbscs tmp]$ /home/xymon/server/bin/xymonping X.X.X.137 X.X.X.138
X.X.X.201 X.X.X.202 X.X.X.70 X.X.X.22 --max-pps=25
X.X.X.137 is alive (765 ms)
X.X.X.138 is alive (896 ms)
X.X.X.201 is alive (80 ms)
X.X.X.202 is alive (40 ms)
X.X.X.70 is alive (40 ms)
X.X.X.22 is alive (0.04 ms)
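One pattern in these numbers: the suspect values look like whole multiples of 1000/max-pps milliseconds (1000 ms at 1 pps, 200 ms at 5 pps, 40 ms at 25 pps). The short sketch below checks that against the three runs above. It is only an observation about the output, not a confirmed diagnosis of what xymonping does internally, and the function name and tolerance are made up for the example.

```python
def is_multiple_of_interval(value_ms, max_pps, tol=1e-6):
    """True if value_ms is a whole, non-zero multiple of the
    send interval implied by max_pps (1000 / max_pps ms)."""
    interval = 1000.0 / max_pps
    ratio = value_ms / interval
    nearest = round(ratio)
    return nearest >= 1 and abs(ratio - nearest) < tol


# Reported times (ms) from the three --max-pps runs in this post,
# in the order the hosts were listed on the command line.
RUNS = {
    1: [2000, 1000, 2000, 1000, 1000, 0.06],
    5: [1500, 1479, 400, 200, 200, 0.06],
    25: [765, 896, 80, 40, 40, 0.04],
}

flags_by_pps = {
    pps: [is_multiple_of_interval(v, pps) for v in values]
    for pps, values in RUNS.items()
}

for pps, flags in flags_by_pps.items():
    print(pps, flags)
```

By the same reasoning, the 40/20/20 ms values in the runs without --max-pps would fit a 20 ms interval, i.e. a default of 50 packets per second. That default is inferred from the numbers here, so check it against the xymonping man page.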
It doesn't appear to be a problem with my configuration. I checked the
www.xymon.com demo site, and there seems to be the same issue there. The
signature of the bad data is easy to see in the graphs: good data has
a varied line, whereas bad data is a generally flat line.
These hosts look good:
http://www.xymon.com/xymon-cgi/svcstatus.sh?HOST=pto.linuxbog.dk&SERVICE=conn
http://www.xymon.com/xymon-cgi/svcstatus.sh?HOST=dali.hswn.dk&SERVICE=conn
These hosts look bad:
http://www.xymon.com/xymon-cgi/svcstatus.sh?HOST=blixen.hswn.dk&SERVICE=conn
http://www.xymon.com/xymon-cgi/svcstatus.sh?HOST=wifi.hswn.dk&SERVICE=conn
--
Michael Beatty