[Xymon] Bug in xymonping reporting wrong data when pinging multiple hosts
Michael Beatty
Michael.Beatty at sherwin.com
Wed Jan 9 13:36:31 CET 2013
I installed fping as well, and it is working fine. For whatever reason it
was installed with rwxr-xr-x permissions, so I had to chmod u+s it to make
it work.
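
For anyone hitting the same permissions problem, here is a minimal Python sketch for checking whether a binary carries the setuid bit. The helper names are mine, and the fping path in the usage note is only an assumption; substitute wherever fping landed on your system.

```python
import os
import stat

def mode_has_setuid(mode):
    """Return True if a numeric file mode includes the setuid bit (chmod u+s)."""
    return bool(mode & stat.S_ISUID)

def has_setuid(path):
    """Return True if the file at `path` has the setuid bit set."""
    return mode_has_setuid(os.stat(path).st_mode)

def describe_mode(path):
    """Return the ls-style permission string for `path`, e.g. '-rwsr-xr-x'."""
    return stat.filemode(os.stat(path).st_mode)
```

With a hypothetical install path, has_setuid("/usr/local/sbin/fping") should be False for a binary installed as rwxr-xr-x and True after chmod u+s. The bit only helps if the file is also owned by root.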
In some more testing, xymonping does correctly report hosts that have
failed. So it is still effective for alerting purposes, but from a
reporting standpoint, not so much.
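
The bad readings in the quoted runs below share a pattern: they sit on whole multiples of 1000 / max-pps milliseconds (1000 and 2000 ms at --max-pps=1, 200 and 400 ms at --max-pps=5, 40 and 80 ms at --max-pps=25). As a rough sketch, not a fix, something like the following Python could flag such readings in xymonping output. The function names and the 0.5 ms tolerance are my own assumptions, and a genuine round-trip time that happens to land on a multiple would be flagged too.

```python
import re

# Matches lines such as "X.X.X.201 is alive (80 ms)", with or without a "> " prefix.
ALIVE_RE = re.compile(r"(\S+) is alive \(([\d.]+) ms\)")

def parse_ping_output(text):
    """Return a list of (host, rtt_ms) tuples found in xymonping-style output."""
    results = []
    for line in text.splitlines():
        match = ALIVE_RE.search(line)
        if match:
            results.append((match.group(1), float(match.group(2))))
    return results

def suspect_hosts(results, max_pps, tolerance_ms=0.5):
    """Return hosts whose RTT is within tolerance of a whole multiple of 1000/max_pps ms."""
    interval_ms = 1000.0 / max_pps
    suspects = []
    for host, rtt in results:
        multiple = round(rtt / interval_ms)
        # Sub-interval readings (multiple == 0) are treated as plausible.
        if multiple >= 1 and abs(rtt - multiple * interval_ms) <= tolerance_ms:
            suspects.append(host)
    return suspects

# Output copied from the --max-pps=25 run quoted in this thread.
sample = """\
X.X.X.137 is alive (765 ms)
X.X.X.138 is alive (896 ms)
X.X.X.201 is alive (80 ms)
X.X.X.202 is alive (40 ms)
X.X.X.70 is alive (40 ms)
X.X.X.22 is alive (0.04 ms)
"""

print(suspect_hosts(parse_ping_output(sample), max_pps=25))
# prints ['X.X.X.201', 'X.X.X.202', 'X.X.X.70']
```

This is only a heuristic for spotting the flat-line values after the fact. It does not explain why xymonping produces them.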
Michael Beatty
On 01/08/2013 06:57 PM, Jeremy Laidman wrote:
> Of course the solution is to use fping. Henrik has previously stated
> that fping is preferred over xymonping
> <http://lists.xymon.com/archive/2012-January/033738.html>, and the
> fping.sh script used when building will warn that "it is not yet fully
> stable".
>
> I've just now installed fping (and configured Xymon to use it) and my
> graphs are now showing much more reasonable values than before.
>
> Nevertheless, it's not obvious in any documentation that xymonping
> will give bad data. The caveats on its use suggest (to me) that it
> can miss some replies when large numbers of hosts are probed, but in
> practice it gives bad data even when the number of hosts is two.
>
> J
>
>
>
> On 9 January 2013 10:39, Jeremy Laidman <jlaidman at rebel-it.com.au> wrote:
>
> Yup, I get this too, tested with v4.3.10 and v4.3.4. It also
> shows up when I ping the localhost address repeatedly:
>
> sudo ./xymon-4.3.4/xymonnet/xymonping 127.0.0.1 127.0.0.1 127.0.0.1 127.0.0.1 127.0.0.1
> 127.0.0.1 is alive (20 ms)
> 127.0.0.1 is alive (0.02 ms)
> 127.0.0.1 is alive (24 ms)
> 127.0.0.1 is alive (0.02 ms)
> 127.0.0.1 is alive (0.02 ms)
>
> The 20ms and 24ms entries are wrong, and they change as I adjust
> the max-pps values, by a factor of 5.
>
> None of my conn graphs seems to be completely flatlined, but I
> have noticed that DNS test times are usually less than conn test
> times, which is a bit odd, but might be unrelated. Hmm, now that
> I look at them, it seems all of my graphs but one are hovering
> close to either 24ms or 48ms. The host that is the exception,
> with a conn graph that looks correct, happens to be the last entry
> if I sort all host IP addresses.
>
> J
>
>
>
> On 9 January 2013 05:49, Michael Beatty <Michael.Beatty at sherwin.com> wrote:
>
> Using Xymon 4.3.7
> OS Linux SuSE
>
> I've been struggling to understand why certain hosts almost
> always report exactly the same ping response time. I've
> determined that xymonping isn't working: it reports incorrect
> data for half of the hosts tested.
>
> I start by pinging 6 hosts, one at a time; everything is correct:
> [xymon at mxbscs tmp]$ /home/xymon/server/bin/xymonping X.X.X.22
> X.X.X.22 is alive (0.06 ms)
> [xymon at mxbscs tmp]$ /home/xymon/server/bin/xymonping X.X.X.70
> X.X.X.70 is alive (0.56 ms)
> [xymon at mxbscs tmp]$ /home/xymon/server/bin/xymonping X.X.X.138
> X.X.X.138 is alive (826 ms)
> [xymon at mxbscs tmp]$ /home/xymon/server/bin/xymonping X.X.X.137
> X.X.X.137 is alive (980 ms)
> [xymon at mxbscs tmp]$ /home/xymon/server/bin/xymonping X.X.X.201
> X.X.X.201 is alive (0.75 ms)
> [xymon at mxbscs tmp]$ /home/xymon/server/bin/xymonping X.X.X.202
> X.X.X.202 is alive (0.66 ms)
>
> Then I put them in the same command; the first, second, and
> fifth values are wrong:
> [xymon at mxbscs tmp]$ /home/xymon/server/bin/xymonping X.X.X.70 X.X.X.22 X.X.X.138 X.X.X.137 X.X.X.201 X.X.X.202
> X.X.X.70 is alive (40 ms)
> X.X.X.22 is alive (20 ms)
> X.X.X.138 is alive (1307 ms)
> X.X.X.137 is alive (1738 ms)
> X.X.X.201 is alive (20 ms)
> X.X.X.202 is alive (0.64 ms)
>
>
> When I switch the order of the pings, the first, second, and fifth
> values are exactly the same as the first time, and still wrong:
> [xymon at mxbscs tmp]$ /home/xymon/server/bin/xymonping X.X.X.201 X.X.X.202 X.X.X.137 X.X.X.138 X.X.X.70 X.X.X.22
> X.X.X.201 is alive (40 ms)
> X.X.X.202 is alive (20 ms)
> X.X.X.137 is alive (1598 ms)
> X.X.X.138 is alive (2069 ms)
> X.X.X.70 is alive (20 ms)
> X.X.X.22 is alive (0.04 ms)
> [xymon at mxbscs tmp]$
>
> When I switch the order again, the third, fourth, and fifth
> values are now the wrong ones:
> [xymon at mxbscs tmp]$ /home/xymon/server/bin/xymonping X.X.X.137 X.X.X.138 X.X.X.201 X.X.X.202 X.X.X.70 X.X.X.22
> X.X.X.137 is alive (1537 ms)
> X.X.X.138 is alive (2016 ms)
> X.X.X.201 is alive (40 ms)
> X.X.X.202 is alive (20 ms)
> X.X.X.70 is alive (20 ms)
> X.X.X.22 is alive (0.06 ms)
>
>
> Another thing I have noticed is that by altering the max-pps
> value, you get completely different results.
> [xymon at mxbscs tmp]$ /home/xymon/server/bin/xymonping X.X.X.137 X.X.X.138 X.X.X.201 X.X.X.202 X.X.X.70 X.X.X.22 --max-pps=1
> X.X.X.137 is alive (2000 ms)
> X.X.X.138 is alive (1000 ms)
> X.X.X.201 is alive (2000 ms)
> X.X.X.202 is alive (1000 ms)
> X.X.X.70 is alive (1000 ms)
> X.X.X.22 is alive (0.06 ms)
>
> [xymon at mxbscs tmp]$ /home/xymon/server/bin/xymonping X.X.X.137 X.X.X.138 X.X.X.201 X.X.X.202 X.X.X.70 X.X.X.22 --max-pps=5
> X.X.X.137 is alive (1500 ms)
> X.X.X.138 is alive (1479 ms)
> X.X.X.201 is alive (400 ms)
> X.X.X.202 is alive (200 ms)
> X.X.X.70 is alive (200 ms)
> X.X.X.22 is alive (0.06 ms)
>
> [xymon at mxbscs tmp]$ /home/xymon/server/bin/xymonping X.X.X.137 X.X.X.138 X.X.X.201 X.X.X.202 X.X.X.70 X.X.X.22 --max-pps=25
> X.X.X.137 is alive (765 ms)
> X.X.X.138 is alive (896 ms)
> X.X.X.201 is alive (80 ms)
> X.X.X.202 is alive (40 ms)
> X.X.X.70 is alive (40 ms)
> X.X.X.22 is alive (0.04 ms)
>
>
> It doesn't appear to be a problem with my configuration. I
> checked the www.xymon.com demo site, and the same issue seems
> to be present there. The signature of the bad data is easy to
> see in the graphs: good data produces a varied line, whereas
> bad data produces a generally flat line.
> These hosts look good:
> http://www.xymon.com/xymon-cgi/svcstatus.sh?HOST=pto.linuxbog.dk&SERVICE=conn
> http://www.xymon.com/xymon-cgi/svcstatus.sh?HOST=dali.hswn.dk&SERVICE=conn
>
> These hosts look bad:
> http://www.xymon.com/xymon-cgi/svcstatus.sh?HOST=blixen.hswn.dk&SERVICE=conn
> http://www.xymon.com/xymon-cgi/svcstatus.sh?HOST=wifi.hswn.dk&SERVICE=conn
>
>
>
> --
> Michael Beatty