[Xymon] WAN performance/monitoring

Adam Goryachev mailinglists at websitemanagers.com.au
Thu Jun 5 17:55:36 CEST 2014


On 05/06/14 16:13, Andy Smith wrote:
> Adam Goryachev wrote:
>> Hi All,
>>
>> First some background, then sharing some scripts I've written/used,
>> and finally asking for some advice please.
>>
>> Some time ago I was having a LAN issue (dropped packets) which I wrote
>> a small script to measure, and quantify the problem. (If you can't see
>> the problem, you can't fix it, and you can't prove it is fixed
>> afterwards).
>>
>> All the script did was use fping to ping a group of IP's once per
>> second, then every minute it would record a log of the date/time plus
>> one line for each IP that had one or more dropped packets. This worked
>> nicely for the above purpose, allowing me to easily pinpoint the
>> common machines experiencing the problem, and then eventually solve it.
>>
>> Now I'd like to extend the script to cover my WAN connections, but I
>> also need more information, and don't want to re-invent the wheel. So,
>> I'm looking for suggestions on how to implement what I need, and/or
>> other products that already do this.
> 
> Have you looked at smokeping :-
> 
> http://oss.oetiker.ch/smokeping/
> 
> This has its own presentation and alerting mechanisms, but we have a
> Xymon extension similar to
> https://wiki.xymonton.org/doku.php/monitors:bbsmokeping which integrates
> into a Xymon page so we can manage alerting and history.
> 

Thank you for the suggestion. It does look useful; however, like MRTG
(and perhaps I haven't looked closely enough), it gives a great overview
but not enough detail to "see" transient errors.

In any case, I've modified my current script (and kept backwards
compatibility for the old log file format). It is definitely a lot
slower, but thanks to an off-list tip from someone it will now start the
test at the beginning of every minute, so processing time "doesn't
matter" as long as overall it completes in less than one minute.
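
For anyone wanting to do the same, the minute alignment can be sketched
roughly like this. This is only an illustration, not the code from the
attached script: the function name and the commented-out loop are
made up, and it assumes a date that supports +%s (GNU date does).

```shell
#!/bin/sh
# Seconds from a given epoch time until the next whole-minute boundary.
# Returns 60 (not 0) when already exactly on a boundary, so two cycles
# never start within the same minute.
secs_to_next_minute() {
    now=$1
    echo $(( 60 - now % 60 ))
}

# Hypothetical main loop; run_cycle stands in for the real ping/log step.
# while :; do
#     sleep "$(secs_to_next_minute "$(date +%s)")"
#     run_cycle
# done
```

Because the sleep is recalculated from the clock each time round, a slow
cycle only shortens the following sleep and the start times do not drift.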

I've also added some very basic xymon integration. I think the following
improvements could be made:
1) Look up each IP address using some xymon tool to get the hostname
2) Ask xymon for a list of IPs to test (perhaps using a new tag in the
bb-hosts file)
3) Use a better method to get the xymon environment, possibly even get
xymon to start the script with xymonlaunch like a normal ext script
4) Tidy up the code/optimise to improve efficiency. I make a few calls
to bc for floating point comparison/calculation, but there is probably a
better solution for this
5) Probably a better way to configure the red/yellow/green levels within
xymon instead of hardcoding them in the script. I'm not sure my version of
xymon supports all the new features from the current release (I'm still
on Debian stable, which has 4.3.0~beta2.dfsg-9.1; as an aside, it would be
nice if a newer version could be uploaded to testing for the next release).
6) Use xymon to create the rrd files and graphs of the various values
(max/avg/min/loss). One graph with the first three and a second graph
with the loss value would probably give a good idea of how well the link
is performing.
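
On the bc point, one option might be to let awk do the floating point
comparison and pick the colour in a single call. The sketch below is
only that: the function name, the variable names and the threshold
values are all invented for illustration and are not what the attached
script uses. Taking the thresholds from the environment would at least
avoid hardcoding them.

```shell
#!/bin/sh
# Hypothetical thresholds; override from the environment if desired.
WARN_MS=${WARN_MS:-50}
CRIT_MS=${CRIT_MS:-150}
WARN_LOSS=${WARN_LOSS:-1}
CRIT_LOSS=${CRIT_LOSS:-5}

# link_color AVG_MS LOSS_PCT
# Prints red, yellow or green. awk compares the values as floating
# point numbers, so no call to bc is needed.
link_color() {
    awk -v ms="$1" -v loss="$2" \
        -v wm="$WARN_MS" -v cm="$CRIT_MS" \
        -v wl="$WARN_LOSS" -v cl="$CRIT_LOSS" 'BEGIN {
            # Adding 0 forces a numeric rather than string comparison.
            if (ms + 0 >= cm + 0 || loss + 0 >= cl + 0)
                print "red"
            else if (ms + 0 >= wm + 0 || loss + 0 >= wl + 0)
                print "yellow"
            else
                print "green"
        }'
}
```

For example, "link_color 75.2 0" would report yellow with the defaults
above, since 75.2 ms is over the warning level but under the critical one.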

If anyone has any suggestions or ideas, I'd be happy to hear them.
One thing I'm not sure how to achieve is keeping the right amount of data
so that I can go back to the WAN supplier and say "link X was not
performing satisfactorily at time abc" (e.g. latency too high or packet
loss too high). At the moment the only way I can get that is from the
text log files.
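
One way the text logs could be made easier to search later is to keep
one timestamped CSV record per host per minute. The sketch below
assumes the per-host summary lines that "fping -q -c N" writes to
stderr look like
"host : xmt/rcv/%loss = 60/59/1%, min/avg/max = 0.41/0.52/1.93"
(with the min/avg/max part missing at 100% loss). Other fping versions
may format this differently, and the function name and the CSV layout
are my own invention.

```shell
#!/bin/sh
# fping_to_csv STAMP
# Reads fping summary lines on stdin and prints one CSV record per host:
#   stamp,host,loss_pct,min_ms,avg_ms,max_ms
# The three RTT fields are left empty when no replies were received.
fping_to_csv() {
    awk -v stamp="$1" '
        $3 == "xmt/rcv/%loss" {
            split($5, cnt, "/")          # e.g. "60/59/1%,"
            loss = cnt[3]
            sub(/%.*/, "", loss)         # strip "%" and any trailing comma
            min = avg = max = ""
            if ($6 == "min/avg/max") {
                split($8, rtt, "/")      # e.g. "0.41/0.52/1.93"
                min = rtt[1]; avg = rtt[2]; max = rtt[3]
            }
            printf "%s,%s,%s,%s,%s,%s\n", stamp, $1, loss, min, avg, max
        }'
}
```

A file of such records can then be filtered by timestamp or by host when
putting together evidence for a particular outage.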

Regards,
Adam

-- 
Adam Goryachev
Website Managers
www.websitemanagers.com.au
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pingmon.sh
Type: application/x-shellscript
Size: 3498 bytes
Desc: not available
URL: <http://lists.xymon.com/pipermail/xymon/attachments/20140606/850e3de3/attachment.bin>

