[Xymon] Monitoring iostat performance

Jeremy Laidman jlaidman at rebel-it.com.au
Thu Jan 30 01:55:34 CET 2014


On 30 January 2014 09:55, Lists <lists at benjamindsmith.com> wrote:

> Recently, we had a publicly visible outage as a result of one of our load
> balancers exceeding the IOPS capability of its system drives.


Ouch!


> We'd like to extend xymon (currently installed on CentOS6 /32 with
> defaults) so that it can monitor IOPS for all servers.
>

I like this idea.  I looked into this quite a while ago, but really only
scratched the surface.


> Specifically, we'd like to see wrqm/s and probably %util. What's the most
> straightforward way to accomplish this? The other alternative is to create
> some form of internal script, which is doable but not preferable if there's
> an off-the-shelf tool available.
>

Whether as an add-on or a new Xymon feature, this would almost certainly
require a new section in the client data.  There's already an [iostatdisk]
section used by Solaris and an [iostat] section used by "larrd", although
the format of the latter is a bit funky.  So you could replicate either of
these for Linux by adding something like this to xymonclient-linux.sh:

    nohup sh -c "iostat -x 300 2 1>$XYMONTMP/xymon_iostatdisk.$MACHINEDOTS.$$ 2>&1; \
        mv $XYMONTMP/xymon_iostatdisk.$MACHINEDOTS.$$ \
           $XYMONTMP/xymon_iostatdisk.$MACHINEDOTS" </dev/null >/dev/null 2>&1 &
    if test -f $XYMONTMP/xymon_iostatdisk.$MACHINEDOTS; then
        echo "[iostatdisk]"
        cat $XYMONTMP/xymon_iostatdisk.$MACHINEDOTS
        rm -f $XYMONTMP/xymon_iostatdisk.$MACHINEDOTS
    fi

We might want "-kx" rather than "-x", depending on potential uses, but that
makes no difference for %util and wrqm/s.  Adding "-N" (to translate device
names to LVM mappings) might also be useful.
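For reference, here's a hypothetical sample of the extended output and how
the two columns you care about could be pulled out.  The column order
follows sysstat's "iostat -x" on Linux; the device figures are invented:

    # Hypothetical one-device sample of "iostat -x" extended output; the
    # numbers are made up, the column order follows sysstat on Linux.
    sample='Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
    sda 0.01 3.25 0.40 2.10 12.00 48.00 24.00 0.05 1.80 0.60 0.15'

    # Pull out wrqm/s (field 3) and %util (last field) for each device row.
    echo "$sample" | awk 'NR>1 { printf "%s wrqm/s=%s %%util=%s\n", $1, $3, $NF }'
    # → sda wrqm/s=3.25 %util=0.15
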

The Xymon parsing code supports only Solaris, and it isn't readily
extensible.  For other client data sections, the parsing code typically has
a case statement that selects the OS and parses accordingly; that's not the
case for iostatdisk or iostat.

In fact, the function that does the parsing - do_iostatdisk_rrd() - is
never called anywhere, so there's a fair bit of work required within Xymon
to get it working.  I'd suggest we get the client side going first, then
write a server-side ext script to emulate the parsing code (feeding into a
trends message for RRD), and then start work on core support for iostatdisk
within xymond.
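As a rough sketch of that ext-script step: the host name and the NCV-style
field names below are my own invention, and a real script would of course
parse the values out of the client's [iostatdisk] section rather than
hard-coding them.  The idea is just to boil the figures down into a "data"
message:

    #!/bin/sh
    # Sketch only: pretend these values were parsed from a host's
    # [iostatdisk] client data section.
    HOST="myhost.example.com"      # hypothetical host name
    WRQM="3.25"
    UTIL="0.15"

    # Build an NCV-style "data" message that an RRD handler could consume.
    MSG="data ${HOST}.iostat

    wrqm : $WRQM
    util : $UTIL"

    # A real ext script would send it with something like:
    #   $XYMON $XYMSRV "$MSG"
    echo "$MSG"
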

It's probably a bit more complicated than that.  Henrik may have a vision
for universal support of I/O statistics that is incompatible with what I'm
proposing.  We would also probably want to maintain compatibility with the
existing [iostat] graph.cfg definition (the only one that uses the
iostat/iostatdisk results), which means creating RRD files consistent with
the DS names and purposes already in use.  We may also find that the
metrics we want to graph are inconsistent with those already defined for
the existing Solaris support.  And we'd need to define a new graph to show
the numbers you're interested in, because the [iostat] graph only shows
active/wait service times and %busy.  I think %busy is analogous to %util.

Implementing this kind of thing in such a way that it supports the majority
of OSes, without too much effort, and without significant conflicts, is
quite a challenge.  I suspect that's the reason we don't have anything in
the way of I/O usage in Xymon.  I've often wondered if using "sar" is a
better way to go, because the output is more (but not completely)
consistent across platforms, and so the parsing code would be simpler and
smaller.  Sar is now available on more OSes than ever before, so we're more
likely to see support from the hosts we monitor.  Clients would just run a
few standard "sar" commands to create client data sections (e.g. [sar-d],
[sar-b], or even [sar-A] for all available output), and Xymon would
implement a small handful of standardised "sar" parsers.  Just an idea.
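To sketch what that client side might look like (the section names
[sar-d]/[sar-b] just mirror the sar flags and are not an existing Xymon
convention):

    # emit_section prints a Xymon-style client-data section header and
    # then the output of the given command.
    emit_section() {
        echo "[$1]"
        shift
        "$@" 2>/dev/null
    }

    # Only emit the sections if sar (from sysstat) is actually installed.
    if command -v sar >/dev/null 2>&1; then
        emit_section sar-d sar -d 1 1    # per-device I/O activity
        emit_section sar-b sar -b 1 1    # overall I/O and transfer rates
    fi
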

J