[Xymon] diskstat.sh/RRD oddity

Vernon Everett everett.vernon at gmail.com
Thu Mar 29 03:46:41 CEST 2012


Hi Steve

I think this is the script that I wrote.
Apologies that it got you into a bit of a mess, but I am quite thrilled
to hear that it worked as-is on RHEL. It was originally written for Solaris
10, and I made no effort to test it on anything else.

The other fix I would suggest is to change the sample time.
Change the DURATION variable at the top of the script.
The script takes a 10-second "sample" of disk activity by default, and uses
those values for graphing.
It may just be that, through a curious alignment of system times,
multiple clients were sending their data to the Xymon server at the same
time as your system was sampling the IO.
Change the sample to something like 30 seconds, and you might find you get
a better average.
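
To illustrate the alignment effect, here is a minimal Python sketch. It is a toy model, not what diskstat.sh actually does: the 2-second burst, the rates (5 and 300 per second), and the function name sample_means are all made up for illustration. It assumes a burst of activity repeats every 300 seconds, and compares a test that starts every 300 seconds with one that starts every 240 seconds.

```python
def sample_means(period, interval, duration, burst_len=2,
                 base=5.0, peak=300.0, n_samples=20):
    """Mean rate seen by each of n_samples windows started every `interval` seconds.

    All values are hypothetical: a burst of `peak` ops/s fills the first
    `burst_len` seconds of every `period`; the rest of the time is `base`.
    """
    def rate(t):
        return peak if t % period < burst_len else base

    means = []
    for k in range(n_samples):
        start = k * interval
        window = [rate(t) for t in range(start, start + duration)]
        means.append(sum(window) / duration)
    return means


# Test interval equal to the burst period: every sample lands on the burst.
in_sync = sample_means(period=300, interval=300, duration=10)
# Test interval of 4 minutes: only every fifth sample lands on the burst.
four_min = sample_means(period=300, interval=240, duration=10)

print("every 300s:", in_sync[:5])
print("every 240s:", four_min[:5])
```

In this model the in-sync test reports the inflated value on every run, while the 4-minute test only catches the burst occasionally, which is consistent with the graphs dropping back once the interval was changed.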

This was one of the risks with that script.
Make the sample time too long, and we see too much of an average - very
smooth graph.
Too short, and we might pick up peaks (as in your case).
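
The trade-off can be sketched with a few lines of Python. This is only an illustration under assumed numbers (a baseline of 5 writes/s with a 2-second burst of 300 writes/s at the start of the sample); the helper windowed_average is hypothetical and is not part of the script.

```python
def windowed_average(per_second, start, duration):
    """Average rate over `duration` seconds beginning at index `start`."""
    window = per_second[start:start + duration]
    return sum(window) / len(window)


# Hypothetical workload: 5 writes/s, with a 2-second burst of 300 writes/s
# right at the moment the sample begins.
timeline = [5.0] * 600
timeline[0] = 300.0
timeline[1] = 300.0

# The same burst looks very different depending on the sample length.
for duration in (10, 30, 120):
    print(duration, "seconds:", round(windowed_average(timeline, 0, duration), 1))
```

With these made-up numbers, the 10-second sample reports 64.0, the 30-second sample about 24.7, and the 120-second sample about 9.9, so the longer the sample, the more the peak is averaged away.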

What I was originally looking for when I wrote the script was sustained
high IO, in which case any sample size would have done the trick. So, for
me, 10 seconds was as good a value as any, but feel free to experiment.
YMMV.
If you find some settings give significantly better results than others,
feel free to add these notes to the Description or Known Bugs & Issues
sections on Xymonton.
And while you are there, if you can update the Compatibility entry to
include your OS, that would be great.

Regards
      Vernon



On 29 March 2012 06:02, Steve Holmes <sholmes42 at mac.com> wrote:

> This is just a comment on an oddity with respect to diskstat.sh and RRD.
>
> We make pretty heavy use of the diskstat.sh script, which I believe I
> downloaded from xymonton. When I installed it I used the standard
> clientlaunch.cfg stanza for the configuration and everything worked great.
>
> I was called to task today because we have been having some disk io issues
> on the RHEL VMs and someone was looking at the trend graphs for some
> servers to see if there was anything they could learn and they noticed that
> beginning at about 4pm local time on Monday the graphs for the number of
> sectors written per second on a couple of file systems on several VMs
> jumped from the 10 to 20 range to the 300 to 340 range and stayed there.
> The graph for number of disk writes per second had a corresponding jump up
> to about 40 or 50 from close to zero.
>
> In analyzing the data I discovered that the file system that was
> displaying this behavior is the same file system to which the diskstat.sh
> script is writing its temp files. It appears that for some reason, starting
> at 4pm on Monday the 5 minute test interval and the 5 minute average for
> RRD got in sync and all it was seeing was the data point that corresponded
> to its own writing activity and RRD was using it for the entire 5 minute
> average (of course, that's what RRD does).
>
> I 'fixed' it by changing the test interval to something less than 5
> minutes. I tried 2, 3, and 4 minutes and they all had the effect of
> reducing the data in the plot back to the expected level, i.e. to the level
> it was before 4pm on Monday.
>
> The mystery remains why it suddenly started seeing and using its own disk
> activity at the same time on several different servers.
>
> Steve Holmes
> ITaP/Purdue University
>
> --
> If they give you ruled paper, write the other way. -Juan Ramon Jimenez,
> poet, Nobel Prize in literature (1881-1958)
>
> I prayed for freedom for twenty years, but received no answer until I
> prayed with my legs. -Frederick Douglass, Former slave, abolitionist,
> editor, and orator (1817-1895)
>


-- 
"While it is futile to try to eliminate risk, and questionable to try to
minimize it, it is essential that the risks taken be the right risks. "
- Peter F. Drucker