[hobbit] TCP/IP stats (bits/s) limited to 100M
Scott Walters
scott at PacketPushers.com
Sun Jul 9 22:02:32 CEST 2006
On Jul 9, 2006, at 12:18 PM, Henrik Stoerner wrote:
>
> OK, you got me on that one.
Not really, you inherited this ;) He is trying to get me, and his
point is valid, but the tool 'works as designed', read on . . .
>
> It seems that using COUNTER for the byte-counts in both the
> netstat- and ifstat-RRD's might be a good idea.
*might* being the operative word there
> The question then
> becomes "what's a suitable max" for these data ? Should I
> assume they are 32-bit counters ? I know some of them are not
> (e.g. Solaris has 64-bit counters for bytes in/out per interface).
exactly, and it is even more complicated than that . . . see below
>
> I'll change it to a counter now, with MAX set to "unknown". The
> overflow
> handling should still work correctly, if I understand the RRD
> docs right.
I would not recommend this. Another major issue is that counter
resets (e.g. a reboot), as opposed to overflows, get mistaken for
wraps if the MAX is not correct. From what I recall, if you use
COUNTER and anything gets mistaken, you get a massive spike in the
RRD, making all the data relatively useless because the y axis
autoscales to the spike.
With DERIVE=0 (DERIVE with a minimum of 0) you acknowledge you won't
handle counter wraps correctly (which are not that common anyway),
but the result for all wraps/resets is benign: the negative delta is
stored as NaN, which does *not* cause a spike. I am a firm believer
that no data is better than bad data.
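
To make the difference concrete, here is a rough Python sketch of how
the two data source types treat a reboot. It is a simplified model of
the behaviour described in the rrdtool docs, not rrdtool's actual
code; the function names and the sample numbers are made up for
illustration.

```python
# Simplified model of rrdtool's COUNTER and DERIVE handling.
# Function names and sample values are hypothetical.

def counter_rate(prev, cur, step, max_rate=None):
    """COUNTER: a decrease is assumed to be a 32- or 64-bit wrap."""
    delta = cur - prev
    if delta < 0:
        delta += 2**32              # assume a 32-bit wrap first
    if delta < 0:
        delta += 2**64 - 2**32      # otherwise assume a 64-bit wrap
    rate = delta / step
    if max_rate is not None and rate > max_rate:
        return None                 # above MAX: stored as unknown (NaN)
    return rate

def derive_rate(prev, cur, step, min_rate=0):
    """DERIVE with min=0: a decrease is simply stored as unknown."""
    rate = (cur - prev) / step
    if rate < min_rate:
        return None                 # wrap or reset: unknown (NaN)
    return rate

# A host reboots: the byte counter drops from 3,000,000,000 to 1,000
# between two polls taken 300 seconds apart.
spike = counter_rate(3_000_000_000, 1_000, 300)  # about 4.3 million bytes/s
gap = derive_rate(3_000_000_000, 1_000, 300)     # None
print(spike, gap)
```

With no MAX, COUNTER turns the reboot into a bogus rate of several
MB/s, while DERIVE with min 0 just leaves a gap in the graph.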
I am not opposing the idea that COUNTER with a correct MAX is the
'right way'. The problem with software that runs on so many
platforms is that the correct MAX is impossible to know for certain.
Defining the MAX as simply the 32- or 64-bit limit is not adequate
because reboots will still cause spikes. You'd need to know the MAX
for the particular metric, and that is completely impossible to know
absolutely. The inbytes MAX would need to be different for 10 Mb/s,
100 Mb/s and 1000 Mb/s Ethernet, Token Ring at 16 Mb/s, etc, etc.
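
As a rough illustration of why the MAX has to follow the link speed,
this Python sketch derives a per-link MAX and builds an rrdtool DS
string in the usual DS:name:type:heartbeat:min:max form. It assumes
the RRD stores bytes per second and uses a 600 second heartbeat; both
are assumptions, and the helper names are hypothetical.

```python
# Hypothetical helpers; the heartbeat and the bytes/s unit are assumptions.

LINK_SPEEDS_BITS = {
    "10 Mb/s Ethernet": 10_000_000,
    "Token Ring 16 Mb/s": 16_000_000,
    "100 Mb/s Ethernet": 100_000_000,
    "1000 Mb/s Ethernet": 1_000_000_000,
}

def max_bytes_per_sec(link_bits_per_sec):
    """Highest byte rate the link can physically carry."""
    return link_bits_per_sec // 8

def ds_definition(name, link_bits_per_sec, heartbeat=600):
    """Build a COUNTER data source string with a link-specific MAX."""
    max_rate = max_bytes_per_sec(link_bits_per_sec)
    return f"DS:{name}:COUNTER:{heartbeat}:0:{max_rate}"

for label, bits in LINK_SPEEDS_BITS.items():
    print(label, ds_definition("inbytes", bits))
```

Every interface type ends up needing its own MAX, and the monitoring
tool has no reliable way to learn the link speed on every platform.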
DERIVE=0 and NaN is a much better compromise than the spikes. And I
would bet the farm reboots are a much more common event than counter
wraps for the majority of environments.
And Henrik, the net result to you will be answering an endless stream
of emails regarding why every COUNTER RRD has spikes . . . I've been
there, done that ;) I am almost 100% positive there is not *one*
counter RRD in the larrd stuff, all DERIVE. It's not impossible
rrdtool has changed to alleviate some of this, but from what I have
read of your email streams I haven't seen anything to support that.
scott