
Re: [hobbit] TCP/IP stats (bits/s) limited to 100M



On 28 June 06 at 15:42, Werner (gmail Lists) wrote:

Hi,

	The 110Mbits/s value you get really does point to a 32bit counter
wrap, because with a 32bit BYTE counter, measured every 5 minutes,
110Mbits/s is (approx.) the maximum you can count without wrapping the
counter.
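
The arithmetic behind that ceiling can be sketched as follows. This is only an illustration: the function name is made up here, and it assumes an unsigned byte counter sampled exactly once per polling interval.

```python
def max_rate_bits_per_sec(counter_bits: int, poll_seconds: int) -> float:
    """Highest average rate a byte counter can carry over one polling
    interval before it wraps a full turn and the delta becomes ambiguous."""
    max_bytes = 2 ** counter_bits      # distinct values of the counter
    return max_bytes * 8 / poll_seconds  # bytes -> bits, per second

rate = max_rate_bits_per_sec(32, 300)
print(f"{rate / 1e6:.1f} Mbit/s")   # prints "114.5 Mbit/s"
```

So the exact limit for a 32bit byte counter at a 300 second interval is about 114.5 Mbit/s, which matches the roughly 110 Mbit/s plateau.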

	As Henrik explained below, it should not be a "wrap" done by
hobbit or RRD. I think you need to look at your OS counters directly
and see if they're wrapping in less than 5 minutes.

If you are on Solaris / SunOS you could use something like the
below, and watch whether any of the counters wraps the 32 bit value
(4294967295 if I recall correctly)

Correct. I found this document, which confirms what you're saying:

http://sunsolve.sun.com/search/document.do?assetkey=1-25-72535-1


On 28 June 06 at 13:09, Henrik Stoerner wrote:
On Wed, Jun 28, 2006 at 11:42:14AM +0200, Beau Olivier wrote:
Hi,

this looks like the TCP data running into the 32bit counter wrap problem...
are your counters 32 bit? could you give us a copy of them?

The RRD files are created as "DERIVE" datatypes with a minimum value of
0, which should handle 32/64-bit counter overflows automatically.
(See the rrdcreate man-page).

Well... the man page is not so confident:

             COUNTER
                 is for continuous incrementing counters like the
                 ifInOctets counter in a router. The COUNTER data
                 source assumes that the counter never decreases,
                 except when a counter overflows.  The update
                 function takes the overflow into account.  The
                 counter is stored as a per-second rate. When the
                 counter overflows, RRDtool checks if the
                 overflow happened at the 32bit or 64bit border
                 and acts accordingly by adding an appropriate
                 value to the result.

             DERIVE
                 will store the derivative of the line going from
                 the last to the current value of the data
                 source. This can be useful for gauges, for
                 example, to measure the rate of people entering
                 or leaving a room. Internally, derive works
                 exactly like COUNTER but without overflow
                 checks. So if your counter does not reset at 32
                 or 64 bit you might want to use DERIVE and
                 combine it with a MIN value of 0.

                 NOTE on COUNTER vs DERIVE
                     by Don Baarda <don.baarda (at) baesystems.com>

                     If you cannot tolerate ever mistaking the
                     occasional counter reset for a legitimate
                     counter wrap, and would prefer "Unknowns"
                     for all legitimate counter wraps and resets,
                     always use DERIVE with min=0. Otherwise,
                     using COUNTER with a suitable max will
                     return correct values for all legitimate
                     counter wraps, mark some counter resets as
                     "Unknown", but can mistake some counter
                     resets for a legitimate counter wrap.

                     For a 5 minute step and 32-bit counter, the
                     probability of mistaking a counter reset for
                     a legitimate wrap is arguably about 0.8% per
                     1Mbps of maximum bandwidth. Note that this
                     equates to 80% for 100Mbps interfaces, so
                     for high bandwidth interfaces and a 32bit
                     counter, DERIVE with min=0 is probably
                     preferable. If you are using a 64bit
                     counter, just about any max setting will
                     eliminate the possibility of mistaking a
                     reset for a counter wrap.


In my particular case (and maybe for any heavy GigEth flow), COUNTER with max set to 4294967295 should be the solution.
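
To make the difference in the man page excerpt concrete, here is a simplified sketch of how the two data source types treat a counter that went backwards. It is not RRDtool's actual code: the function names are invented for illustration, and the real update logic also deals with heartbeats, max checks and step normalisation.

```python
from typing import Optional

WRAP_32 = 2 ** 32
WRAP_64 = 2 ** 64

def counter_rate(prev: int, cur: int, seconds: int) -> float:
    """COUNTER-like: a decrease is assumed to be a wrap at the 32bit
    border, or at the 64bit border if the delta is still negative."""
    delta = cur - prev
    if delta < 0:
        delta += WRAP_32
    if delta < 0:
        delta += WRAP_64 - WRAP_32
    return delta / seconds

def derive_rate(prev: int, cur: int, seconds: int,
                minimum: float = 0.0) -> Optional[float]:
    """DERIVE-like with min=0: no wrap handling, so any decrease gives
    a negative rate, which falls below min and becomes unknown (None)."""
    rate = (cur - prev) / seconds
    return None if rate < minimum else rate

# A genuine 32bit wrap: COUNTER recovers the rate, DERIVE reports unknown.
print(counter_rate(4294967000, 1000, 300))   # 4.32
print(derive_rate(4294967000, 1000, 300))    # None

# A reset (e.g. reboot): COUNTER mistakes it for a wrap and reports a
# large bogus rate unless a max filters it out; DERIVE reports unknown.
print(counter_rate(3_000_000_000, 500, 300))
print(derive_rate(3_000_000_000, 500, 300))  # None
```

With DERIVE and min=0, every interval that contains a wrap is lost, which on a busy interface with a 32bit counter can mean a large share of the samples.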



On 28 June 06 at 15:42, Werner (gmail Lists) wrote:

(CARE) With RRD it's possible to work around this OS limitation by feeding the data at shorter intervals, let's say every 2 minutes. RRD will take care of converting the values so they are "correct" for the step size used to create the RRD (in hobbit's case 300 secs).
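
How short the interval would need to be depends on the traffic rate. A rough sketch, assuming a 32bit byte counter and a sustained rate (the helper name is hypothetical):

```python
def seconds_until_wrap(rate_bits_per_sec: float, counter_bits: int = 32) -> float:
    """Time for a byte counter to go once around at a sustained rate."""
    return (2 ** counter_bits) * 8 / rate_bits_per_sec

for label, rate in (("100 Mbit/s", 100e6), ("1 Gbit/s", 1e9)):
    print(f"{label}: wraps every {seconds_until_wrap(rate):.1f} s")
# 100 Mbit/s: wraps every 343.6 s
# 1 Gbit/s: wraps every 34.4 s
```

Under these assumptions a 2 minute interval covers rates up to roughly 286 Mbit/s, while a saturated gigabit link would need samples less than about 34 seconds apart.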

	I'm not exactly sure how, or even if, hobbit will be happy
receiving client info more often than every 5 minutes, but I think it
should be transparent.


Mmmm. I'd prefer to try to fix the RRD file. It may be tricky (export, import, etc.), but it is more reliable.
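
Besides a dump and restore cycle, `rrdtool tune` can change a data source's type and maximum in place. The sketch below only builds the command line and does not run it; the file name and data source name are placeholders, not the names hobbit actually uses, so check them with `rrdtool info` first and work on a copy of the file.

```python
from typing import List

def build_tune_command(rrd_path: str, ds_name: str,
                       ds_type: str, maximum: int) -> List[str]:
    """Build an `rrdtool tune` argv that switches one data source to
    ds_type and sets its maximum. Note that RRDtool compares max with
    the computed per-second rate, not with the raw counter value."""
    return [
        "rrdtool", "tune", rrd_path,
        "--data-source-type", f"{ds_name}:{ds_type}",
        "--maximum", f"{ds_name}:{maximum}",
    ]

# Hypothetical file and DS names; the max is the value from the message above.
cmd = build_tune_command("example.rrd", "bytes", "COUNTER", 4294967295)
print(" ".join(cmd))
```

The resulting list could be handed to `subprocess.run(cmd, check=True)` once the names have been verified.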


Hope this can give you some help.

It definitely helps, thanks!



Nicolas