[hobbit] TCP/IP stats (bits/s) limited to 100M

Nicolas Dorfsman ndo at unikservice.com
Wed Jun 28 16:07:44 CEST 2006


On 28 June 06 at 15:42, Werner (gmail Lists) wrote:

> Hi,
>
> 	The 110Mbits/s value you get really does point to a 32-bit counter
> wrap, because with a 32-bit BYTE counter, measured every 5 minutes,
> 110Mbits/s is (approx.) the maximum you can count without wrapping the
> counter.
>
> 	As Henrik explained below, it should not be a "wrap" done by
> hobbit or RRD. I think you need to look at your OS counters directly
> and see if they're wrapping in less than 5 minutes.
>
> 	If you are on Solaris / SunOS you could use something like the
> below, and watch whether any of the counters wraps past the 32-bit
> maximum (4294967295, if I recall correctly)

Correct.
I found this document, which confirms what you're saying:

http://sunsolve.sun.com/search/document.do?assetkey=1-25-72535-1
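
For what it's worth, the ~110 Mbit/s ceiling can be checked with a few lines of Python. This is only a sketch: the function name is mine, and it assumes a 32-bit byte counter sampled at a fixed interval with at most one wrap between samples.

```python
def max_rate_mbits(counter_bits=32, interval_s=300):
    """Highest average rate (Mbit/s) a byte counter can carry over one
    polling interval before it advances by a full 2**counter_bits and
    the difference between two samples becomes ambiguous."""
    max_bytes = 2 ** counter_bits            # bytes counted in one full cycle
    return max_bytes * 8 / interval_s / 1e6  # bytes -> bits -> Mbit/s


# 32-bit byte counter, 5-minute polling
print(round(max_rate_mbits(), 1))
```

This gives roughly 114.5 Mbit/s, which is close enough to the 110 Mbit/s plateau in the graphs.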


On 28 June 06 at 13:09, Henrik Stoerner wrote:
> On Wed, Jun 28, 2006 at 11:42:14AM +0200, Beau Olivier wrote:
>> Hi,
>>
>> this looks like the tcp data running into the 32-bit counter problem...
>> Are your counters 32-bit? Could you give us a copy of them?
>
> The RRD files are created as "DERIVE" datatypes with a minimum value
> of 0, which should handle 32/64-bit counter overflows automatically.
> (See the rrdcreate man-page).

Well... the man page is not so confident:

              COUNTER
                  is for continuous incrementing counters like the
                  ifInOctets counter in a router. The COUNTER data
                  source assumes that the counter never decreases,
                  except when a counter overflows.  The update
                  function takes the overflow into account.  The
                  counter is stored as a per-second rate. When the
                  counter overflows, RRDtool checks if the
                  overflow happened at the 32bit or 64bit border
                  and acts accordingly by adding an appropriate
                  value to the result.

              DERIVE
                  will store the derivative of the line going from
                  the last to the current value of the data
                  source. This can be useful for gauges, for
                  example, to measure the rate of people entering
                  or leaving a room. Internally, derive works
                  exactly like COUNTER but without overflow
                  checks. So if your counter does not reset at 32
                  or 64 bit you might want to use DERIVE and
                  combine it with a MIN value of 0.

                  NOTE on COUNTER vs DERIVE
                      by Don Baarda <don.baarda at baesystems.com>

                      If you cannot tolerate ever mistaking the
                      occasional counter reset for a legitimate
                      counter wrap, and would prefer "Unknowns"
                      for all legitimate counter wraps and resets,
                      always use DERIVE with min=0. Otherwise,
                      using COUNTER with a suitable max will
                      return correct values for all legitimate
                      counter wraps, mark some counter resets as
                      "Unknown", but can mistake some counter
                      resets for a legitimate counter wrap.

                      For a 5 minute step and 32-bit counter, the
                      probability of mistaking a counter reset for
                      a legitimate wrap is arguably about 0.8% per
                      1Mbps of maximum bandwidth. Note that this
                      equates to 80% for 100Mbps interfaces, so
                      for high bandwidth interfaces and a 32bit
                      counter, DERIVE with min=0 is probably
                      preferable. If you are using a 64bit
                      counter, just about any max setting will
                      eliminate the possibility of mistaking a
                      reset for a counter wrap.


In my particular case (and maybe for any large GigEth flow), COUNTER
with max set to 4294967295 should be the solution.
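
To make the difference between the two types concrete, here is a simplified Python model of what the man page excerpt describes. It is not RRDtool's code: the function names and sample values are mine, and it only mimics the wrap correction (COUNTER) and the min check (DERIVE).

```python
WRAP_32 = 2 ** 32
WRAP_64 = 2 ** 64


def counter_rate(prev, cur, step_s, max_rate=None):
    """COUNTER-like: a decrease is treated as a wrap at the 32-bit
    border first, then at the 64-bit border."""
    delta = cur - prev
    if delta < 0:
        delta += WRAP_32              # assume a 32-bit wrap
    if delta < 0:
        delta += WRAP_64 - WRAP_32    # otherwise assume a 64-bit wrap
    rate = delta / step_s
    if max_rate is not None and rate > max_rate:
        return None                   # out of range, stored as "Unknown"
    return rate


def derive_rate(prev, cur, step_s, min_rate=0.0):
    """DERIVE-like: plain difference, no overflow check; a rate below
    min_rate is stored as "Unknown"."""
    rate = (cur - prev) / step_s
    return None if rate < min_rate else rate


# A genuine 32-bit wrap between two 5-minute samples
print(counter_rate(4294967000, 1000, 300))   # small positive rate
print(derive_rate(4294967000, 1000, 300))    # None, i.e. "Unknown"

# A counter reset (e.g. reboot): without a max, COUNTER reports a huge rate
print(counter_rate(1000000, 500, 300))
print(counter_rate(1000000, 500, 300, max_rate=12.5e6))  # None
```

The last two calls show the point of Don Baarda's note: COUNTER only rejects a reset if the resulting rate exceeds the configured max.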


On 28 June 06 at 15:42, Werner (gmail Lists) wrote:
>
> 	(CARE) With RRD it's possible to work around this OS
> limitation by feeding the data at shorter intervals, let's say every 2
> minutes. RRD will take care of computing (making) the values "correct"
> for the step size used to create the RRD (in hobbit's case 300 secs).
>
> 	I'm not exactly sure how and/or if hobbit will be happy
> receiving client info more often than every 5 minutes, but I think it
> should be transparent.
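
A quick way to see how short the feed interval would have to be is to compute how long a 32-bit byte counter takes to wrap at a given rate. Again just a sketch, with a function name of my own and a constant traffic rate assumed.

```python
def seconds_until_wrap(rate_mbits, counter_bits=32):
    """Seconds for a byte counter to run from 0 to its wrap point at a
    constant rate given in Mbit/s."""
    bytes_per_second = rate_mbits * 1e6 / 8
    return 2 ** counter_bits / bytes_per_second


for rate in (100, 1000):
    print(rate, "Mbit/s ->", round(seconds_until_wrap(rate), 1), "s")
```

At 100 Mbit/s the counter lasts a little under six minutes, but on a saturated GigE link it wraps about every 34 seconds, so a 2-minute feed could still miss wraps there.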


Mmmm. I'd prefer to try to fix the RRD file. It may be tricky (export,
import, etc.), but more reliable.
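
As far as I can tell, `rrdtool tune` can change a data source's type and maximum in place, which might avoid the export/import round trip. The sketch below only builds the command line, it does not run it. The file name `netstat.rrd` and the DS name `ds0` are placeholders, not hobbit's real names (`rrdtool info` on the file should list those), and the file should be backed up first.

```python
def tune_command(rrd_file, ds_name, max_rate):
    """Build (but do not execute) an `rrdtool tune` call that switches
    one data source to COUNTER and sets its maximum."""
    return [
        "rrdtool", "tune", rrd_file,
        "--data-source-type", "{}:COUNTER".format(ds_name),
        "--maximum", "{}:{}".format(ds_name, max_rate),
    ]


# Placeholder file and DS names; 4294967295 is the max proposed above.
print(" ".join(tune_command("netstat.rrd", "ds0", 4294967295)))
```

One caveat: RRDtool compares max against the computed per-second rate, not the raw counter value, so a max closer to the link speed (in the DS's unit) may be a tighter guard against resets than 4294967295.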

>
> 	Hope this can give you some help.

It definitely helps, thanks!



Nicolas

