[hobbit] bbtest-net output & poll time

Henrik Stoerner henrik at hswn.dk
Mon Apr 4 22:52:45 CEST 2005


On Mon, Apr 04, 2005 at 04:09:34PM -0400, Schwimmer, Eric E *HS wrote:

> I have a bb-hosts file with 1229 hosts in it, but no TCP
>  (http/ssh/ftp) enabled.
> In my hobbitlaunch.cfg, I have bbnet configured as follows:
> 
> CMD bbtest-net --report --ping --checkresponse --dns=ip --concurrency=1500 --dns-timeout=5 --timeout=5
> 
> The bbtest output for my hobbit server shows the following snippets:
>
> DNS statistics:
>  # hostnames resolved  :     1220
>  # succesful           :       25
>  # failed              :      975
>  # calls to dnsresolve :        0

> I thought that the --dns=ip option in the hobbitlaunch.cfg precluded
> any DNS resolution, but the bbtest output above seems to indicate
> the opposite.  Does anybody know what those DNS tests are the result
> of?

The heading for that bit of statistics is slightly misleading. It
really is just a summary of how many hostnames were converted into
IP addresses; this can happen via DNS, but when you use --dns=ip it is
done entirely by using the IPs in the bb-hosts file. (The clue here
is that "# calls to dnsresolve" is 0.) As your timing statistics
show:

> DNS lookups completed       1112644659.080039          0.000387 

those 1000 IPs were found in less than one millisecond - that would
be quite a feat if any DNS was involved.

> Also, I'm a bit curious as to the chronological breakdown of the
>  bbtest events;

> Service definitions loaded        1112644638.545180       0.002916 
> Tests loaded                      1112644659.079652      20.534472 

> specifically the "Tests Loaded" section.  Does 20 seconds seem
> reasonable for a bb-hosts file of our size? 

No, it seems a bit much. My main server has about 1500 hosts in it,
and spends 4 seconds loading that file (actually, it is split into
about 30 files). If you look at the "bbgen" status, what time is
reported on the "Load bbhosts done" line?

I've looked over the code, but can't really spot anything that would
explain why it takes so long.


> (I'm trying desperately to get below the 60 second mark :)

Well, in that case perhaps you should lower the timeout on your
ping-tests - they account for roughly 70% of the total time (about
53.7 of 74.6 seconds):

> PING test completed (1229 hosts)         1112644712.816387         53.726504 
> TIME TOTAL                                                         74.562619 
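
To see where the time goes, the share of each step can be worked out
from the timing lines themselves. The following is only a rough
sketch in Python, not part of Hobbit: it assumes the report layout
quoted in this thread (a label, an optional timestamp, and the elapsed
seconds as the last column), and the names parse_timings and shares
are made up for the example.

```python
import re

# Timing lines copied from the bbtest-net report quoted in this thread.
REPORT = """\
Service definitions loaded        1112644638.545180       0.002916
Tests loaded                      1112644659.079652      20.534472
DNS lookups completed       1112644659.080039          0.000387
PING test completed (1229 hosts)         1112644712.816387         53.726504
TIME TOTAL                                                         74.562619
"""

# Assumed layout: label, two or more spaces, an optional epoch
# timestamp, then the elapsed seconds at the end of the line.
LINE_RE = re.compile(
    r"^(?P<label>.+?)\s{2,}(?:(?P<stamp>\d+\.\d+)\s+)?(?P<secs>\d+\.\d+)\s*$"
)


def parse_timings(report):
    """Return a dict mapping each step label to its elapsed seconds."""
    steps = {}
    for line in report.splitlines():
        match = LINE_RE.match(line)
        if match:
            steps[match.group("label").strip()] = float(match.group("secs"))
    return steps


def shares(steps, total_label="TIME TOTAL"):
    """Return each step's fraction of the total run time."""
    total = steps[total_label]
    return {
        label: secs / total
        for label, secs in steps.items()
        if label != total_label
    }


if __name__ == "__main__":
    for label, share in shares(parse_timings(REPORT)).items():
        print(f"{label:35s} {share:6.1%}")
```

With the figures above, the ping step comes out at about 72% of the
run and loading the tests at about 28%, so those two are the only
steps worth tuning.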

The ping tests are performed using "fping", so you may want to
experiment with the fping options that control the timeout and the
number of retries. You can add those to the FPING setting in
hobbitserver.cfg.
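
To get a feel for what those two options cost, here is a small Python
sketch of the longest time fping would wait on a single host that
never answers. It assumes the defaults documented in the fping man
page (-t 500 ms initial timeout, -r 3 retries, -B 1.5 backoff factor
applied to each successive attempt); check the man page for your fping
version before relying on it. The function name is made up for the
example, and because fping probes many hosts in parallel this is a
per-host bound, not an estimate of the whole ping run.

```python
def worst_case_wait_ms(timeout_ms=500.0, retries=3, backoff=1.5):
    """Upper bound, in milliseconds, on the wait for one silent host.

    Assumes the wait starts at timeout_ms and is multiplied by the
    backoff factor on every retry, with retries counted in addition
    to the first attempt.
    """
    return sum(timeout_ms * backoff ** attempt
               for attempt in range(retries + 1))


if __name__ == "__main__":
    # Assumed fping defaults: 500 ms timeout, 3 retries, backoff 1.5.
    print(worst_case_wait_ms())
    # A tighter, purely illustrative setting: 250 ms timeout, 1 retry.
    print(worst_case_wait_ms(timeout_ms=250, retries=1))
```

Under those assumptions a dead host costs up to about 4 seconds with
the defaults, and well under 1 second with one retry and a 250 ms
timeout, which is why trimming these options mostly helps when many
hosts are down or slow to answer.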

Any particular reason you want to get below 60 seconds? Are you aware
that Hobbit has a "re-test" script that performs more frequent tests
on hosts that go down? When a network test begins to fail, Hobbit
puts that host on a "frequent-test" list, meaning that for up to 30
minutes that host will be tested once a minute rather than once every
5 minutes, so recoveries should be picked up faster.


Regards,
Henrik


