[hobbit] bbtest-net output & poll time
Henrik Stoerner
henrik at hswn.dk
Mon Apr 4 22:52:45 CEST 2005
On Mon, Apr 04, 2005 at 04:09:34PM -0400, Schwimmer, Eric E *HS wrote:
> I have a bb-hosts file with 1229 hosts in it, but no TCP
> (http/ssh/ftp) enabled.
> In my hobbitlaunch.cfg, I have bbnet configured as follows:
>
> CMD bbtest-net --report --ping --checkresponse --dns=ip --concurrency=1500 --dns-timeout=5 --timeout=5
>
> The bbtest output for my hobbit server shows the following snippets:
>
> DNS statistics:
> # hostnames resolved : 1220
> # succesful : 25
> # failed : 975
> # calls to dnsresolve : 0
> I thought that the --dns=ip option in the hobbitlaunch.cfg precluded
> any DNS resolution, but the bbtest output above seems to indicate
> the opposite. Does anybody know what those DNS tests are the result
> of?
The heading for that bit of statistics is slightly misleading. It
really is just a summary of how many hostnames were converted into
IP addresses; this can happen via DNS, but when you use --dns=ip it is
done entirely by using the IPs in the bb-hosts file. (The clue here
is that the "# calls to dnsresolve" is 0). As your timing statistics
show:
> DNS lookups completed 1112644659.080039 0.000387
those 1000 IPs were found in less than one millisecond - that would
be quite a feat if any DNS was involved.
> Also, I'm a bit curious as to the chronological breakdown of the
> bbtest events;
> Service definitions loaded 1112644638.545180 0.002916
> Tests loaded 1112644659.079652 20.534472
> specifically the "Tests Loaded" section. Does 20 seconds seem
> reasonable for a bb-hosts file of our size?
No, it seems a bit much. My main server has about 1500 hosts in it,
and spends 4 seconds loading that file (actually, it is split into
about 30 files). If you look at the "bbgen" status, what's the time
reported for the "Load bbhosts done" line ?
I've looked over the code, but can't really spot anything that would
explain why it takes so long.
> (I'm trying desperately to get below the 60 second mark :)
Well, in that case perhaps you should lower the timeout on your
ping-tests - they account for roughly 70% of the total time:
> PING test completed (1229 hosts) 1112644712.816387 53.726504
> TIME TOTAL 74.562619
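As a quick check of the two figures quoted above, the small shell
sketch below works out the share of the run spent on ping, and how
long the ping stage could take if the whole run is to fit in 60
seconds with everything else unchanged. The variable names are just
for illustration; the numbers are copied from the quoted output.

```shell
# Durations (in seconds) taken from the quoted bbtest-net timing lines.
ping_secs=53.726504     # "PING test completed" elapsed time
total_secs=74.562619    # "TIME TOTAL"

# Percentage of the total run spent in the ping stage.
pct=$(LC_ALL=C awk -v p="$ping_secs" -v t="$total_secs" \
      'BEGIN { printf "%.1f", 100 * p / t }')

# Time left for ping in a 60-second run if the other stages stay as they are.
budget=$(LC_ALL=C awk -v p="$ping_secs" -v t="$total_secs" \
         'BEGIN { printf "%.1f", 60 - (t - p) }')

echo "ping share of total run: ${pct}%"
echo "ping budget for a 60 s run: ${budget} s"
```

So the ping stage would have to drop from about 54 seconds to about
39 seconds, unless the 20 seconds spent loading the tests can also be
brought down.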
The ping tests are performed using "fping", so you may want to try and
play with fping options to control the timeout and # of retries it
does. You can add those to the FPING setting in hobbitserver.cfg.
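For illustration only, such a setting might look like the fragment
below. The fping path and the values are assumptions, not
recommendations: -r sets the number of retries and -t the initial
per-host timeout in milliseconds, so check the fping man page on your
system and test what suits your network before settling on numbers.

```shell
# hobbitserver.cfg fragment (sketch; path and values are assumed)
# -r 1   : retry each host once instead of fping's default number of retries
# -t 250 : wait 250 ms for the first reply before retrying
FPING="/usr/sbin/fping -r 1 -t 250"
```

Fewer retries and a shorter timeout make the ping run finish sooner,
at the cost of more false alarms for hosts on slow or lossy links.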
Any particular reason you want to get below 60 seconds? Are you aware
that Hobbit has a "re-test" script that performs more frequent tests
on hosts that go down? When a network test begins to fail, Hobbit
puts that host on a "frequent-test" list, meaning that for up to 30
minutes that host will be tested once a minute rather than once every
5 minutes; so recoveries should be picked up faster.
Regards,
Henrik