turning up (way up) debugging for the conn tests

Tom Georgoulias tomg at mcclatchyinteractive.com
Mon Jun 19 22:08:06 CEST 2006


Hi,

I've been trying to debug a seemingly false conn failure between one of 
the interfaces of my Netapp filer and my Hobbit server.  This problem 
has persisted for a very long time, and I'm just not able to narrow down 
or capture enough information to find out what is causing it.

I'm using Hobbit 4.1.2p1 on my server, and Data ONTAP 7.0.4 on my 
FAS3020c filers.

My hobbit server is configured to monitor 2 IPs for 2 different 
interfaces on the netapp filer.  At seemingly random intervals, the conn 
tests will fail for one of those interfaces, even though NFS continues 
to work just fine out of that interface.  After the interface "fails", 
it'll either flap between green and red, or just stay red for 30 mins, 
an hour, sometimes 4 hours.  There does not appear to be a regular 
pattern or time period for this behavior.

Since tcpdump became my new buddy, I've noticed a few behaviors between 
the hobbit server and the "failing" filer interface that I'd like to 
better understand.
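
For anyone wanting to look at the same kind of traffic, a capture along 
these lines is a minimal sketch of what can be used.  The interface name 
(eth0) and the address (192.0.2.10, a documentation address) are 
placeholders and assumptions, not values from my setup, and the script 
only prints the tcpdump command instead of running it, since tcpdump 
needs root.

```shell
#!/bin/sh
# Placeholder values (assumptions): adjust both before use.
IFACE="eth0"            # hypothetical capture interface on the Hobbit server
FILER_IP="192.0.2.10"   # documentation address standing in for the filer interface

# Build a pcap filter matching ICMP and UDP traffic to or from one host.
build_filter() {
    printf 'host %s and (icmp or udp)' "$1"
}

# Print the capture command rather than running it; tcpdump needs root.
# -n disables name resolution, -i selects the capture interface.
echo "tcpdump -n -i $IFACE '$(build_filter "$FILER_IP")'"
```

Restricting the filter to ICMP and UDP keeps the NFS-over-TCP traffic out 
of the capture, so the ping exchanges are easier to pick out.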

1.  How many pings are sent by fping to each host, by default?  If I 
understand what the man page says, fping will send several before giving 
up.  If fping gets a reply from the first, does it continue to send more?

2. How can I turn up the debugging output for the conn tests, beyond 
this line in my [bbnet] config in hobbitlaunch.cfg:

CMD bbtest-net --report --ping --checkresponse --debug --timing

Is there one I can add for bbretest, since it'll be handling the network 
tests after the first conn failure?

3.  Would a Hobbit network test ever initiate a connection using UDP on 
a high-numbered port, like 37383?

Thanks for any help on this.

Tom


