turning up (way up) debugging for the conn tests
Tom Georgoulias
tomg at mcclatchyinteractive.com
Mon Jun 19 22:08:06 CEST 2006
Hi,
I've been trying to debug a seemingly false conn failure between one of
the interfaces of my Netapp filer and my Hobbit server. This problem
has persisted for a very long time, and I'm just not able to narrow down
or capture enough information to find out what is causing it.
I'm using Hobbit 4.1.2p1 on my server, and Data ONTAP 7.0.4 on my
FAS3020c filers.
My hobbit server is configured to monitor 2 IPs for 2 different
interfaces on the netapp filer. At seemingly random intervals, the conn
tests will fail for one of those interfaces, even though NFS continues
to work just fine out of that interface. After the interface "fails",
it'll either flap between green and red, or just stay red for 30 mins,
an hour, sometimes 4 hours. There does not appear to be a regular
pattern or time period for this behavior.
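One way to check whether the red periods really lack a pattern is to collapse the status history into outage intervals and compare their start times and lengths. The following is only a minimal sketch: the function name red_intervals and the (timestamp, color) sample format are hypothetical, not anything Hobbit provides, and the samples would have to be pulled from the history logs by hand or by a separate script.

```python
from datetime import datetime, timedelta


def red_intervals(samples):
    """Collapse (timestamp, color) samples into (start, end, duration) outages.

    Assumes `samples` is sorted by time and color is "green" or "red".
    An outage runs from the first red sample to the next green sample.
    An outage still open at the end of the data is closed at the last sample.
    """
    outages = []
    start = None      # timestamp of the first red sample in the current outage
    last_ts = None    # timestamp of the most recent sample seen
    for ts, color in samples:
        if color == "red" and start is None:
            start = ts
        elif color == "green" and start is not None:
            outages.append((start, ts, ts - start))
            start = None
        last_ts = ts
    if start is not None:
        outages.append((start, last_ts, last_ts - start))
    return outages


if __name__ == "__main__":
    # Made-up samples at 5-minute polling intervals, for illustration only.
    t0 = datetime(2006, 6, 19, 12, 0)
    colors = ["green", "red", "green", "red", "red", "red", "green"]
    samples = [(t0 + timedelta(minutes=5 * i), c) for i, c in enumerate(colors)]
    for start, end, duration in red_intervals(samples):
        print(start, end, duration)
```

Sorting the resulting intervals by time of day or by duration should make any recurring window (for example a nightly job on the filer) stand out, if one exists.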
Since tcpdump became my new buddy, I've noticed a few behaviors between
the hobbit server and the "failing" filer interface that I'd like to
better understand.
1. How many pings are sent by fping to each host, by default? If I
understand what the man page says, fping will send several before giving
up. If fping gets a reply from the first, does it continue to send more?
2. How can I turn up the debugging output for the conn tests, beyond
this line in my [bbnet] config in hobbitlaunch.cfg:
CMD bbtest-net --report --ping --checkresponse --debug --timing
Is there one I can add for bbretest, since it'll be handling the network
tests after the first conn failure?
3. Would a hobbit network test ever initiate a connection using UDP on
a high numbered port, like 37383?
Thanks for any help on this.
Tom