[hobbit] network test timeouts for hung TCP connects?

Henrik Stoerner henrik at hswn.dk
Wed Jan 26 23:29:13 CET 2005


On Wed, Jan 26, 2005 at 03:17:01PM -0700, Charles Jones wrote:
> My production BigBrother server is running BigBrother + bbgen 2.5 (I 
> know there is newer bbgen, I plan on replacing BB with a Hobbit 
> server).  

Wow, that's a pretty old bbgen version - 1½ years, in fact.

> My current bb+bbgen setup has problems whenever a machine dies 
> in such a way that it is pingable, but when you connect to any open TCP 
> port you get nothing back (usually caused by a memory error or 
> overheating).  When my current bb+bbgen setup tries to test one of these 
> machines that has zombified, it gets hung testing that host, and 
> eventually everything turns purple since  bb isn't updating anymore.
> 
> Does Hobbit have proper timeouts to timeout a hung TCP connection so 
> this sort of thing does not happen?

If not, then it's definitely a bug. All network tests done by Hobbit
must time out if the other end doesn't respond. The default timeout is
10 seconds (set with the "--timeout=N" option to bbtest-net).
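
To illustrate what such a timeout has to cover, here is a rough Python
sketch. It is not Hobbit's actual code: the function name probe_tcp and
its return strings are invented for the example, and it assumes the
service sends something first (a banner, as SSH or SMTP do). The point
is that the connect and the read both need a deadline, because a
"zombie" host like the one described can still accept the connection
and then never answer.

```python
import socket

def probe_tcp(host, port, timeout=10.0):
    """Hypothetical probe: connect, then wait for the first byte.

    Returns "ok", "closed", "timeout" or "failed". Both the connect
    and the read are bounded by `timeout` seconds each.
    """
    try:
        # create_connection() applies the timeout to the connect itself
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # Make the read deadline explicit as well; without it, a host
            # that accepts but never replies would block recv() forever.
            sock.settimeout(timeout)
            data = sock.recv(1)
            return "ok" if data else "closed"
    except socket.timeout:
        # Raised when either the connect or the read exceeds the deadline
        return "timeout"
    except OSError:
        # Connection refused, host unreachable, and similar errors
        return "failed"
```

Called as probe_tcp("somehost", 22), a host that accepts the connection
but never sends anything should yield "timeout" instead of hanging the
whole test run.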

Looking back through the bbgen changelog, there are a couple of
bugfixes through the 2.x series that seem likely to fix it. But
without knowing exactly what's triggering this behaviour it is hard to
say for sure.


Henrik


