Re: [hobbit] network test timeouts for hung TCP connects?



On Wed, Jan 26, 2005 at 03:17:01PM -0700, Charles Jones wrote:
> My production BigBrother server is running BigBrother + bbgen 2.5 (I 
> know there is newer bbgen, I plan on replacing BB with a Hobbit 
> server).  

Wow, that's a pretty old bbgen version - 1½ years, in fact.

> My current bb+bbgen setup has problems whenever a machine dies 
> in such a way that it is pingable, but when you connect to any open TCP 
> port you get nothing back (usually caused by a memory error or 
> overheating).  When my current bb+bbgen setup tries to test one of these 
> machines that has zombified, it gets hung testing that host, and 
> eventually everything turns purple since  bb isn't updating anymore.
> 
> Does Hobbit have proper timeouts to timeout a hung TCP connection so 
> this sort of thing does not happen?

If not, then it's definitely a bug. All network tests done by Hobbit
must time out if the other end doesn't respond. The default timeout is
10 seconds (set with the "--timeout=N" option to bbtest-net).
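
To illustrate the idea (this is not Hobbit's code, just a minimal Python
sketch): a probe needs a time limit on both the connect and the read,
because a "zombie" host like the one described may accept the connection
and then never send anything. The function name probe_tcp and its return
strings are made up for this example, and it assumes a service that sends
a banner first (SMTP, SSH and the like).

```python
import socket

def probe_tcp(host, port, timeout=10.0):
    """Probe one TCP port; return 'ok', 'closed', 'timeout' or 'failed'.

    Hypothetical helper for illustration only, not part of Hobbit.
    """
    try:
        # The timeout bounds connect() and stays set on the returned
        # socket, so the recv() below is bounded by it as well.
        with socket.create_connection((host, port), timeout=timeout) as sock:
            data = sock.recv(1)  # wait for the first byte of a banner
            return "ok" if data else "closed"
    except socket.timeout:
        # Either the connect or the read hit the time limit.
        return "timeout"
    except OSError:
        # Connection refused, host unreachable, and similar errors.
        return "failed"
```

With this, a host that answers ping and accepts connections but never
replies costs the tester at most about "timeout" seconds for the connect
and again for the read, instead of hanging the whole test run.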

Looking back through the bbgen changelog, there are a couple of
bugfixes through the 2.x series that seem likely to fix it. But
without knowing exactly what's triggering this behaviour it is hard to
say for sure.


Henrik