[hobbit] Future of Hobbit

Charles Jones jonescr at cisco.com
Fri Jan 25 20:43:41 CET 2008


I think Henrik's stance on having the server collect data via ssh 
connections just doesn't scale.  Sure, it works fine for a few dozen 
hosts, but let's say you have 2000 servers...now you are expecting to be 
able to make 2000 trouble-free ssh connections before the next polling 
cycle begins. This introduces many problems:

* How many ssh sessions can you run at the same time without spiking the 
load on the hobbit server?
* What happens when an ssh session hangs? (It could hang the hobbit 
server, or make the poll cycle take too long.)
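
Both problems can at least be bounded. A rough sketch, not anything 
Hobbit ships: cap the number of parallel sessions with a worker pool and 
give every session a hard timeout, so a hung host costs one slot for a 
limited time. The remote command name "hobbit-client-data", the pool 
size and the timeout are assumptions for illustration; BatchMode and 
ConnectTimeout are standard OpenSSH options.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ssh_command(host):
    # Hypothetical remote command; the -o options are standard OpenSSH ones.
    return ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=10",
            host, "hobbit-client-data"]


def poll_host(host, build_command, timeout_s):
    """Return (host, output), or (host, None) if the session failed or hung."""
    try:
        result = subprocess.run(build_command(host), capture_output=True,
                                text=True, timeout=timeout_s)
    except (subprocess.TimeoutExpired, OSError):
        return host, None  # hung or unlaunchable session: give up on this host
    return host, result.stdout if result.returncode == 0 else None


def poll_all(hosts, build_command=ssh_command, max_parallel=20, timeout_s=30):
    """Poll every host with at most max_parallel sessions running at once."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        results = pool.map(lambda h: poll_host(h, build_command, timeout_s),
                           hosts)
        return dict(results)
```

Hosts that come back as None can be reported as "no data" for that 
cycle instead of stalling it. This limits the damage but does not remove 
the per-session cost of ssh.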

You do know about the "pulldata" option?  It allows the Hobbit server to 
do a "pull" instead of waiting for client "push". This works fairly 
well, and I am using it in a production environment. I can see how it 
would not scale too well either, though, for a really large number of hosts.

To picture the scalability, imagine a server that only has to receive 
updates from hobbit clients. All it has to do is listen on port 1984, 
and using relatively little CPU it can probably handle a constant flow 
of client updates.
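
To make the "just listen" picture concrete, here is a toy receiver. It 
is not the Hobbit daemon and does not parse Hobbit messages; it assumes 
each client connects, sends one text message and closes its side. The 
names handle_client, serve and latest, and the 10 second read timeout, 
are made up for the example.

```python
import asyncio

latest = {}  # client IP address -> most recent raw message


async def handle_client(reader, writer):
    """Read one message until the client closes its side, then store it."""
    try:
        # read() with no size reads until EOF; a real server would cap the size.
        data = await asyncio.wait_for(reader.read(), timeout=10)
        peer = writer.get_extra_info("peername")[0]
        latest[peer] = data.decode(errors="replace")
    except asyncio.TimeoutError:
        pass  # slow or stuck client: drop the connection, keep serving others
    finally:
        writer.close()
        await writer.wait_closed()


async def serve(host="0.0.0.0", port=1984):
    """Accept client updates forever on a single event loop."""
    server = await asyncio.start_server(handle_client, host, port)
    async with server:
        await server.serve_forever()
```

Running asyncio.run(serve()) starts the listener. The server never 
launches a process or opens an outgoing connection; a stuck client only 
ties up its own connection until the timeout fires.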

Now imagine a server that has to go and fetch the client data itself. 
There is a LOT more overhead and processing involved in launching an 
outgoing ssh connection, running a remote client data-gathering command, 
waiting for the output, etc. Imagine 2000 of those firing off every 5 
minutes. How many simultaneous ssh sessions can your server handle?  
I've seen a server brought to its knees by a script that ran amok and 
was doing 50 simultaneous scp commands :)  Some time saving is done by 
using msgcache (no waiting for the data-gathering), but there is still 
the overhead of ssh itself, and having key-based ssh ability could be 
deemed a security risk (anyone who hacks into the hobbit server could 
then ssh to all of your client machines without a password).
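
A back-of-the-envelope way to answer the "how many simultaneous 
sessions" question: to finish every host inside one cycle, the server 
must keep at least hosts * seconds_per_session / cycle_seconds sessions 
running on average. The per-session durations below are guesses for 
illustration, not measurements.

```python
import math


def min_parallel_sessions(hosts, seconds_per_session, cycle_seconds=300):
    """Smallest average concurrency that finishes all hosts in one cycle."""
    return math.ceil(hosts * seconds_per_session / cycle_seconds)


# 2000 hosts every 5 minutes, assuming 3 s or 10 s per ssh session.
for seconds in (3, 10):
    print(seconds, "s per session ->", min_parallel_sessions(2000, seconds))
```

With those guesses the answer is 20 sessions at 3 seconds each and 67 at 
10 seconds each, with no headroom for hung or retried sessions.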

A good solution would be an ssl-encrypted, bi-directional protocol. This 
would allow secure transfer of client data, either push or pull, without 
the overhead, management, and security risks of using ssh.

In the meantime, definitely check out the pulldata+msgcache option, as 
it sounds like it will do what you want.

-Charles

Tim Rotunda wrote:
> To answer Axel's "what is it" question: it's a Hobbit version of BB-Central,
> which runs on a central server like hobbit does.  It reaches out to the
> clients via ssh (or whatever) and collects data.  I did a shell script
> version a few years ago and it worked good until the client count topped
> 25-30.  Then I migrated it to C and it would handle 60+ nodes pretty well.
> Then I migrated that to a multi-threaded C process and it really smoked.  I
> never did reach the limit with that version.  I think they are still using
> it and adding nodes to the client list, which is prob over 250 or so.
>
> I was going to put it out to the community but my company would not allow it
> (idiots) so I couldn't.  I now work only 40 hours a week so now I have some
> time to myself and was thinking about rewriting it from memory and putting
> it out there.  I would put out the one that is threaded and it would prob
> just be for x86 Linux, which should build on Solaris, HP-UX, etc.
>   



