[hobbit] Future of Hobbit
Charles Jones
jonescr at cisco.com
Fri Jan 25 20:43:41 CET 2008
I think Henrik's stance on having the server collect data via ssh
connections just doesn't scale. Sure, it works fine for a few dozen
hosts, but let's say you have 2000 servers... now you are expecting to be
able to make 2000 trouble-free ssh connections before the next polling
cycle begins. This introduces many problems:
* How many ssh sessions can you run at the same time without spiking the
load on the hobbit server?
* What happens when an ssh session hangs? (It could hang the hobbit server,
or make the poll cycle take too long.)
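Both questions in the list above come down to capping the number of concurrent sessions and putting a hard timeout on each one. Here is a minimal sketch of that idea in Python, not anything Hobbit ships. The names poll_all and fetch_via_ssh, the remote command, and the limits of 20 sessions and 30 seconds are all assumptions for illustration; BatchMode and ConnectTimeout are standard OpenSSH client options.

```python
import asyncio

async def fetch_via_ssh(host, command="cat /tmp/clientdata"):
    # Hypothetical remote command; substitute whatever gathers client data.
    proc = await asyncio.create_subprocess_exec(
        "ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=10", host, command,
        stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.DEVNULL)
    try:
        out, _ = await proc.communicate()
    except asyncio.CancelledError:
        proc.kill()  # the timeout cancelled us: do not leave a hung ssh behind
        raise
    return out.decode(errors="replace")

async def poll_all(hosts, fetch=fetch_via_ssh, max_sessions=20, timeout=30.0):
    # The semaphore caps how many sessions run at once (assumed limit).
    sem = asyncio.Semaphore(max_sessions)

    async def poll_one(host):
        async with sem:
            try:
                # A hung session is abandoned after `timeout` seconds.
                return host, await asyncio.wait_for(fetch(host), timeout)
            except (asyncio.TimeoutError, OSError):
                return host, None  # None marks a host that could not be polled

    return dict(await asyncio.gather(*(poll_one(h) for h in hosts)))
```

Calling asyncio.run(poll_all(["host1", "host2"])) would return a dict mapping each host to its output, or to None if the session timed out or ssh could not be started. A hung host then costs one slot for at most the timeout instead of stalling the whole cycle.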
You do know about the "pulldata" option? It allows the Hobbit server to
do a "pull" instead of waiting for client "push". This works fairly
well, and I am using it in a production environment. I can see how it
would not scale too well either though, for a really large number of hosts.
To picture the scalability, imagine a server that only has to receive
updates from hobbit clients. All it has to do is listen on port 1984,
and using relatively little CPU it can probably handle a constant flow
of client updates.
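To make the push picture concrete, here is a rough Python sketch of a listener that does nothing but accept connections on port 1984 and store whatever each client sends. It is not the actual Hobbit daemon and does not implement its message format; handle_client, serve, and the in-memory received list are hypothetical names used only for illustration.

```python
import asyncio

received = []  # hypothetical in-memory store for incoming client messages

async def handle_client(reader, writer):
    # Read until the client closes its side, then keep the message.
    data = await reader.read()
    received.append(data.decode(errors="replace"))
    writer.close()
    await writer.wait_closed()

async def serve(host="127.0.0.1", port=1984):
    # One event loop handles all connections; no process is forked per client.
    server = await asyncio.start_server(handle_client, host, port)
    async with server:
        await server.serve_forever()
```

Running asyncio.run(serve()) starts the listener. The server's per-update work is one accept, one read, and one append, which is why the push model stays cheap as the client count grows.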
Now imagine a server that has to go and fetch the client data itself.
There is a LOT more overhead and processing involved in launching an
outgoing ssh connection, running a remote client data-gathering command,
waiting for the output, etc. Imagine 2000 of those firing off every 5
minutes. How many simultaneous ssh sessions can your server handle?
I've seen a server brought to its knees by a script that ran amok and
was doing 50 simultaneous scp commands :) Some time saving is done by
using msgcache (no waiting for the data-gathering), but there is still
the overhead of ssh itself, and having key-based ssh ability could be
deemed a security risk (anyone who hacks into the hobbit server could
then ssh to all of your client machines without a password).
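To put rough numbers on the "2000 hosts every 5 minutes" case: the average number of sessions that must be in flight is hosts times seconds-per-session divided by the cycle length. The per-session times below (3 and 10 seconds) are assumptions picked for illustration, not measurements, and required_sessions is a hypothetical helper name.

```python
import math

def required_sessions(hosts, session_seconds, cycle_seconds=300):
    # Average concurrency needed to finish every host within one poll cycle.
    return math.ceil(hosts * session_seconds / cycle_seconds)

print(required_sessions(2000, 3))   # 20 concurrent sessions at an assumed 3 s each
print(required_sessions(2000, 10))  # 67 concurrent sessions at an assumed 10 s each
```

That is a steady-state average with no slack for hung or slow hosts, so the real number needs headroom on top of it.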
A good solution would be an ssl-encrypted, bi-directional protocol. This
would allow secure transfer of client data, either push or pull, without
the overhead, management, and security risks of using ssh.
In the meantime, definitely check out the pulldata+msgcache option, as
it sounds like it will do what you want.
-Charles
Tim Rotunda wrote:
> To answer Axel's "what is it" question: it's a Hobbit version of BB-Central,
> which runs on a central server like hobbit does. It reaches out to the
> clients via ssh (or whatever) and collects data. I did a shell script
> version a few years ago and it worked good until the client count topped
> 25-30. Then I migrated it to C and it would handle 60+ nodes pretty well.
> Then I migrated that to a multi-threaded C process and it really smoked. I
> never did reach the limit with that version. I think they are still using
> it and adding nodes to the client list, which is prob over 250 or so.
>
> I was going to put it out to the community but my company would not allow it
> (idiots) so I couldn't. I now work only 40 hours a week so now I have some
> time to myself and was thinking about rewriting it from memory and putting
> it out there. I would put out the one that is threaded and it would prob
> just be for x86 Linux, which should build on Solaris, HP-UX, etc.
>