[hobbit] Current development plans

Thu Jun 9 18:54:28 CEST 2005

On Thu, Jun 09, 2005 at 07:41:42AM +0200, Henrik Stoerner wrote:
> It seems version 4.0 has reached some stability - there are still a few
> odd bug reports, but nothing that looks like major problems. So I
> thought it would be worthwhile to let you know what my plans are.

Well, lots of responses so I'll try to pick out some trends and respond 
to them in one place.

The Hobbit client
=================

Adam Goryachev :
> Personally, I'd most like to see a 'free' client (ie, GPL, without the
> BB license issue)

Right now the vote seems to be in favour of working on the client.
It certainly will be GPL.

Craig Cook :
> Instead of writing a hobbit client from scratch, it may be worth looking at
> modifying the Nagios client.

That might be an idea, yes. I haven't spent much time with Nagios, so I am not 
really familiar with their architecture. I had the impression it was more SNMP
focused than BB/Hobbit.

SNMP - or "How to collect data"
===============================
Adam Goryachev :
> I'd also like to see *much* better SNMP support.

Craig Cook :
> Building SNMP support into core hobbit would be a good idea, it is also non
> trivial.

Daniel J McDonald :
> I'd really like the bb-central approach.  Most of the status
> information can be grabbed from non-privileged accounts on all unix-like
> platforms.  I concede that a client is necessary in the windows world.

There's no doubt that some sort of support in Hobbit for collecting data via SNMP
would be very useful. However, I believe that's better implemented as a stand-alone
tool, somewhat like the network tester. It would obviously rely on some
library like Net-SNMP 5 for the dirty stuff of talking SNMP, meaning it would
support SNMP v1, v2c and v3 automatically. (I have at one point implemented
SNMP daemons and MIB instrumentation from scratch, and I'd rather not repeat
that exercise :-))

However, I don't want to base a Hobbit client on SNMP or any other central
polling-style method of data collection. There are at least two reasons for that:

1) It doesn't scale well.
My main setup has over 2000 boxes to monitor. Doing that from one central server
would mean polling 7 systems per second - that just won't work. There are always
some servers that are down causing timeouts... whether it happens via SNMP, ssh
or some other protocol really doesn't matter. It probably works fine for a setup
with 50 or even 100 systems, but not for me.

2) The central server needs to know about all kinds of systems
If my central polling server runs an "ssh hobbit@clientIP uptime;df;ps" - then
the central server must know how to interpret the output. That's one of the
major design problems in Hobbit currently - every time Redhat^Wsomeone comes
up with a new layout for the "vmstat" output, Hobbit needs to be modified to
recognize these data.
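
To put rough numbers on point 1, here is a small back-of-the-envelope sketch.
Only the 2000 hosts and the 300-second update interval come from this mail;
the share of hosts that are down, the time for a successful poll and the
timeout are made-up values chosen purely for illustration.

```python
def required_poll_rate(hosts, interval_s):
    """Polls per second needed to visit every host once per interval."""
    return hosts / interval_s


def sequential_sweep_time(hosts, down_fraction, ok_time_s, timeout_s):
    """Seconds for one poller to visit every host, one after another."""
    down = hosts * down_fraction
    up = hosts - down
    return up * ok_time_s + down * timeout_s


# 2000 hosts and 300 s are from the mail; the rest are assumptions.
rate = required_poll_rate(2000, 300)
sweep = sequential_sweep_time(
    hosts=2000,
    down_fraction=0.02,  # assumption: 2% of hosts not answering
    ok_time_s=0.1,       # assumption: 100 ms per successful poll
    timeout_s=10.0,      # assumption: 10 s before giving up on a host
)

print(f"needed rate: {rate:.1f} polls/second")
print(f"one sequential sweep: {sweep:.0f} seconds")
```

With these assumed values a single sequential sweep takes longer than the
300-second interval, and most of that time is spent waiting on the few hosts
that never answer. Polling in parallel helps, but the timeouts still have to
be waited out somewhere.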

So my idea currently is to design a new type of client. It won't generate "status"
messages, it will just collect data. Imagine a client that just sends Hobbit a
"client data" package, like

   os: Linux
   osver: 2.6.11
   osid: Debian/Sarge i386
   uptime: 173201 seconds
   loadaverage: 0.4
   filesystem /: 26102 MB, 71% used
   inodes /: 10291029 total, 21% used
   filesystem /var: 102400 MB, 50% used
   inodes /var: 40182910 total, 7% used

This is one well-defined format that Hobbit needs to recognize, and based on these
data it can match e.g. filesystem utilisation against a configuration file on the
Hobbit server and generate the necessary status messages - so the end result will
look exactly like what you have today, but with much less complexity in how e.g.
the RRD handlers need to know about the types of systems that report into Hobbit.
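
As a rough illustration of the server side, here is a minimal Python sketch
that parses the "filesystem" lines of the sample above and matches them
against thresholds. The function names, the threshold table and its values
are all hypothetical; none of this is existing Hobbit code, and the real
configuration file format has not been designed yet.

```python
import re

SAMPLE = """\
   os: Linux
   osver: 2.6.11
   osid: Debian/Sarge i386
   uptime: 173201 seconds
   loadaverage: 0.4
   filesystem /: 26102 MB, 71% used
   inodes /: 10291029 total, 21% used
   filesystem /var: 102400 MB, 50% used
   inodes /var: 40182910 total, 7% used
"""

# Matches e.g. "filesystem /var: 102400 MB, 50% used"
FS_RE = re.compile(r"^filesystem\s+(\S+):\s+(\d+)\s+MB,\s+(\d+)%\s+used$")

SEVERITY = {"green": 0, "yellow": 1, "red": 2}


def parse_filesystems(message):
    """Return {mountpoint: percent_used} from a client data message."""
    usage = {}
    for line in message.splitlines():
        match = FS_RE.match(line.strip())
        if match:
            usage[match.group(1)] = int(match.group(3))
    return usage


def disk_status(usage, thresholds, default=(90, 95)):
    """Return (overall colour, {mountpoint: colour}).

    thresholds maps a mountpoint to (warn_pct, panic_pct); mountpoints
    without an entry use the default pair. All values are hypothetical.
    """
    worst = "green"
    per_mount = {}
    for mount, pct in usage.items():
        warn, panic = thresholds.get(mount, default)
        if pct >= panic:
            colour = "red"
        elif pct >= warn:
            colour = "yellow"
        else:
            colour = "green"
        per_mount[mount] = colour
        if SEVERITY[colour] > SEVERITY[worst]:
            worst = colour
    return worst, per_mount


usage = parse_filesystems(SAMPLE)
overall, detail = disk_status(usage, {"/var": (45, 60)})
print(overall, detail)
```

The point is that the parser only has to know this one format; everything
OS-specific stays on the client.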

The only drawback is that the client becomes slightly more advanced, but not much;
it's really just formatting the information differently before sending it off to
the Hobbit server.
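
To show how little formatting is involved, here is a sketch of the client
side in Python. It is only an illustration under assumptions: the function
names are invented, and a real client would be written per platform. The
probe uses os.statvfs, which is only available on Unix-like systems.

```python
import os


def filesystem_lines(mount, size_mb, used_pct, inodes_total, inodes_used_pct):
    """Format one filesystem in the layout of the sample message."""
    return [
        f"filesystem {mount}: {size_mb} MB, {used_pct}% used",
        f"inodes {mount}: {inodes_total} total, {inodes_used_pct}% used",
    ]


def probe_filesystem(mount):
    """Collect size and usage for one mountpoint (Unix-like systems only)."""
    st = os.statvfs(mount)
    size_mb = st.f_blocks * st.f_frsize // (1024 * 1024)
    used_pct = 100 * (st.f_blocks - st.f_bfree) // st.f_blocks if st.f_blocks else 0
    # Some filesystems report zero inodes; avoid dividing by zero.
    inode_pct = 100 * (st.f_files - st.f_ffree) // st.f_files if st.f_files else 0
    return filesystem_lines(mount, size_mb, used_pct, st.f_files, inode_pct)


def format_client_data(os_name, os_version, os_id, uptime_s, load_avg, fs_lines):
    """Build the complete "client data" message as one string."""
    lines = [
        f"os: {os_name}",
        f"osver: {os_version}",
        f"osid: {os_id}",
        f"uptime: {uptime_s} seconds",
        f"loadaverage: {load_avg}",
    ]
    lines.extend(fs_lines)
    return "\n".join(lines) + "\n"


# Fixed values taken from the sample message, so the output is predictable.
message = format_client_data(
    "Linux", "2.6.11", "Debian/Sarge i386", 173201, 0.4,
    filesystem_lines("/", 26102, 71, 10291029, 21),
)
print(message, end="")
```

On a live system the last argument would be built from probe_filesystem()
calls, one per mounted filesystem.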

Another very nice thing about this is that you can easily (well, relatively)
write Hobbit modules that handle new kinds of information.

And it can be done without breaking compatibility with the existing clients,
so you can run a mix of BB and Hobbit clients without any problems.

Encryption/authentication and compression of status messages
============================================================
Adam Goryachev :
> Finally, what about some sort of compression/encryption protocol, so
> that it is possible to do more frequent test/report without using so
> much bandwidth?

Daniel J McDonald :
> If we are building an extended protocol, we should support
> authentication as well.

There's already some IP-based access controls built into Hobbit; see the
hobbitd man-page for the --status-senders, --maint-senders, --www-senders,
--admin-senders options. The first one should be sufficient to block
most attempts at sending fake status messages - an attacker would need to
break into your network test server and send the fake messages from there.
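
For readers unfamiliar with this kind of check, the idea can be sketched in a
few lines of Python. This is not how hobbitd implements it; the function name
and the assumption that the list is a comma-separated set of addresses or
address/prefix networks are mine, so check the man-page for the real syntax.

```python
import ipaddress


def sender_allowed(sender, allowed_list):
    """Return True if sender matches any entry in a comma-separated list.

    Each entry is assumed to be a single address ("192.168.1.5") or a
    network in prefix notation ("10.0.0.0/8").
    """
    address = ipaddress.ip_address(sender)
    for entry in allowed_list.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # strict=False accepts entries that have host bits set.
        if address in ipaddress.ip_network(entry, strict=False):
            return True
    return False


# Hypothetical list: one whole network plus one single host.
ALLOWED = "10.0.0.0/8, 192.168.1.5"

print(sender_allowed("10.1.2.3", ALLOWED))
print(sender_allowed("192.168.1.6", ALLOWED))
```

Note that this only checks the source address of the connection, which is
why it blocks most fake messages but is not real authentication.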

However, authentication could be nice. I am tempted to handle both of these
problems with one solution - and just implement an SSL-encrypted protocol
where you can then use client-side certificates for authentication. That
will be significant overhead on the processing side, but the good thing
is that you can offload SSL to hardware devices fairly easily (and OpenSSL
does support that kind of hardware).
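
To make the client-certificate idea concrete, here is a sketch using Python's
standard ssl module (which wraps OpenSSL). It only shows the shape of the
setup; Hobbit itself would do this in C, nothing like it exists in Hobbit
today, and the function name and file arguments are placeholders.

```python
import ssl


def make_server_context(certfile=None, keyfile=None, cafile=None):
    """Build a server-side context that refuses clients without a certificate.

    certfile/keyfile identify the server; cafile holds the CA that signed
    the client certificates. All three are placeholder arguments.
    """
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Require every connecting client to present a certificate that
    # validates against the configured CA.
    context.verify_mode = ssl.CERT_REQUIRED
    if certfile:
        context.load_cert_chain(certfile, keyfile)
    if cafile:
        context.load_verify_locations(cafile=cafile)
    return context


# Without file arguments this only demonstrates the settings.
context = make_server_context()
print(context.verify_mode)
```

A listening socket wrapped with such a context rejects the handshake for any
client lacking a valid certificate, so encryption and authentication come
from the same mechanism.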

Compression ... is it necessary? All of the status messages in my setup
combined are about 6 MB for 2000 servers - i.e. 3 KB/server, which gets
updated every 300 secs (on average). So that's 10 bytes/second per server,
or 80 bits/second; rounding up, a rough bandwidth estimate for Hobbit would
be 100 bps per server monitored. For a LAN, that's peanuts.
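
The estimate is easy to re-run for your own setup. A small sketch, using the
figures above and decimal units (1 MB = 1,000,000 bytes); the function name
is just for illustration:

```python
def bandwidth_per_server(total_bytes, servers, interval_s):
    """Return (bytes per server, bytes/second, bits/second) per server."""
    per_server = total_bytes / servers
    bytes_per_second = per_server / interval_s
    return per_server, bytes_per_second, bytes_per_second * 8


# Figures from the mail: 6 MB of status data, 2000 servers, 300 s interval.
per_server, byte_rate, bit_rate = bandwidth_per_server(6_000_000, 2000, 300)

print(f"{per_server:.0f} bytes per server")
print(f"{byte_rate:.0f} bytes/second per server")
print(f"{bit_rate:.0f} bits/second per server")
print(f"{bit_rate * 2000 / 1000:.0f} kbit/second for all 2000 servers")
```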

Well, thanks for the feedback - it's really good to learn what ideas and
problems are the important ones.

Regards,
Henrik