[hobbit] New Hobbit stuff: Scalability and H/A work
    Beau Olivier 
    olivier.beau at telecomitalia.fr
       
    Thu Nov  2 08:51:20 CET 2006
    
    
  
Oh oh oh!!! This looks very interesting :)))
I have 2 questions:
- If the heavy work (RRD updates) is distributed across 2 or more dedicated "RRD servers",
the data won't be redundant on each of these "RRD servers", right?
- What would happen if an "RRD server" crashes?
I recently replaced my 2 Hobbit servers, which now have 2 GB of RAM each.
Would it make sense to have Hobbit evolve to take full advantage of that memory?
Olivier
-----Original Message-----
From: Henrik Stoerner [mailto:henrik at hswn.dk]
Sent: Tuesday, 31 October 2006 17:04
To: hobbit at hswn.dk
Subject: [hobbit] New Hobbit stuff: Scalability and H/A work
A couple of weeks ago, I was asked if our Hobbit system at work could
handle monitoring of one more customer. Of course, I said - no problem.
Well, there was one gotcha: This customer has 1100+ servers that need 
to be monitored. Which means my Hobbit installation is about to double 
in the number of hosts monitored. Hmm ...
This will be interesting to watch. I am fairly confident that Hobbit can
handle it, with one exception: The disks on my Hobbit server will be
overloaded. It already spends about 50% of its time in I/O wait, so
doubling the number of hosts with cpu/memory/disk etc. graphs will
probably crash it.
So something needs to be done - fast: These hosts should go into our
Hobbit before Christmas. That is why I currently may seem a bit absent 
from the mailing list.
The way I plan to handle it will be by distributing the load of RRD
updates onto several servers, each handling a subset of the total set of
hosts; Hobbit will automatically detect which of the 3 RRD-servers
handles a specific host and direct rrd-updates to that server. New
hosts are distributed across all of the RRD servers in a weighted 
round-robin fashion.
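
The assignment scheme described above can be illustrated with a minimal sketch. This is not Hobbit's actual code: the class name `RRDRouter`, the server names, and the weights are all hypothetical, and the simple "expand by weight, then cycle" schedule is only one possible way to do weighted round-robin. It shows the two properties mentioned: a known host always maps to the same RRD server, and new hosts are spread across the servers in proportion to their weights.

```python
from itertools import cycle


class RRDRouter:
    """Hypothetical router: maps each host to one RRD server and remembers it."""

    def __init__(self, weights):
        # weights: server name -> positive integer weight (illustrative values).
        if not weights or any(w <= 0 for w in weights.values()):
            raise ValueError("need at least one server, all weights positive")
        # Repeat each server name 'weight' times, then cycle over that schedule.
        schedule = [name for name, w in weights.items() for _ in range(w)]
        self._next_server = cycle(schedule)
        self._assigned = {}  # host -> server, so a host always stays put

    def server_for(self, host):
        # A host that is already known keeps its server; a new host gets the
        # next server in the weighted round-robin schedule.
        if host not in self._assigned:
            self._assigned[host] = next(self._next_server)
        return self._assigned[host]


if __name__ == "__main__":
    router = RRDRouter({"rrd1": 2, "rrd2": 1, "rrd3": 1})
    for host in ["web01", "web02", "db01", "db02", "web01"]:
        print(host, "->", router.server_for(host))
```

With these assumed weights, "rrd1" receives twice as many new hosts as each of the other two servers, and asking again for "web01" returns the server it was first given.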
The way this is going to be done means that it can be used not only for
distributing the load of the RRD file updates, but also for distributing
the other hobbitd_* modules (alerting, history logs, client data
processing etc). In other words, this will be a major win for Hobbit in
large installations.
It also has one more benefit: I think this can be evolved to handle
automatic failover, so you can run multiple Hobbit servers that process
the same data - meaning all of the on-disk data will be identical across all
of the Hobbit servers. This should make it possible to set up a group of
Hobbit servers for very high availability of the monitoring system. I 
haven't worked out all of the implementation details yet, but I think it
is possible.
Regards,
Henrik