[hobbit] big brother replacement
thansmann at directpointe.com
Fri Nov 2 00:17:22 CET 2007
I'd be using Henrik's solution as follows, given your situation:
"I run two completely separate systems in parallel, and have the clients
report to both of them. The system at our disaster center has the paging
module disabled (just disable the [bbpage] section in hobbitlaunch.cfg),
to avoid double alerts - it is simple to activate it, if necessary.
"Config files are rsync'ed from the primary site to the disaster site
Though to be honest, this failover script may be something that can be
converted for use in Hobbit. You might be better off going with one of a
dozen options that differ slightly from how you have it set up, but
that's up to you.
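For what it's worth, the parallel setup Henrik describes could be sketched roughly like this. This is only an illustration: the function names, the paths, and the host name are made up, and it assumes a hobbitlaunch.cfg section can be switched off by putting a DISABLED line inside it, which is how I understand the file format to work.

```shell
#!/bin/sh
# Sketch only. Function names, paths and host names are hypothetical.

# Print a copy of a hobbitlaunch.cfg with the [bbpage] section marked
# DISABLED. The input file is not modified, and running the output
# through the function again does not add a second DISABLED line.
disable_bbpage() {
    awk '
        /^\[/ {
            # A new section starts; remember whether it is [bbpage].
            in_bbpage = ($0 == "[bbpage]")
            print
            if (in_bbpage) print "\tDISABLED"
            next
        }
        # Drop a DISABLED line that is already there, to avoid duplicates.
        in_bbpage && /^[ \t]*DISABLED[ \t]*$/ { next }
        { print }
    ' "$1"
}

# Copy config files from the primary to the disaster site, leaving the
# disaster site's own hobbitlaunch.cfg alone so paging stays off there.
# $1 = source directory, $2 = destination (local path or host:path)
sync_config() {
    rsync -az --exclude 'hobbitlaunch.cfg' "$1"/ "$2"/
}

# Example use (assumed paths, run from cron on the primary):
#   sync_config /home/hobbit/server/etc dr-site.example.com:/home/hobbit/server/etc
```

Excluding hobbitlaunch.cfg from the sync is my own choice here, so that a sync never re-enables paging at the disaster site by accident. Activating paging after a real disaster is then a matter of removing the DISABLED line by hand.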
Hobbit doesn't have this built in. That's for sure. I would think it's
fairly easy to use it to get much the same effect, though. I'll wait
for others' responses on your situation and throw my own thoughts back in
From: Sloan [mailto:joe at tmsusa.com]
Sent: Thursday, November 01, 2007 5:03 PM
To: hobbit at hswn.dk
Subject: Re: [hobbit] big brother replacement
Tod Hansmann wrote:
> Let me see if I understand. You have several bb servers at one
> datacenter, each with their twin at the other datacenter, and both
> do the tests. They report to one central display server, but only one
> set reports at a time, depending on failover state, correct?
You have the basic idea, but there is no single central server, just
pairs of bb servers, one to a data center, in each LAN which is being
monitored. For each pair of bb servers, only the server at data center A
does reporting, unless the server in data center B cannot reach the
server in data center A, in which case the server in data center B will
take over the reporting duties until the bb server in data center A
becomes reachable again. While this could theoretically lead to a
split-brain condition, the failover condition has only ever triggered
when there was a WAN outage.
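The takeover rule described above boils down to something like the following. This is a minimal sketch and not the actual bb failover script; the function name and the probe command are assumptions.

```shell
#!/bin/sh
# Sketch only, run on the server in data center B.
# "$@" is any command that succeeds when the server in data center A
# is reachable (the probe itself is a hypothetical choice).
failover_role() {
    if "$@" >/dev/null 2>&1; then
        echo standby   # A is reachable, so A keeps reporting
    else
        echo active    # A is unreachable, so B takes over reporting
    fi
}

# Example use, once per test cycle (host name is made up):
#   role=$(failover_role ping -c 1 bb-a.example.com)
#   [ "$role" = "active" ] && echo "taking over reporting"
```

Because the decision rests only on whether B can reach A, a cut link between the data centers makes B go active while A is still reporting on its own side, which is the split-brain case mentioned above.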
> Is this failover automatic? If so, how is this failover determined?
> What if this failover has a false positive? If not, what is your
> timeframe to swap over?
IIRC it takes one bb cycle to kick in.
We've not seen a false positive, as I mentioned above.
It's just the standard built-in bb failover -
head ~bb/ext/failover follows:
# BIG BROTHER - FAILOVER SCRIPT
# Sean MacGuire
# (c) Copyright Quest Software, Inc. 1997-2003 All rights reserved.
# failover WATCHES BBNET and BBPAGER
# IF BBNET OR BBPAGER BECOMES UNAVAILABLE, THEN TAKE OVER UNTIL THEY
# To use, just add failover to the BBEXT variable in etc/bbdef.sh
# To configure BBPAGER failover:
# define both the primary and failover machines as BBPAGERS in
# and set bbwarn: FAILOVER in etc/bbwarnsetup.cfg