dr for hobbit

Phil Wild philwild at gmail.com
Mon May 19 08:39:11 CEST 2008


Hi all,

I am redesigning the method we use for performing a failover to a disaster
recovery installation of hobbit. I am interested in opinions on the approach
and any shortcomings.

Note: This is not HA/clustering, it is for DR purposes.

We are aiming to have:

a production hobbit deployment
a DR hobbit deployment

Clients will be configured to send metrics to both servers, which will keep
historical RRD data etc. up to date on both.
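
To illustrate the dual reporting, here is a minimal Python sketch that sends the same status message to two servers. It assumes the usual Hobbit/BB listener on TCP port 1984 and the plain-text "status" message format. The server names in the usage note are made up, and as far as I know a stock client would normally do this through the BBDISP / BBDISPLAYS client settings rather than through code like this.

```python
import socket

HOBBIT_PORT = 1984  # assumed default Hobbit/BB listener port


def build_status(host, test, color, text):
    """Format a status message; dots in the hostname become commas."""
    return "status %s.%s %s %s" % (host.replace(".", ","), test, color, text)


def send_to_all(servers, message, timeout=5.0):
    """Send the same message to every server.

    Returns a dict mapping each server to None on success,
    or to the error text if the send failed.
    """
    results = {}
    for server in servers:
        try:
            with socket.create_connection((server, HOBBIT_PORT), timeout=timeout) as sock:
                sock.sendall(message.encode("ascii", "replace"))
            results[server] = None
        except OSError as exc:
            # One server being down must not stop delivery to the other.
            results[server] = str(exc)
    return results
```

Usage would be something like send_to_all(["hobbit-prod.example.com", "hobbit-dr.example.com"], build_status("www.example.com", "cpu", "green", "load ok")), with both hostnames being placeholders.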

The production server will be configured to send out alerts. The dr server
will not.

At regular intervals, rsync will be used to synchronise data from the
production server to the DR server, including the checkpoint file that
holds hobbitd's in-memory state.
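
The periodic sync could be sketched as below, run from cron on the DR server and pulling from production. The host name and all paths in the usage note are assumptions for illustration only; substitute whatever your installation uses. Only the standard rsync options -a and --dry-run are used.

```python
import subprocess


def build_sync_commands(prod_host, paths, dry_run=False):
    """Build one rsync pull command per path.

    Each path is copied from prod_host to the same location locally.
    Directory paths should end in "/" so rsync copies their contents.
    """
    commands = []
    for path in paths:
        cmd = ["rsync", "-a"]  # -a preserves times, permissions and ownership
        if dry_run:
            cmd.append("--dry-run")
        cmd += ["%s:%s" % (prod_host, path), path]
        commands.append(cmd)
    return commands


def run_sync(prod_host, paths, dry_run=False):
    """Run each rsync command in turn; raises CalledProcessError on failure."""
    for cmd in build_sync_commands(prod_host, paths, dry_run):
        subprocess.run(cmd, check=True)
```

For example, run_sync("hobbit-prod.example.com", ["/var/lib/hobbit/rrd/", "/var/lib/hobbit/tmp/hobbitd.chk"]), where both the host and the paths are hypothetical. Note that this overwrites the DR copies of the RRD files with the production copies.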

In a DR event, the DR hobbit server will be promoted to active by
restarting hobbit so that it loads the checkpoint and the alert configuration.

I am expecting that this will ensure that the DR server will be "up to date"
with production as of the last synchronised checkpoint. This includes tests
that have been disabled or acknowledged.
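
How "up to date" the DR server is depends on the age of the last checkpoint it received. A minimal sketch of a freshness check follows, assuming the file is copied with rsync -a so that its modification time on the DR side is the time production wrote it. The function names and the threshold are illustrative, not part of hobbit.

```python
import os
import time


def checkpoint_age_seconds(path, now=None):
    """Return how many seconds ago the checkpoint file was last written."""
    now = time.time() if now is None else now
    return max(0.0, now - os.path.getmtime(path))


def is_fresh(path, max_age_seconds, now=None):
    """True if the checkpoint is no older than max_age_seconds."""
    return checkpoint_age_seconds(path, now) <= max_age_seconds
```

A check like is_fresh(checkpoint_path, 900) before promoting the DR server would flag a sync that has silently stopped working; the 900 second limit is an arbitrary example.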

Prior to failback to the production hobbit installation, the reverse of the
above would be performed.
An rsync of rrd data files would be performed to cover any windows where one
of the servers was offline for a period of time.

Is there anything wrong with this approach?

Cheers

Phil



-- 
Tel: 0400 466 952
Fax: 0433 123 226
email: philwild AT gmail.com