
HA-Hobbit with Linux HA 2



Hi folks,

As I wrote to the list a few days ago, we're running Hobbit on a Linux-HA 2 cluster with DRBD. It runs very nicely and the setup is not very complicated, so, as promised, here are some notes in case you'd like to make your Hobbit highly available. (This is more or less a summary of our installation, written from memory and not tested; maybe someone can test it and write a real step-by-step guide.)

1. Install Linux-HA on both nodes
2. Build a DRBD resource for the Hobbit files, as described in http://www.drbd.org/docs/about/
3. Set up a Hobbit group
4. Set up a failover IP resource in the Hobbit group, with a DNS entry if you have DNS
5. Install apache, and set up an apache failover resource in the Hobbit group for it
6. Set up a colocation constraint that binds the Hobbit group to the DRBD resource
7. Set up an order constraint, so that DRBD is started before Hobbit starts
8. Set up a Hobbit user as usual, and link its home directory to the DRBD resource. This way, all Hobbit config files, data files, etc. are on DRBD, and you don't have to fiddle around with syncing the configs etc.
9. Configure Hobbit sources as usual, with some notes:

- Use the failover IP as Hobbit's server IP
- Everywhere you would enter the Hobbitserver's name, enter the name of the failover IP (e.g., for the web-based stuff)

10. Try to start Hobbit: It should run as normal.
11. Stop it, switch everything to the other node, and try to start Hobbit again. If everything's okay, stop it, and you're ready for the final step:

12. Copy the attached script "hobbit" to /usr/lib/ocf/resource.d/heartbeat/ ON BOTH NODES

13. Set up a Hobbit resource (e.g. by using hb_gui and selecting "hobbit", which should be visible after a restart of the GUI).

14. Start the Hobbit resource
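
For step 8, here is a minimal sketch of linking the Hobbit home directory to the DRBD-backed filesystem. The function name link_home_to_drbd and the example mount point /drbd/hobbit are assumptions for illustration, not part of the installation described here. Since the symlink itself lives on each node's local filesystem, it presumably has to be created on both nodes.

```shell
#!/bin/sh
# Sketch only: point the Hobbit home directory at the DRBD-backed filesystem.
# Usage: link_home_to_drbd HOME_DIR DRBD_DIR
link_home_to_drbd() {
    home_dir=$1
    drbd_dir=$2

    # The target directory must exist. Note that this does not prove
    # that the DRBD device is actually mounted there.
    if [ ! -d "$drbd_dir" ]; then
        echo "$drbd_dir does not exist" >&2
        return 1
    fi

    # Refuse to replace a real directory; move its contents by hand first.
    if [ -e "$home_dir" ] && [ ! -L "$home_dir" ]; then
        echo "$home_dir exists and is not a symlink" >&2
        return 1
    fi

    # -n replaces an existing symlink instead of descending into it.
    ln -sfn "$drbd_dir" "$home_dir"
}
```

Called as root, e.g. `link_home_to_drbd /home/hobbit /drbd/hobbit`, where /drbd/hobbit stands for wherever your DRBD filesystem is mounted.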

That's it: your Hobbit should now be redundant and monitored by the OCF system.
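
If something does not work, it can help to run the agent by hand, ideally before the Hobbit resource is put under cluster control in step 13. The following is only a sketch: the helper names ocf_rc_name and run_ra and the RA and BBHOME_DIR overrides are made up for illustration. The numeric codes are the standard OCF return codes, and the default paths are the ones used in this mail.

```shell
#!/bin/sh
# Map an OCF exit code to its symbolic name (standard OCF return codes).
ocf_rc_name() {
    case "$1" in
        0) echo OCF_SUCCESS ;;
        1) echo OCF_ERR_GENERIC ;;
        2) echo OCF_ERR_ARGS ;;
        3) echo OCF_ERR_UNIMPLEMENTED ;;
        7) echo OCF_NOT_RUNNING ;;
        *) echo "unknown ($1)" ;;
    esac
}

# Run one action of the agent with the environment it expects and
# print the decoded result. RA and BBHOME_DIR are optional overrides.
run_ra() {
    action=$1
    ra=${RA:-/usr/lib/ocf/resource.d/heartbeat/hobbit}
    OCF_ROOT=/usr/lib/ocf \
    OCF_RESKEY_bbhome=${BBHOME_DIR:-/home/hobbit/server} \
        "$ra" "$action"
    rc=$?
    echo "$action: $(ocf_rc_name "$rc")"
    return "$rc"
}
```

As root (the agent uses su), `run_ra monitor` on the active node should report OCF_SUCCESS while Hobbit is running, and OCF_NOT_RUNNING once it is stopped.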

Some final notes:

- Be sure to use the failover IP as BBDISPLAY on all your clients
- NEVER start/stop/restart Hobbit with hobbit.sh, since you could end up with two running Hobbit instances (the cluster will see that hobbitd is down, and start it itself). reload is (obviously) okay, since it will not restart the process.
- If you'd like to monitor your Hobbit nodes themselves, install a second Hobbit (client) instance under a different user (e.g. hobbitmon), which is not under cluster control. Without this, only the active node will be monitored, since the server (and the client) runs only on this node.
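
To make the hobbit.sh warning harder to violate by accident, one option is a small wrapper for interactive use. This is a sketch and not part of the installation described here: the function name safe_hobbit and the HOBBIT_SH override are assumptions, and only the start/stop/restart/reload subcommands are taken from the note above.

```shell
#!/bin/sh
# Sketch only: refuse the hobbit.sh subcommands that would fight with the
# cluster, and pass everything else (reload, status, ...) through.
safe_hobbit() {
    case "$1" in
        start|stop|restart)
            echo "refusing '$1': Hobbit is under cluster control" >&2
            return 1
            ;;
        *)
            # HOBBIT_SH is a hypothetical override; the default is the
            # hobbit.sh location used in this mail.
            "${HOBBIT_SH:-/home/hobbit/server/hobbit.sh}" "$@"
            ;;
    esac
}
```

Put into the hobbit user's shell profile, `safe_hobbit reload` still works, while `safe_hobbit stop` only prints a reminder; starting and stopping is then done through the cluster tools.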

HTH,

hh

--
Harald Husemann
Netzwerk- und Systemadministrator
Operation Management Center (OMC)
MATERNA GmbH
Information & Communications

Westfalendamm 98
44141 Dortmund

Geschäftsführer: Dr. Winfried Materna, Helmut an de Meulen, Ralph Hartwig
Amtsgericht Dortmund HRB 5839

Tel: +49 231 9505 222
Fax: +49 231 9505 100
www.annyway.com <http://www.annyway.com/>
www.materna.com <http://www.materna.com/>
#!/bin/sh
#
#
#	Hobbit OCF RA. Start, stop, migrate and monitor hobbit
#

#######################################################################
# Initialization:

. ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs

#######################################################################

meta_data() {
	cat <<END
<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="Hobbit" version="1.0">
<version>1.0</version>

<longdesc lang="en">
This is an agent for the hobbit monitoring system.
</longdesc>
<shortdesc lang="en">Hobbit resource agent</shortdesc>

<parameters>
<parameter name="bbhome" unique="1">
<longdesc lang="en">
Location of the bbhome-directory, default is /home/hobbit/server
</longdesc>
<shortdesc lang="en">BBHOME-Directory</shortdesc>
<content type="string" default="/home/hobbit/server" />
</parameter>

</parameters>

<actions>
<action name="start"        timeout="90" />
<action name="stop"         timeout="100" />
<action name="monitor"      timeout="20" interval="10" depth="0" start-delay="30" />
<action name="reload"       timeout="90" />
<action name="migrate_to"   timeout="100" />
<action name="migrate_from" timeout="90" />
<action name="meta-data"    timeout="5" />
<action name="validate-all" timeout="30" />
</actions>
</resource-agent>
END
}

#######################################################################

# don't exit on TERM, to test that lrmd makes sure that we do exit
#trap sigterm_handler TERM
#sigterm_handler() {
#	ocf_log info "They use TERM to bring us down. No such luck."
#	return
#}

hobbit_usage() {
	cat <<END
usage: $0 {start|stop|monitor|reload|migrate_to|migrate_from|validate-all|meta-data}

Expects to have a fully populated OCF RA-compliant environment set.
END
}

hobbit_start() {
    hobbit_monitor
    if [ $? =  $OCF_SUCCESS ]; then
	ocf_log info "Hobbit already running"
	return $OCF_SUCCESS
    fi
    su - hobbit -c "${OCF_RESKEY_bbhome}/hobbit.sh start"
    hobbit_monitor
    if [ $? =  $OCF_SUCCESS ]; then
	ocf_log info "Hobbit started"
	return $OCF_SUCCESS
    else
        ocf_log error "Unable to start hobbit"
	return $OCF_ERR_GENERIC
    fi
}

hobbit_stop() {
    hobbit_monitor
    if [ $? =  $OCF_SUCCESS ]; then
        su - hobbit -c "${OCF_RESKEY_bbhome}/hobbit.sh stop"
	hobbit_monitor
        if [ $? =  $OCF_SUCCESS ]; then
            ocf_log error "Hobbit NOT stopped!"
	    return $OCF_ERR_GENERIC
	else
	   ocf_log info "Hobbit stopped"
	   return $OCF_SUCCESS
	fi
    else
       ocf_log info "Hobbit already stopped"
       return $OCF_SUCCESS
    fi
}

hobbit_monitor() {
	# Monitor _MUST!_ differentiate correctly between running
	# (SUCCESS), failed (ERROR) or _cleanly_ stopped (NOT RUNNING).
	# That is THREE states, not just yes/no.
	# NOTE: this check only reports two of them (running / not running).
	# Use the configured bbhome instead of a hard-coded path.
	su - hobbit -c "${OCF_RESKEY_bbhome}/hobbit.sh status 2>&1" | /bin/grep -q "Hobbit (hobbitlaunch) running with PID"
        if [ $? = 0 ]; then
	    return $OCF_SUCCESS
	else
	    return $OCF_NOT_RUNNING
	fi
}

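hobbit_validate() {
    # The agent is only usable if hobbit.sh exists and is executable.
    # (The former state_dir/touch lines used an undefined OCF_RESKEY_state
    # parameter and only left stray files behind.)
    if [ -x "$OCF_RESKEY_bbhome/hobbit.sh" ]; then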
	return $OCF_SUCCESS
    else
       return $OCF_ERR_ARGS
    fi
}

: ${OCF_RESKEY_bbhome=/home/hobbit/server}

case $__OCF_ACTION in
meta-data)	meta_data
		exit $OCF_SUCCESS
		;;
start)		hobbit_start;;
stop)		hobbit_stop;;
monitor)	hobbit_monitor;;
migrate_to)	ocf_log info "Migrating ${OCF_RESOURCE_INSTANCE} to ${OCF_RESKEY_CRM_meta_migrate_to}."
	        hobbit_stop
		;;
migrate_from)	ocf_log info "Migrating ${OCF_RESOURCE_INSTANCE} from ${OCF_RESKEY_CRM_meta_migrated_from}."
	        hobbit_start
		;;
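reload)		ocf_log info "Reloading..."
		# hobbit_start is a no-op while Hobbit is running, so ask
		# hobbit.sh to reload instead; this does not restart the process.
	        su - hobbit -c "${OCF_RESKEY_bbhome}/hobbit.sh reload"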
		;;
validate-all)	hobbit_validate;;
usage|help)	hobbit_usage
		exit $OCF_SUCCESS
		;;
*)		hobbit_usage
		exit $OCF_ERR_UNIMPLEMENTED
		;;
esac
rc=$?
ocf_log debug "${OCF_RESOURCE_INSTANCE} $__OCF_ACTION : $rc"
exit $rc