Re: [hobbit] Advice on how to handle HA monitoring

To: hobbit (at) hswn.dk
Subject: Re: [hobbit] Advice on how to handle HA monitoring
From: Charles Jones <jonescr (at) cisco.com>
Date: Fri, 21 Sep 2007 13:06:06 -0700

So you are using a custom script to monitor instead of hobbitclient-*.sh?

This really isn't an option for me, since I have literally dozens of servers that all have the same hobbit homedir, although I have written some custom scripts that check the hostname and only run if they are launched on the host that they need to be on.
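
For illustration, a hostname guard of the kind mentioned above might look roughly like the following. This is only a sketch, not the actual script: the host names in ALLOWED_HOSTS and the helper names are made up for the example.

```python
import socket

# Hypothetical list of hosts this check is allowed to run on.
ALLOWED_HOSTS = {"hosta", "hostb"}


def short_name(hostname):
    """Strip any domain part and lowercase, so 'HostA.example.com' -> 'hosta'."""
    return hostname.split(".")[0].lower()


def should_run(hostname, allowed):
    """True if this script was launched on one of the allowed hosts."""
    return short_name(hostname) in allowed


def main():
    if not should_run(socket.gethostname(), ALLOWED_HOSTS):
        # Wrong host: do nothing, so the shared homedir can carry this
        # script to every server without side effects.
        return
    # The host-specific check itself would go here.
    print("running host-specific check on", socket.gethostname())


if __name__ == "__main__":
    main()
```

Because every server sees the same file, the decision has to be made at run time from the local hostname rather than from anything stored in the shared directory.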

I think I will just have the oncall persons manually edit hobbit-clients.cfg in the case of a failover (oncall gets woken up anyhow). They can just uncomment/comment definitions for whichever host is the master.

It would be nice if you could set dependencies for PROC tests; then I could just make all of the PROC tests dependent upon something, like one of the failover daemons, or a flag on the filesystem, etc.


-Charles

Haertig, David F (Dave) wrote:

I do this with a custom monitoring script (I don't use the standard
Hobbit 'procs' test).

There should be something you can check via script that tells you if a
server is primary or not.  In my case, a database filesystem is mounted
on the primary but not on the secondary.  So my script uses 'df' to look
for that filesystem.  You could use 'mount' as well.  If that database
filesystem is mounted the script does the normal test for processes and
reports red/green.  But if it's not mounted, the script reports a clear
condition.
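
A rough sketch of that logic, for anyone who wants a starting point. This is not the script described above. The mount point, process names, and test name are placeholders, the process match is a crude substring search over 'ps -ef' output, and it assumes the Hobbit client exports BB, BBDISP and MACHINE to scripts it launches.

```python
import os
import subprocess
import time

DB_MOUNT = "/db"                       # hypothetical database mount point
REQUIRED_PROCS = ["oracle", "httpd"]   # hypothetical process names
TEST_NAME = "haprocs"                  # hypothetical status column name


def is_mounted(mount_point, df_output):
    """True if mount_point is the last column of any 'df -P' data line."""
    for line in df_output.splitlines()[1:]:   # skip the header line
        fields = line.split()
        if fields and fields[-1] == mount_point:
            return True
    return False


def missing_procs(required, ps_output):
    """Return required names that appear in no line of the ps output."""
    lines = ps_output.splitlines()
    return [name for name in required
            if not any(name in line for line in lines)]


def decide_color(mounted, missing):
    """clear on the secondary; red/green on the primary."""
    if not mounted:
        return "clear"
    return "red" if missing else "green"


def main():
    df_out = subprocess.run(["df", "-P"], capture_output=True, text=True).stdout
    ps_out = subprocess.run(["ps", "-ef"], capture_output=True, text=True).stdout
    mounted = is_mounted(DB_MOUNT, df_out)
    missing = missing_procs(REQUIRED_PROCS, ps_out) if mounted else []
    color = decide_color(mounted, missing)
    if not mounted:
        detail = "%s not mounted, assuming secondary" % DB_MOUNT
    elif missing:
        detail = "missing: " + ", ".join(missing)
    else:
        detail = "all processes running"
    message = "status %s.%s %s %s\n%s\n" % (
        os.environ["MACHINE"], TEST_NAME, color, time.ctime(), detail)
    # Hand the status message to the bb client program for delivery.
    subprocess.run([os.environ["BB"], os.environ["BBDISP"], message])


# Only report when launched with the Hobbit client environment present.
if __name__ == "__main__" and "BB" in os.environ:
    main()
```

Since the same script runs on both nodes and decides from local state, nothing in a shared homedir has to change at failover.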

-----Original Message-----
From: Charles Jones [mailto:jonescr (at) cisco.com]
Sent: Friday, September 21, 2007 1:12 PM
To: hobbit (at) hswn.dk
Subject: [hobbit] Advice on how to handle HA monitoring

We have 2 hosts, HostA and HostB. They are part of an HA cluster via HP
ServiceGuard. There is a virtual IP and DNS name of "virtual" that
automatically goes to whichever of HostA and HostB is the primary at the
time.

I am currently monitoring both HostA and HostB via Hobbit.  Currently
HostA is the primary, and I am doing various PROC checks. Currently on
HostB, I am not doing process checks.

My problem is, how do I smoothly handle a failover scenario (HostB
becoming the primary)?  When a failover occurs, all of the procs on
HostA are stopped (either by the server crashing, or manually by
ServiceGuard), and the same procs are started up on HostB.

I'm trying to think of ways to monitor both hosts, but only monitor
procs on the one that is primary. So far the best I can come up with is
to run the hobbit clients in local mode, and maybe have the ServiceGuard
scripts swap out the config files and restart the Hobbit clients when
there is a failover. That would probably work, BUT in this case the
Hobbit homedir is also the same (SAN mount) on both machines, so moving
or editing a file on one does the same on the other :(

Simply shutting down the hobbit client on the non-primary is not an
option, as then it would no longer be monitored at all.

Any ideas? :)

-Charles

To unsubscribe from the hobbit list, send an e-mail to
hobbit-unsubscribe (at) hswn.dk


Follow-Ups:
- Re: [hobbit] Advice on how to handle HA monitoring
  - From: Asif Iqbal
- RE: [hobbit] Advice on how to handle HA monitoring
  - From: Haertig, David F (Dave)

References:
- Advice on how to handle HA monitoring
  - From: Charles Jones
- RE: [hobbit] Advice on how to handle HA monitoring
  - From: Haertig, David F (Dave)
