[hobbit] bbfix functionality in hobbit

Kauffman, Tom KauffmanT at nibco.com
Fri Jul 21 19:25:33 CEST 2006


Yup.

There's always the case that the process died for a good reason and it's
going to die again as soon as it restarts. If you don't set up the
restart/recovery routine with a bit of logic to quit after a while, you
can end up with a box that's spending all its cycles just restarting,
dumping, and dying.

I learned that one the hard way :-)

Tom Kauffman
NIBCO, Inc

-----Original Message-----
From: Aiello, Steve (GE, Corporate, consultant)
[mailto:steve.aiello at ge.com] 
Sent: Friday, July 21, 2006 12:11 PM
To: hobbit at hswn.dk
Subject: RE: [hobbit] bbfix functionality in hobbit

I agree completely with Henrik. Adding 'intelligence' into a script is
rather difficult. But I do have a rather 'painful' IIS ASP application,
and frequently IIS will die, hang, or stop processing ASP. So I wrote a
script that runs on each IIS server and queries the HTTP status from
the monitoring server; if the status is red, all dllhost & inetinfo
processes are killed & IIS is started. My fear was that this restart
script could get stuck in a loop, i.e. the possibility that IIS cannot
be started. So I added intelligence to my script: it logs the date &
time of each IIS restart, and it will only restart IIS if it has not
done so more than 3 times in the last hour. Lately I have been
thinking of adding more logic to check whether a restart occurred in
the last 5 minutes (the polling period), because I have seen the
restart script do its job and fix IIS, but the monitoring server has
not checked yet. Thus the restart script bounces IIS again...
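
For illustration, the two checks described above (a cap on restarts per
hour, plus a cool-down covering one polling period) could be sketched in
sh roughly as follows. This is not the actual IIS script; the function
names, the one-timestamp-per-line log format, and reading "3 times in
the last hour" as "block once 3 restarts are already logged" are all
assumptions made for the example.

```shell
#!/bin/sh
# Sketch of a restart guard. All names and limits here are illustrative
# assumptions, not part of Hobbit or of the script described above.

MAX_RESTARTS=3    # block once this many restarts are logged ...
WINDOW=3600       # ... within this many seconds (one hour)
COOLDOWN=300      # also block if we restarted within one poll period

# restart_allowed LOGFILE NOW
# LOGFILE holds one epoch timestamp per line, one line per past restart.
# Returns 0 if a restart is permitted at time NOW, 1 otherwise.
restart_allowed() {
    log=$1
    now=$2
    [ -f "$log" ] || return 0    # no history yet: allow the restart
    awk -v now="$now" -v win="$WINDOW" -v cool="$COOLDOWN" \
        -v max="$MAX_RESTARTS" '
        /^[0-9]+$/ {
            age = now - $1
            if (age < cool) recent = 1   # monitor may not have re-polled yet
            if (age < win)  count++      # restart falls inside the window
        }
        END { exit (recent || count >= max) ? 1 : 0 }
    ' "$log"
}

# record_restart LOGFILE NOW
# Appends the restart time to the history file.
record_restart() {
    echo "$2" >> "$1"
}
```

A restart script would call restart_allowed with its log file and the
current time (for example from "date +%s", which is widely available
but not strictly POSIX) before bouncing the service, and call
record_restart right after doing so. When the guard refuses, the status
simply stays red and a human gets to look at it.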

> -----Original Message-----
> From: Henrik Stoerner [mailto:henrik at hswn.dk] 
> Sent: Friday, July 21, 2006 11:07 AM
> To: hobbit at hswn.dk
> Subject: Re: [hobbit] bbfix functionality in hobbit
> 
> 
> On Fri, Jul 21, 2006 at 10:21:40AM -0400, Gary B. wrote:
> > Is there any method with hobbit to have the client automatically 
> > restart services, like bbfix does for BB?  We would like to be able to
> > restart services that tend to fail often (such as SSH tunnels) 
> > automatically through Hobbit.  Without writing a custom external 
> > script, I can't seem to find any information about doing this.
> 
> You will need to do some scripting, no doubt about that.
> 
> Whether it's a good idea or not ... it depends. From the 
> Hobbit "design" perspective (whoa - that sounds expensive) I 
> have a very firm belief that Hobbit should *monitor* things, 
> not *fix* them. I have seen far too many "intelligent" 
> systems get in the way of real problem-fixing because 
> "intelligent" systems are usually pretty dumb, and cannot 
> handle anything out of the ordinary. When they try, they 
> often fail in spectacular ways.
> 
> And having things happen behind your back - because you 
> forgot about that little automatic script someone else set up 
> 2 years ago - is just plain frustrating.
> 
> With that little sermon as introduction, here's what you can 
> do. On the host(s) where you want to restart these services, 
> write a script to query the Hobbit server for the status of 
> the service. If it's red, do the restart. You can use the 
> Hobbit "query" command to tell what status the service has. 
> E.g. if you want to reset the SSH tunnels when the "tunnels" 
> status goes red, then this little script run from the Hobbit 
> client's clientlaunch.cfg would do it:
> 
>    #!/bin/sh
> 
>    TUNNELSTATUS=`$BB $BBDISP "query $MACHINE.tunnels"|awk '{print $1}'`
>    if test "$TUNNELSTATUS" = "red"; then
>       sudo /etc/init.d/sshtunnels stop
>       sleep 5
>       sudo /etc/init.d/sshtunnels start
>       echo "`date`: SSH tunnels restarted"
>    fi
> 
>    exit 0
> 
> 
> Regards,
> Henrik
> 
> 
> To unsubscribe from the hobbit list, send an e-mail to 
> hobbit-unsubscribe at hswn.dk
> 
> 
> 
