[hobbit] Highlights of the 4.3.0 version

s_aiello at comcast.net
Fri Aug 3 19:06:11 CEST 2007


On Friday 03 August 2007 11:38, Hubbard, Greg L wrote:
> Well, I use Netcool which has the opposite philosophy -- there is a
> "process automation" system that watches processes and restarts them if
> they fail, while also logging restarts.  You can configure a "restart"
> parameter to be anything from 0 (forever) to any number of times.  I
> like to set a reasonable number so persistent errors eventually kill the
> process, but occasional errors do not.  Log files are not overwritten,
> but are appended and rotated.
>
> But whatever.  My view seems to be in the minority -- guess the rest of
> you don't mind 24x7x365 babysitting.
>
> GLH
>

To restart a process safely, some form of intelligence has to be built into 
the restart script, especially when it is recovering from a failure mode. 
Scripts can only have so much intelligence, so a restart script can be 
dangerous unless it is dealing with a simple situation.

Having said all this, I have to admit that I do have scripts that query the 
status of the monitoring server and perform a restart on reds. There should 
be nothing stopping you from implementing the same. It is just a very fine 
line when deciding when and how to implement process restarts.

More often than not, it is much better for a person to react to an alert 
than a script. But for recurring failure modes, these scripts do help, and I 
don't get called at 3 am.

So if you really need to implement restart scripts, just use the bb tool's 
query feature.
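
As a rough illustration, here is a minimal Python sketch of that kind of 
script. It is not my actual script, and it rests on several assumptions: the 
bb client is on the PATH, the reply to a "query" message starts with the 
current color as its first word, and the server address, host name, test name 
and restart command are all placeholders you would supply yourself. The cap 
on restarts borrows Greg's idea of letting persistent errors eventually stop 
the automation.

```python
import subprocess


def parse_color(response):
    """Return the first word of a query reply in lowercase, or None if empty."""
    words = response.split()
    return words[0].lower() if words else None


def query_color(server, host, test, run=subprocess.run):
    """Ask the monitoring server for the current color of host.test.

    Assumes the bb client accepts 'bb SERVER "query HOST.TEST"' and prints
    the first line of the status, with the color as the first word.
    """
    result = run(["bb", server, f"query {host}.{test}"],
                 capture_output=True, text=True, check=True)
    return parse_color(result.stdout)


def should_restart(color, restarts_so_far, max_restarts):
    """Restart only on red, and only while under the restart cap."""
    return color == "red" and restarts_so_far < max_restarts


def check_and_restart(server, host, test, restart_cmd, restarts_so_far,
                      max_restarts=3, run=subprocess.run):
    """Query the status once and restart if warranted.

    restart_cmd is a hypothetical command list, for example an init script
    call. Returns the updated restart count so the caller can persist it.
    """
    color = query_color(server, host, test, run=run)
    if should_restart(color, restarts_so_far, max_restarts):
        run(restart_cmd, check=True)
        return restarts_so_far + 1
    return restarts_so_far
```

Something like this would be called from cron, with the restart count kept 
in a small state file between runs. Once the cap is reached the script stops 
acting and the red stays up for a person to look at.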

 ~Steve


