Re: [hobbit] Hobbit client executing a script to be proactive if a problem occurs?
- To: hobbit (at) hswn.dk
- Subject: Re: [hobbit] Hobbit client executing a script to be proactive if a problem occurs?
- From: Chris Wopat <chrisw (at) supranet.net>
- Date: Fri, 11 Apr 2008 12:17:24 -0500
- References: <47FF6DF2.4030306 (at) supranet.net> <258e9b160804110816v5e6dca48h6e430e43dc9fd97b (at) mail.gmail.com>
- User-agent: Mozilla-Thunderbird 2.0.0.12 (X11/20080406)
Phil Wild wrote:
Hi Chris,
I think it really depends on what you are testing? If you are using the
standard hobbit client and the standard tests, most of the client side
is pretty basic, I guess you could call it a dumb client in a way as it
does a simple job of pulling the data out and sends it on without any
intelligent decisions being made about thresholds etc.
To do what you want, you either have to do as you say (set up keys from
the server etc. and have the server perform an action after a threshold
breach, from a script initiated/configured in the hobbit-alerts file).
Bummer, I was hoping there was perhaps a barely documented feature that
would let you exec a script on the client to make my life easier.
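The server-side route could be sketched roughly as below: a script named on a SCRIPT recipient in the hobbit-alerts file reads the alert details from its environment and runs a fix-up command on the client over ssh. This is only a sketch under assumptions: the remote user "hobbitfix", the command path, and the REMEDIATIONS table are all made up for illustration, and the environment variable names (BBHOSTNAME, BBSVCNAME, BBCOLORLEVEL, RECOVERED) should be checked against the hobbit-alerts.cfg man page for your version.

```python
import os
import subprocess

# Hypothetical mapping from Hobbit test (column) name to a fix-up command
# on the client. Both the test name and the path are placeholders.
REMEDIATIONS = {
    "cpu": "/usr/local/sbin/restart-mail-stack",
}


def build_ssh_command(env, remediations=REMEDIATIONS, remote_user="hobbitfix"):
    """Return the ssh argv to run for this alert, or None if nothing applies.

    `env` is a mapping of the variables the alert module is assumed to pass
    to SCRIPT recipients.
    """
    if env.get("RECOVERED") == "1":
        return None  # recovery notice, nothing to fix
    host = env.get("BBHOSTNAME")
    test = env.get("BBSVCNAME")
    if not host or env.get("BBCOLORLEVEL") != "red" or test not in remediations:
        return None
    # BatchMode makes ssh fail instead of prompting when key auth is broken.
    return ["ssh", "-o", "BatchMode=yes", f"{remote_user}@{host}", remediations[test]]


if __name__ == "__main__":
    cmd = build_ssh_command(os.environ)
    if cmd is not None:
        subprocess.run(cmd, check=False, timeout=120)
```

Keeping the decision in build_ssh_command makes it easy to try out with a hand-built dictionary before wiring the script into the alert configuration.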
Or, which would be much simpler, put some code in your monitoring script
to take action, but then you are starting to move away from the
simplicity of hobbit. It becomes even harder if you want to take action
based on something picked up in the standard tests (like CPU that you
mentioned in your post).
Indeed, my intent is for standard tests.
You may need to write your own test/new column
that monitors the same metric but in a different light. In my view, an
automated action based on a detected event probably does not belong in
the monitoring system. If a failure can be expected and an automated
action is known to fix the issue, perhaps that should be built into the
startup process of the application (a watchdog process etc).
Indeed, a daemon shouldn't fail and it should run properly or be fixed
natively. However, what I'm trying to monitor is an item
that has a series of dependencies - Postfix, depending on Amavis,
depending on p0f, depending on ClamAV, depending on greylist software,
depending on database, etc. Under certain circumstances if one of these
were to go down, it ends up snowballing to have high CPU, in my case.
The better way for me to handle this is likely to search logs for items,
instead of relying on high CPU.
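As a rough illustration of the log-search idea, the sketch below counts log lines that match a failure pattern per daemon. The patterns and daemon names are hypothetical placeholders; real log messages differ between versions and setups, so they would need to be taken from the actual logs.

```python
import re

# Hypothetical failure patterns, one per daemon. These are illustrative
# only and are not taken from any particular Postfix or ClamAV release.
FAILURE_PATTERNS = {
    "amavis": re.compile(r"connect to .*10024.*Connection refused", re.I),
    "clamav": re.compile(r"clamd.*(can't connect|connection refused)", re.I),
}


def find_failures(lines, patterns=FAILURE_PATTERNS):
    """Return a dict mapping daemon name to the number of matching lines."""
    hits = {}
    for line in lines:
        for name, pattern in patterns.items():
            if pattern.search(line):
                hits[name] = hits.get(name, 0) + 1
    return hits
```

A wrapper could feed this the tail of the mail log and trigger the restart only when a specific dependency shows up in the result, rather than reacting to high CPU in general.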
Hobbit can
then be used to monitor the log for a restart event, or a failed restart
event etc. Actually, thinking about it more, building the intelligent
action into the agent is an ok idea and you also have the opportunity of
capturing and transmitting additional information about why something
dies if you run an action to fix and it failed etc.
For the scenario I laid out above, I intend to write a script that will
restart the daemons properly in the correct order, but this is the "oh
shit" script, and wouldn't be a system startup script, for example.
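A minimal sketch of such an ordered restart, assuming init-script style commands: everything is stopped in reverse dependency order, then started with dependencies first, and the run aborts at the first start that fails. The service names and the /etc/init.d path are placeholders and would need to match the real system.

```python
import subprocess

# Dependencies first. Names are placeholders for the real init scripts.
START_ORDER = ["database", "greylist", "clamav", "p0f", "amavis", "postfix"]


def restart_plan(start_order=START_ORDER, init_dir="/etc/init.d"):
    """Return the commands to run: stops in reverse order, then starts."""
    stops = [[f"{init_dir}/{name}", "stop"] for name in reversed(start_order)]
    starts = [[f"{init_dir}/{name}", "start"] for name in start_order]
    return stops + starts


def run_plan(plan, runner=subprocess.run):
    """Run each command in order.

    Return the first start command that fails, or None if all starts
    succeed. Failed stops are ignored, since the daemon may already be down.
    """
    for cmd in plan:
        result = runner(cmd)
        if cmd[1] == "start" and result.returncode != 0:
            return cmd
    return None
```

Calling run_plan(restart_plan()) as a user allowed to restart the daemons would perform the restart; the returned command, if any, shows which daemon refused to come back.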
I am waffling... You
still need extra security-related configuration steps on the client, with
sudo or ssh, to get around access permissions to restart something,
as your client is running as the hobbit userid, so this brings the
client configuration closer to an ssh setup on the server. I don't think
either way is perfect but both would do what you want...
Indeed. It sounds like generally the two scenarios I'd mentioned in my
email are the way to get it to work, and whichever is most reliable/less
hack-ish would be the best way to do it.
Perhaps this is something that belongs on a request for feature list of
a future release of hobbit.
The hobbit client installation could configure sudo to allow it to run
commands as other users (on the admin's acceptance during the
installation, of course).
The ability of the hobbit server to send a series of actions to the
hobbit client for execution via the hobbit communication channel. Sounds
like something that could have lots of uses if done well...
I think this would be the perfect scenario. Something added to the
hobbit client, that would go in 'localclient.cfg'. A simple 'SCRIPT'
line that could be nested under a test, that would pass along whatever
meaningful variables that could be useful, such as PID.
The hobbit server *already uses* 'setuid root' for some binaries, such
as 'hobbitping'. The client would simply need to call some binary whose
sole purpose is to launch scripts as root so essentially anything would
be possible.
I can't think of any reason that some hook would have to exist for the
script to tell the hobbit client anything back; I think it can just wait
for the next poll period to see if it went back to green.
--Chris