
Re: [hobbit] Logfile monitoring - I'd like some comments



Hi Henrik,

So on the central server there will be two configuration files: one for log retrieval, defining the interesting items (I guess these are what today are the yellow and red strings), and then the hobbit-clients configuration file, where you define the strings again? I am not clear on why you would want two separate files containing some of the same information. I get the idea of a log retrieval configuration file with the logfile names (both OS-dependent and local files), but why not have it all in one file? Then you could have a LOGFILE-CLASS definition to put on a host group or a single host in the hobbit-clients file.

Just a thought.

Will this new logfile retrieval also be able to look for logfiles with variable file names, e.g. logfile.txt-20060215 today and then a new file, logfile.txt-20060216, tomorrow? I know it's stupid, but that's how the vendor creates them.

Best regards,
Thomas

Henrik Stoerner wrote:
A few days ago, I mentioned that I would like to do logfile
monitoring for the next Hobbit release.

I've worked a bit on this and have a prototype solution for
it, which you can test with the current snapshots. I'd like
some comments on how it works to make sure I haven't overlooked
something before committing myself.

There are several objectives:
- As far as is possible, logfile monitoring must be configured
  centrally, on the Hobbit server. Having to go to each server
  to (re)configure what logfiles to check and what to look for
  simply doesn't work.
- The amount of data sent from each client to Hobbit should be
  small, but it must catch the "important" stuff.
- You rarely know in advance what will be in the logs when you
  need them the most. So the monitor should give you as much
  of the log entries as possible, not just those lines that
  match some pre-defined strings or regex'es.
- Some systems log messages on multiple lines. The system must
  be able to show all parts of a log entry.
- Logfile entries must appear on the monitor for some time after
  they show up in the logs, but should also disappear after a
  while.

In other words: The ideal solution would let you have the entire
logfile available on the Hobbit server - but that obviously won't
work. So the client should - after weeding out the really irrelevant
stuff - send us as much of each logfile as possible.


My proposed solution is this:
- On the Hobbit server, there's a log-monitoring configuration
file for the Hobbit clients. This defines which logfiles are
monitored for a single client installation, or you can define
it for a group of clients. (The idea is to define at least one
group for each operating system, since the standard system logs
are OS dependent.) This configuration lists the
log filename, the maximum amount of data to send from this
logfile, a regex "noise" filter (i.e. lines that are stripped
from the logfile), and *optionally* a regex identifying really
interesting stuff in the logfile that should always be reported.
- When a client connects to the Hobbit server and sends the
normal client message, the Hobbit server will respond with
the logfile configuration for this client. So the client
has a copy of the central configuration file, but only the
part that it needs for itself. The reason for sending this
as a response to the client message is to avoid an extra
round-trip from client to server; piggy-backing the config
push on the client message means that it is almost without
any performance cost on the server side.
- When the client runs, it uses the local copy of the configuration
file to determine what logs to look at. For each logfile, it
maintains a "where-was-I-the-last-time" status, so it only
looks at the entries made to the logfile during the past 30
minutes. First, the client strips off any "noise" messages.
Then, if all of the entries fit into the maximum size that
can be reported, it sends all of the log to the Hobbit server.
If there is more than will fit, it first checks to see if the
regex defining the really interesting stuff is present in the
log - if it is, then it drops anything before the interesting
text. If there is still more than will fit, it keeps the
interesting text + a few lines after that (to allow for
multi-line log-entries which some OS'es have), and then
sends that together with as much of the last part of the log as
will fit inside the max. message size.
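
The client-side trimming steps above can be sketched roughly as
follows. This is only an illustration, not the logfetch code: the
function name trim_log, its parameters, and the default of 3 context
lines are made up for the example, Python's "re" module stands in
for the standard regex routines, and sizes are counted in characters
rather than bytes.

```python
import re


def trim_log(lines, max_size, noise=None, trigger=None, context=3):
    """Pick the log lines to report (hypothetical sketch, not logfetch)."""

    def size(chunk):
        # Each line costs its length plus one for the newline.
        return sum(len(line) + 1 for line in chunk)

    # Step 1: strip the "noise" lines.
    if noise:
        noise_re = re.compile(noise)
        lines = [line for line in lines if not noise_re.search(line)]

    # Step 2: if everything fits, report it all.
    if size(lines) <= max_size:
        return lines

    head = []
    if trigger:
        trigger_re = re.compile(trigger)
        first = next(
            (i for i, line in enumerate(lines) if trigger_re.search(line)),
            None,
        )
        if first is not None:
            # Step 3: drop anything before the interesting text.
            lines = lines[first:]
            if size(lines) <= max_size:
                return lines
            # Step 4: keep the interesting line plus a few lines after it
            # (assumption: a head larger than max_size is kept as-is).
            head = lines[: 1 + context]
            lines = lines[1 + context:]

    # Fill what is left of the budget from the end of the log.
    budget = max_size - size(head)
    tail = []
    for line in reversed(lines):
        cost = len(line) + 1
        if cost > budget:
            break
        tail.append(line)
        budget -= cost
    tail.reverse()
    return head + tail
```

With ten 9-character lines, a trigger on the third line, one context
line and a limit of 50 characters, this keeps the trigger line, the
line after it, and the last three lines of the log.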


This part has been implemented in the Hobbit daemon (hobbitd),
and in the clients via a new "logfetch" utility. This utility
uses standard regular expressions - not the Perl-compatible
ones, because that would require you to install the PCRE
library on all of your clients. The standard regex routines
are included in all (I think) system libraries used today.

The last part is what happens when the log data arrives on the
Hobbit server. Currently, there's a simple processing of this
data to just dump it into an always-green "msgs" column. What
should happen once I get it coded is:
- Data from each logfile is matched against a set of strings
(regex'es) defined in the hobbit-clients.cfg file. Each string
determines the color (red, yellow, green) and sets the color
of the msgs column accordingly.
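
A rough sketch of how that color decision could work is shown here.
It is not the hobbitd code, which is not written yet: the name
msgs_color and the (regex, color) rule format are invented for the
example, and letting the most severe matching color win is an
assumption.

```python
import re

# Assumed ordering: red outranks yellow, yellow outranks green.
SEVERITY = {"green": 0, "yellow": 1, "red": 2}


def msgs_color(log_text, rules):
    """Return the most severe color whose regex matches any log line.

    rules is a list of (regex, color) pairs (a hypothetical format,
    not the hobbit-clients.cfg syntax).
    """
    color = "green"
    for line in log_text.splitlines():
        for pattern, rule_color in rules:
            if re.search(pattern, line) and SEVERITY[rule_color] > SEVERITY[color]:
                color = rule_color
    return color


rules = [(r"WARNING", "yellow"), (r"panic", "red")]
print(msgs_color("all fine\nWARNING: disk nearly full", rules))
```

A log with no matching lines stays green, so the column only changes
color when one of the configured strings shows up.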


When the color has been decided, all of the normal alerting
happens automatically. I do plan on making a more fine-grained
alert mechanism (for the msgs, procs and disk statuses) so you
can direct alerts to different groups depending on exactly
which log-message triggered the alert, but that will not be
part of this release.



So - how does that sound ? Anything I've missed ?


Regards, Henrik

