
RE: [hobbit] Bug Report: Critical error in log couldn't be sent to server every time



Samuel,

Maybe the current release of Hobbit is not up to this task (maybe you
should ask for a refund :) )?  I think the Hobbit logfetch function is
aimed more at "convenience monitoring" than at real-time log
filtering. It is not hard to envision cases where processing log files
in "30 minute chunks" might have scalability problems.

If these messages are VERY important, you might search the Web for a
tool that will scan a log file watching for these messages, and then
write them to another log, and then have the Hobbit agent watch the log
you create that only has "interesting" messages in it.
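
As a rough illustration of that kind of filter, here is a minimal Python
sketch. It is not part of Hobbit; the function name, the pattern, and
the file paths are all made up for the example. It copies matching lines
from one log to a second, smaller log and remembers how far it has read,
so it could be run from cron or in a loop:

```python
import re
from pathlib import Path

# Hypothetical pattern; adjust it to the messages that matter to you.
PATTERN = re.compile(r"OutOfMemoryError|StackOverflowError")


def copy_matching_lines(source, dest, offset=0, pattern=PATTERN):
    """Append lines of `source` found at or after byte `offset` that
    match `pattern` to `dest`. Returns the offset for the next call."""
    source = Path(source)
    if source.stat().st_size < offset:
        offset = 0  # the log was truncated or rotated; start over
    with source.open("rb") as src, open(dest, "ab") as out:
        src.seek(offset)
        for raw in src:
            if not raw.endswith(b"\n"):
                break  # incomplete last line; pick it up next time
            if pattern.search(raw.decode("utf-8", errors="replace")):
                out.write(raw)
            offset += len(raw)
    return offset
```

Calling copy_matching_lines("/home/mine/server.log",
"/home/mine/interesting.log", offset) repeatedly, and storing the
returned offset between calls, would keep the second log small enough
for the Hobbit client to pick up in full. Storing the offset across
restarts is left out of the sketch.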

GLH

-----Original Message-----
From: Samuel Cai [mailto:Samuel.Cai (at) ehealth-china.com] 
Sent: Wednesday, July 23, 2008 5:22 AM
To: hobbit (at) hswn.dk
Subject: [hobbit] Bug Report: Critical error in log couldn't be sent to
server every time

Hi,

My company is using Hobbit (4.2) to monitor for OutOfMemoryError and
StackOverflowError in an application log, but we found that the Hobbit
client sometimes does not send the data containing these error strings
to the server, so no error is reported.
Below is our configuration snippet from client-local.cfg. As you can
see, although we set the maximum amount of data to 10240 bytes, we also
set a trigger on the keyword Error, so even if there is more data in the
log than the maximum size, the lines matching the trigger should be sent
to the server in any case:
[our server]
log:/home/mine/server.log:10240
trigger Error

So I can think of two possible reasons:
1. The regular expression for the trigger is wrong.
2. There's a bug/limitation in the logfetch tool: it can only process a
limited amount of data. For example, if the application happened to
write 100M of data to the log file in 5 mins, the tool would only
process, say, the last 10M.

I ran some tests to find the root cause. Each test consists of two steps:
1. Clear the log, and wait until the client has sent out its data.
2. Write some data into the log; the first line is "OutOfMemoryError
StackOverflowError", and the rest is just garbage data.

Here are the results. I list the lines (L) and bytes (C) of the log
after filling in the data:
1. 485L, 54545C, error caught
2. 1445L, 163025C, error not caught
3. 707L, 53771C, error not caught
4. 468L, 36451C, error not caught
5. 226L, 18615C, error caught
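
For anyone who wants to repeat this kind of test, a small helper along
the following lines could generate the log contents. This is only a
sketch: the function name and the filler format are made up here, and it
does not reproduce the exact byte counts listed above.

```python
def write_test_log(path, total_lines, filler_width=100):
    """Write a test log whose first line holds the error strings and
    whose remaining lines are filler. Returns (lines, bytes) written."""
    if total_lines < 1:
        raise ValueError("total_lines must be at least 1")
    first = "OutOfMemoryError StackOverflowError\n"
    filler = "x" * filler_width + "\n"  # garbage data, ASCII only
    data = first + filler * (total_lines - 1)
    with open(path, "w", encoding="ascii", newline="\n") as f:
        f.write(data)
    return total_lines, len(data)
```

Varying total_lines and filler_width separately would help tell whether
the missed errors depend on the number of lines or on the number of
bytes.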

The tests show that the trigger pattern is correct, and that the
logfetch tool has a problem processing all new data when it is large
(in lines or in bytes, I don't know).

We need a fix or a workaround: these errors are so important that we
cannot afford to miss them.

Thanks,
Samuel Cai
