[Xymon] alerting on log messages

Colin Coe colin.coe at gmail.com
Wed Sep 18 10:11:52 CEST 2019


Hi Jeremy

Many thanks for this, working perfectly.

Thanks again

On Wed, Sep 18, 2019 at 2:58 PM Jeremy Laidman <jeremy at laidman.org> wrote:

> Ah, I worked it out. In client-local.cfg, the patterns after "ignore" and
> "trigger" are defined as "regular expression". This is unlike (say)
> analysis.cfg, where patterns for the HOST and LOG keywords are defined as
> "string or regular expression" and the "%" signifies a regular expression.
> Also, in analysis.cfg, strings with spaces must be enclosed in quotes,
> whereas in client-local.cfg, everything after the keyword (eg ignore,
> trigger) is treated as the regular expression. So in client-local.cfg you
> must not include a % or quotes. Instead you want something like:
>
> [host=test_server_41]
> log:/var/log/messages:1024000
> ignore (Failed to fetch|Failed to parse|Failed to evaluate)
> trigger (ORA-04091|Failed to log off resource|Failed to log on resource)
>
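To illustrate the point above about the % and quotes, here is a minimal sketch in Python. It assumes, as described, that everything after the keyword is taken verbatim as the regular expression, and it uses Python's re module only as a stand-in for the regex engine logfetch really uses. The names LOG_LINES, QUOTED, BARE and matching_lines are made up for this example.

```python
import re

# Two sample lines in the style of the test logfile used in this thread.
LOG_LINES = [
    "log line 1 ignore Failed to fetch bla bla",
    "log line 4 trigger ORA-04091 bla bla",
]

# Pattern as originally written (analysis.cfg style: quotes plus a leading %).
QUOTED = '"%(ORA-04091|Failed to log off resource|Failed to log on resource)"'

# Pattern as a bare regular expression, with no % and no quotes.
BARE = "(ORA-04091|Failed to log off resource|Failed to log on resource)"


def matching_lines(pattern, lines):
    """Return the lines that the pattern matches anywhere in the line."""
    rx = re.compile(pattern)
    return [line for line in lines if rx.search(line)]


# The quoted form only matches text that literally contains "% and a closing
# quote, so nothing in the log matches it.
print(matching_lines(QUOTED, LOG_LINES))

# The bare form matches the ORA-04091 line.
print(matching_lines(BARE, LOG_LINES))
```

Taken as a regex, the quote and % characters are ordinary literals, so the quoted trigger never fires on real log lines.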
> It can be very painful to troubleshoot problems with client-local.cfg
> configs, especially as it can take up to 10 minutes for updates to
> propagate to clients and generate new results. I like to create my own copy
> of the logfile and the client-local.cfg snippet, and manually run logfetch
> (which is what processes the client-local.cfg lines). For example, this is
> what I used to diagnose Colin's problem:
>
> xymon at server:/tmp/logfetch-test> ls -l
> total 8
> -rw-r--r-- 1 xymon xymon 160 2019-09-18 16:49 my-client-local.cfg
> -rw-r--r-- 1 xymon xymon 273 2019-09-18 16:34 my-logfile.log
>
> xymon at server:/tmp/logfetch-test> cat my-logfile.log
> log line 1 ignore Failed to fetch bla bla
> log line 2 ignore Failed to parse bla bla
> log line 3 ignore Failed to evaluate bla bla
> log line 4 trigger ORA-04091 bla bla
> log line 5 trigger Failed to log off resource bla bla
> log line 6 trigger Failed to log on resource bla bla
>
> xymon at server:/tmp/logfetch-test> cat my-client-local.cfg
> log:my-logfile.log:1024000
> ignore (Failed to fetch|Failed to parse|Failed to evaluate)
> trigger (ORA-04091|Failed to log off resource|Failed to log on resource)
>
> xymon at server:/tmp/logfetch-test> $XYMONCLIENTHOME/bin/logfetch my-client-local.cfg /dev/null
> [msgs:my-logfile.log]
> log line 4 trigger ORA-04091 bla bla
> log line 5 trigger Failed to log off resource bla bla
> log line 6 trigger Failed to log on resource bla bla
>
> [logfile:my-logfile.log]
> type:100000 (file)
> mode:644 (-rw-r--r--)
> linkcount:1
> owner:1984 (xymon)
> group:1984 (xymon)
> size:273
> clock:1568789652 (2019/09/18-16:54:12)
> atime:1568789652 (2019/09/18-16:54:12)
> ctime:1568788625 (2019/09/18-16:37:05)
> mtime:1568788446 (2019/09/18-16:34:06)
>
> Cheers
> Jeremy
>
>
> On Wed, 18 Sep 2019 at 16:18, Colin Coe <colin.coe at gmail.com> wrote:
>
>> Hi Jeremy
>>
>> What I'm finding is that I'm getting SMS alerts from Xymon about lines in
>> /var/log/messages such as "Failed to evaluate", "Failed to fetch", and
>> "Failed to calculate", which are normal for the application we're running.
>> I only want SMS alerts about the Oracle error and the failed to log on/off
>> resource messages.
>>
>> Thanks
>>
>> On Wed, Sep 18, 2019 at 1:37 PM Jeremy Laidman <jeremy at laidman.org>
>> wrote:
>>
>>> "After picking out the "trigger" lines, any remaining space up to the
>>> maximum size is filled in with the most recent entries from the logfile."
>>>
>>> So it will include what you request as well as whatever else that will
>>> fit.
>>>
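The sentence quoted above can be pictured with a simplified Python model. This is only a sketch of the described behaviour, not how logfetch is implemented: the function select_lines, its byte accounting, and the sample lines are all assumptions made for illustration.

```python
import re


def select_lines(lines, trigger_pattern, max_bytes):
    """Keep every trigger line, then fill leftover space with the newest other lines."""
    trigger = re.compile(trigger_pattern)

    # Step 1: trigger lines are always kept in this model.
    keep = {i for i, line in enumerate(lines) if trigger.search(line)}
    used = sum(len(lines[i]) + 1 for i in keep)  # +1 for the newline

    # Step 2: walk backwards from the newest line, adding non-trigger
    # lines for as long as they still fit in the size budget.
    for i in range(len(lines) - 1, -1, -1):
        if i in keep:
            continue
        cost = len(lines[i]) + 1
        if used + cost > max_bytes:
            break
        keep.add(i)
        used += cost

    # Report the kept lines in their original order.
    return [lines[i] for i in sorted(keep)]


lines = [
    "Failed to calculate bla bla",
    "ORA-04091 bla bla",
    "Failed to fetch bla bla",
]

# With a large maximum size, the non-trigger lines are included as well.
print(select_lines(lines, "ORA-04091", max_bytes=1024000))

# With a tiny maximum size, only the trigger line survives.
print(select_lines(lines, "ORA-04091", max_bytes=20))
```

Under this model a large size such as 1024000 leaves plenty of room, so ordinary lines end up in the log data next to the trigger lines.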
>>>
>>> On Wed, 18 Sep 2019 at 09:32, Colin Coe <colin.coe at gmail.com> wrote:
>>>
>>>> Hi all
>>>>
>>>> I have this entry in client-local.cfg
>>>> ---
>>>> [host=test_server_41]
>>>> log:/var/log/messages:1024000
>>>> ignore "%(Failed to fetch|Failed to parse|Failed to evaluate)"
>>>> trigger "%(ORA-04091|Failed to log off resource|Failed to log on resource)"
>>>> ---
>>>>
>>>> And in analysis.cfg I have:
>>>> ---
>>>> HOST= test_server_41
>>>>     LOG /var/log/messages %(ORA-04091|Failed to log off resource|Failed to log on resource) IGNORE=OCS color=red
>>>> ---
>>>>
>>>> Needless to say I'm getting alerts for more than just "ORA-04091",
>>>> "Failed to log off resource", and "Failed to log on resource".
>>>>
>>>> Any ideas what I'm doing wrong?
>>>>
>>>> Thanks
>>>>
>>>