[Xymon] logfetch issue - sending old data causing false alerts
Jeremy Laidman
jlaidman at rebel-it.com.au
Wed Jun 21 23:05:15 CEST 2017
How is the log file being created and written to? When and how is it
rotated? The logfetch output indicates that that file shrunk in size twice;
is that possible?
J
On 22 Jun. 2017 05:20, "Max Xu" <Max.Xu at servicenow.com> wrote:
> Hi,
>
> Logfetch is sending old data causing false alerts.
>
> The log file looks somewhat like this:
> Error 2017-06-14 11:36:58.613343 39915 2184308576 Compare server: ……
> Error 2017-06-14 11:36:58.613481 39913 1581872992 Command server: ……
> …… (note. The above repeat about 780K times)
> Info 2017-06-14 13:07:41.113163 1193 1036199776 Compare server exited
> normally, pid = 45494 [sp_desvr]
> …..
> Error 2017-06-15 02:42:22.820068 1761 2399766368 Command server:…..
> ……
>
>
> At 6/19 and 6/20, msgs alert generated with all the old data of 6/14 and
> 6/15 etc. below is sniper of alert on 6/19
> Mon Jun 19 17:48:57 PDT 2017 - Log files NOT ok
>
> [image: red] Critical entries in /u01/shareplex/var/log/event_log <https://monitor01.lhr9.service-now.com/xymon-cgi/svcstatus.sh?CLIENT=ora164106.sjc4.service-now.com&SECTION=msgs:/u01/shareplex/var/log/event_log>[image: red] Error 2017-06-14 12:07:24.545252 9795 1581102944 Command server: ReconcileLog: failed to construct object-cache: Illegal state: Item 372354 already in the object id registry (connecting from ora164106.sjc4.service-now.com) [module osp][image: red] Error 2017-06-14 12:07:24.545499 9795 1581102944 Command server: ReconcileLog: failed to construct object-cache: Illegal state: Item 372356 already in the object id registry (connecting from ora164106.sjc4.service-now.com) [module osp]
>
>
>
> Meantime, see xymonclient.log:
> 2017-06-19 17:49:01.428381 logfetch: File /u01/shareplex/var/log/event_log
> shrank from >=173538314 to 48414720 bytes in size. Probably rotated;
> clearing position state
> 2017-06-19 17:49:01.428462 logfetch: /u01/shareplex/var/log/event_log
> delta 48414720 bytes exceeds max buffer size 10485760; skipping some data
> 2017-06-19 17:51:05.086815 logfetch: /u01/shareplex/var/log/event_log
> delta 173538314 bytes exceeds max buffer size 10485760; skipping some data
> 2017-06-19 17:53:09.134469 <13%2044%2069> logfetch:
> /u01/shareplex/var/log/event_log delta 173538314 bytes exceeds max buffer
> size 10485760; skipping some data
> 2017-06-19 17:55:12.647682 logfetch: /u01/shareplex/var/log/event_log
> delta 173538314 bytes exceeds max buffer size 10485760; skipping some data
> 2017-06-19 17:57:16.163913 logfetch: /u01/shareplex/var/log/event_log
> delta 173538314 bytes exceeds max buffer size 10485760; skipping some data
> 2017-06-19 17:59:19.662801 logfetch: /u01/shareplex/var/log/event_log
> delta 173538314 bytes exceeds max buffer size 10485760; skipping some data
> 2017-06-19 18:01:23.180499 logfetch: /u01/shareplex/var/log/event_log
> delta 173538453 bytes exceeds max buffer size 10485760; skipping some data
> 2017-06-19 18:03:26.777636 logfetch: /u01/shareplex/var/log/event_log
> delta 125123733 bytes exceeds max buffer size 10485760; skipping some data
> 2017-06-20 06:42:01.519481 logfetch: File /u01/shareplex/var/log/event_log
> shrank from >=173541482 to 74420224 bytes in size. Probably rotated;
> clearing position state
> 2017-06-20 06:42:01.519557 logfetch: /u01/shareplex/var/log/event_log
> delta 74420224 bytes exceeds max buffer size 10485760; skipping some data
> 2017-06-20 06:44:05.173606 logfetch: /u01/shareplex/var/log/event_log
> delta 173541633 bytes exceeds max buffer size 10485760; skipping some data
> 2017-06-20 06:46:08.670466 logfetch: /u01/shareplex/var/log/event_log
> delta 173541633 bytes exceeds max buffer size 10485760; skipping some data
> 2017-06-20 06:48:12.188216 logfetch: /u01/shareplex/var/log/event_log
> delta 173541633 bytes exceeds max buffer size 10485760; skipping some data
> 2017-06-20 06:50:15.683455 logfetch: /u01/shareplex/var/log/event_log
> delta 173541633 bytes exceeds max buffer size 10485760; skipping some data
> 2017-06-20 06:52:19.250727 logfetch: /u01/shareplex/var/log/event_log
> delta 173541633 bytes exceeds max buffer size 10485760; skipping some data
> 2017-06-20 06:54:22.752463 logfetch: /u01/shareplex/var/log/event_log
> delta 173541633 bytes exceeds max buffer size 10485760; skipping some data
> 2017-06-20 06:56:23.426678 logfetch: /u01/shareplex/var/log/event_log
> delta 99121409 bytes exceeds max buffer size 10485760; skipping some data
>
> Noted.
>
> 1. The above 2m interval is my setup of xymon client.
> 2. It seems the logfetch status file is not successfully saved and
> source code shows no error check (so no direct evidence).
> 3. The behavior only last under 20 min. The server itself did not have
> disk and cpu alerts and no one report any issues related to disk and io.
>
>
> I was told that this behavior is not new although rarely happen. Is there
> any solution or work round?
>
> My running version is:
> Xymon version 4.3.25-1.el6.terabithia
>
>
> Thanks,
> -max
>
>
> _______________________________________________
> Xymon mailing list
> Xymon at xymon.com
> http://lists.xymon.com/mailman/listinfo/xymon
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.xymon.com/pipermail/xymon/attachments/20170622/e4c1c989/attachment.html>
More information about the Xymon
mailing list