[Xymon] xymongen crashes in 4.3.29
Japheth Cleaver
cleaver at terabithia.org
Thu Aug 22 23:11:59 CEST 2019
Hi,
I think this might be xymongen in report mode from the "dailyreport"
file in /tasks.d/; the timing would check out. I believe the problem
here is one of the Terabithia patches now doing the wrong thing after
some of the string-handling changes in 4.3.29 -- causing core dumps in
certain situations.
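As a rough sanity check on the timing (a sketch, not part of the original report): the core dump quoted below was generated with --reportopts=1566187200:1566273599:0:nongr, and the backtrace shows the same two numbers as starttime and totime. Assuming they are Unix epoch seconds, the following Python decodes them. decode_reportopts is a hypothetical helper name, not a Xymon function, and the meaning of the remaining fields is not interpreted here.

```python
from datetime import datetime, timezone


def decode_reportopts(value):
    """Decode the first two colon-separated fields as epoch seconds (assumed)."""
    start_s, end_s = value.split(":")[:2]
    start = datetime.fromtimestamp(int(start_s), tz=timezone.utc)
    end = datetime.fromtimestamp(int(end_s), tz=timezone.utc)
    return start, end


# Value copied from the "Core was generated by" line in the quoted gdb output.
start, end = decode_reportopts("1566187200:1566273599:0:nongr")

# The end bound is inclusive, so add one second to get the window length.
span_seconds = int((end - start).total_seconds()) + 1

print("start (UTC):", start.isoformat())
print("end   (UTC):", end.isoformat())
print("window length in seconds:", span_seconds)
```

The window comes out exactly one day long and starts at 04:00 UTC on 2019-08-19, which would be local midnight on a server at UTC-4 (the server's timezone is an assumption). That is what one would expect from a daily report covering the previous day.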
If you're running actual RHEL7 on this (not CentOS, which hasn't
released 7.7 yet), would you mind trying the xymon-4.3.30-0.5 package
from the EL7 Terabithia testing repo to see whether it helps?
https://repo.terabithia.org/rpms/xymon/testing/el7/x86_64/
Regards,
-jc
On 8/22/2019 11:34 AM, Matt Vander Werf wrote:
> Hi Torsten,
>
> No, there wasn't anything running from cron or anything else around
> that time, let alone anything that restarts the network or Xymon.
>
> Thanks.
>
> --
> Matt Vander Werf
>
>
> On Wed, Aug 21, 2019 at 5:43 AM Torsten Richter <bb4 at richter-it.net> wrote:
>
> Hi Matt,
>
> dumb question: is there any cron job running at this time that is
> restarting Xymon or fiddling with the network, e.g. restarting the
> network for some reason?
>
> Regards,
> Torsten
>
>> Matt Vander Werf <matt1299 at gmail.com>
>> wrote on 20 August 2019 at 17:10:
>>
>> Hi all,
>>
>> Every day since we updated our Xymon server to 4.3.29 (from
>> 4.3.28), I've gotten an e-mail alert due to xymond turning red
>> that reads:
>>
>> red xymongen program crashed
>>
>> Fatal signal caught!
>>
>> The strange thing is that this has happened at 1:04 AM every
>> day...like clockwork. I have xymongen set to run every 1 minute
>> and it has no problems running any other time of the day. We are
>> using the Terabithia RPMs and the Xymon server is running RHEL 7.
>>
>> I've scoured the system to find anything that is set to run
>> at/around that time via cron, etc. and haven't found anything.
>> The system logs don't show anything is happening around that time
>> either.
>>
>> I turned on debug logging for xymond and xymongen and haven't
>> been able to find anything unusual in either log around that
>> time. But it is dumping core files for xymongen every time it
>> crashes.
>>
>> I used gdb to get the backtrace from all of the core files (so far)
>> and they all show the same thing, including the same host (although
>> I'm fairly confident the problem isn't specific to that host; it is
>> probably just the first host xymongen has trouble with when
>> processing).
>>
>> I've included an example gdb output below (the most recent one) [1].
>>
>> Is anyone else running into this by chance? Or any idea what
>> might be the cause?
>>
>> Thanks!
>>
>>
>> [1]
>> # gdb -q /usr/libexec/xymon/xymongen core.16327
>> Reading symbols from /usr/libexec/xymon/xymongen...Reading
>> symbols from /usr/lib/debug/usr/libexec/xymon/xymongen.debug...done.
>> done.
>> [New LWP 16327]
>> [Thread debugging using libthread_db enabled]
>> Using host libthread_db library "/lib64/libthread_db.so.1".
>> Core was generated by `/usr/libexec/xymon/xymongen
>> --reportopts=1566187200:1566273599:0:nongr --recent'.
>> Program terminated with signal 6, Aborted.
>> #0 0x00007f4657c49377 in __GI_raise (sig=sig@entry=6) at
>> ../nptl/sysdeps/unix/sysv/linux/raise.c:55
>> 55 return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
>> (gdb) bt
>> #0 0x00007f4657c49377 in __GI_raise (sig=sig@entry=6) at
>> ../nptl/sysdeps/unix/sysv/linux/raise.c:55
>> #1 0x00007f4657c4aa68 in __GI_abort () at abort.c:90
>> #2 0x00005589375dd455 in sigsegv_handler (signum=<optimized
>> out>) at sig.c:57
>> #3 <signal handler called>
>> #4 strchrnul () at ../sysdeps/x86_64/strchrnul.S:33
>> #5 0x00007f4657c5b681 in __find_specmb (format=0xfce <Address
>> 0xfce out of bounds>) at printf-parse.h:109
>> #6 _IO_vfprintf_internal (s=s@entry=0x7ffd5dabcc00,
>> format=format@entry=0xfce <Address 0xfce out of bounds>,
>> ap=ap@entry=0x7ffd5dabcd38) at vfprintf.c:1308
>> #7 0x00007f4657d28c78 in ___vsprintf_chk (s=0x7ffd5dabcf82 "",
>> flags=1, slen=18446744073709551615,
>> format=0xfce <Address 0xfce out of bounds>,
>> args=args@entry=0x7ffd5dabcd38) at vsprintf_chk.c:83
>> #8 0x00007f4657d28bcd in ___sprintf_chk (s=<optimized out>,
>> flags=flags@entry=1,
>> slen=slen@entry=18446744073709551615, format=<optimized out>)
>> at sprintf_chk.c:32
>> #9 0x00005589375ce8ca in sprintf (__fmt=<optimized out>,
>> __s=<optimized out>)
>> at /usr/include/bits/stdio2.h:33
>> #10 parse_histlogfile (starttime=1566187200,
>> timespec=0x558937840f50 <timespec.7157>
>> "Wed_Sep_2_19:34:55_2015", servicename=0x5589383b6d70 "procs",
>> hostname=0x558938a335d0 "<client hostname>") at
>> availability.c:174
>> #11 parse_historyfile (fd=fd@entry=0x558938a3aea0,
>> repinfo=<optimized out>,
>> hostname=0x558938a335d0 "<client hostname>",
>> servicename=0x5589383b6d70 "procs",
>> fromtime=<optimized out>, totime=1566273599,
>> for_history=for_history@entry=0, warnlevel=97,
>> greenlevel=99.995000000000005, warnstops=-1, reporttime=0x0)
>> at availability.c:475
>> #12 0x00005589375c38cc in init_state (filename=<optimized out>,
>> filename@entry=0x7ffd5dacf210 "<client hostname>.procs",
>> log=log@entry=0x7ffd5dacf120)
>> at loaddata.c:275
>> #13 0x00005589375c45ee in load_state
>> (sumhead=sumhead@entry=0x558937809d48 <dispsums>) at loaddata.c:626
>> #14 0x00005589375be6f4 in main (argc=5, argv=0x7ffd5dad4418) at
>> xymongen.c:599
>>
>>
>> --
>> Matt Vander Werf
>> _______________________________________________
>> Xymon mailing list
>> Xymon at xymon.com
>> http://lists.xymon.com/mailman/listinfo/xymon