[Xymon] xymongen crashes in 4.3.29

Japheth Cleaver cleaver at terabithia.org
Thu Aug 22 23:11:59 CEST 2019


Hi,

I think this might be xymongen running in report mode, started from the
"dailyreport" file in /tasks.d/; the timing would check out.  I believe
the problem is that one of the Terabithia patches now does the wrong
thing after some of the string-handling changes in 4.3.29, causing core
dumps in certain situations.
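
As a rough check on the timing, the --reportopts value shown in the
core file's command line (quoted below) can be decoded. This is only a
sketch: it assumes the first two fields are the report start and end in
epoch seconds, which matches the starttime and totime values in the
backtrace. The server's timezone is not stated in this thread, so the
times are printed in UTC.

```python
from datetime import datetime, timezone

# Value copied from the "Core was generated by" line in the quoted gdb output.
reportopts = "1566187200:1566273599:0:nongr"

# Assumption: the first two fields are start and end times in epoch seconds.
start_s, end_s = (int(field) for field in reportopts.split(":")[:2])

start = datetime.fromtimestamp(start_s, tz=timezone.utc)
end = datetime.fromtimestamp(end_s, tz=timezone.utc)

print(start.isoformat())    # 2019-08-19T04:00:00+00:00
print(end.isoformat())      # 2019-08-20T03:59:59+00:00
print(end_s - start_s + 1)  # 86400 seconds, i.e. exactly one day
```

The window is exactly one day long. If the server runs on US Eastern
time (an assumption), 04:00 UTC is local midnight, so this would be a
report covering the previous full day, generated shortly after 1 AM.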

If you're running actual RHEL7 on this (not CentOS, which hasn't 
released 7.7 yet), would you mind trying the xymon-4.3.30-0.5 package 
from the EL7 Terabithia testing repo to see whether it helps?
https://repo.terabithia.org/rpms/xymon/testing/el7/x86_64/

Regards,
-jc

On 8/22/2019 11:34 AM, Matt Vander Werf wrote:
> Hi Torsten,
>
> No, there wasn't anything running from cron or anything else around 
> that time, let alone anything that restarts the network or Xymon.
>
> Thanks.
>
> --
> Matt Vander Werf
>
>
> On Wed, Aug 21, 2019 at 5:43 AM Torsten Richter <bb4 at richter-it.net>
> wrote:
>
>     Hi Matt,
>
>     dumb question: is there any cron job running at this time that is
>     restarting Xymon or fiddling with the network, like restarting the
>     network for some reason?
>
>     Regards,
>     Torsten
>
>>     Matt Vander Werf <matt1299 at gmail.com>
>>     wrote on 20 August 2019 at 17:10:
>>
>>     Hi all,
>>
>>     Every day since we updated our Xymon server to 4.3.29 (from
>>     4.3.28), I've gotten an e-mail alert due to xymond turning red
>>     that reads:
>>
>>     red xymongen program crashed
>>
>>     Fatal signal caught!
>>
>>     The strange thing is that this has happened at 1:04 AM every
>>     day...like clockwork. I have xymongen set to run every 1 minute
>>     and it has no problems running any other time of the day. We are
>>     using the Terabithia RPMs and the Xymon server is running RHEL 7.
>>
>>     I've scoured the system to find anything that is set to run
>>     at/around that time via cron, etc. and haven't found anything.
>>     The system logs don't show anything is happening around that time
>>     either.
>>
>>     I turned on debug logging for xymond and xymongen and haven't
>>     been able to find anything unusual in either log around that
>>     time. But it is dumping a core file for xymongen every time it
>>     crashes.
>>
>>     I used gdb to get the backtrace from all of the core files (so far)
>>     and I've found that they all show the same thing. They show the
>>     same host in the backtrace too (although I'm fairly confident it
>>     isn't specific or isolated to that host, but is just the first one
>>     it runs into that it has issues with when processing).
>>
>>     I've included an example gdb output below (the most recent one) [1].
>>
>>     Is anyone else running into this by chance? Or any idea what
>>     might be the cause?
>>
>>     Thanks!
>>
>>
>>     [1]
>>     # gdb -q /usr/libexec/xymon/xymongen core.16327
>>     Reading symbols from /usr/libexec/xymon/xymongen...Reading
>>     symbols from /usr/lib/debug/usr/libexec/xymon/xymongen.debug...done.
>>     done.
>>     [New LWP 16327]
>>     [Thread debugging using libthread_db enabled]
>>     Using host libthread_db library "/lib64/libthread_db.so.1".
>>     Core was generated by `/usr/libexec/xymon/xymongen
>>     --reportopts=1566187200:1566273599:0:nongr --recent'.
>>     Program terminated with signal 6, Aborted.
>>     #0  0x00007f4657c49377 in __GI_raise (sig=sig@entry=6) at
>>     ../nptl/sysdeps/unix/sysv/linux/raise.c:55
>>     55  return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
>>     (gdb) bt
>>     #0  0x00007f4657c49377 in __GI_raise (sig=sig@entry=6) at
>>     ../nptl/sysdeps/unix/sysv/linux/raise.c:55
>>     #1  0x00007f4657c4aa68 in __GI_abort () at abort.c:90
>>     #2  0x00005589375dd455 in sigsegv_handler (signum=<optimized
>>     out>) at sig.c:57
>>     #3  <signal handler called>
>>     #4  strchrnul () at ../sysdeps/x86_64/strchrnul.S:33
>>     #5  0x00007f4657c5b681 in __find_specmb (format=0xfce <Address
>>     0xfce out of bounds>) at printf-parse.h:109
>>     #6  _IO_vfprintf_internal (s=s@entry=0x7ffd5dabcc00,
>>         format=format@entry=0xfce <Address 0xfce out of bounds>,
>>     ap=ap@entry=0x7ffd5dabcd38) at vfprintf.c:1308
>>     #7  0x00007f4657d28c78 in ___vsprintf_chk (s=0x7ffd5dabcf82 "",
>>     flags=1, slen=18446744073709551615,
>>         format=0xfce <Address 0xfce out of bounds>,
>>     args=args@entry=0x7ffd5dabcd38) at vsprintf_chk.c:83
>>     #8  0x00007f4657d28bcd in ___sprintf_chk (s=<optimized out>,
>>     flags=flags@entry=1,
>>         slen=slen@entry=18446744073709551615, format=<optimized out>)
>>     at sprintf_chk.c:32
>>     #9  0x00005589375ce8ca in sprintf (__fmt=<optimized out>,
>>     __s=<optimized out>)
>>         at /usr/include/bits/stdio2.h:33
>>     #10 parse_histlogfile (starttime=1566187200,
>>         timespec=0x558937840f50 <timespec.7157>
>>     "Wed_Sep_2_19:34:55_2015", servicename=0x5589383b6d70 "procs",
>>         hostname=0x558938a335d0 "<client hostname>") at
>>     availability.c:174
>>     #11 parse_historyfile (fd=fd@entry=0x558938a3aea0,
>>     repinfo=<optimized out>,
>>         hostname=0x558938a335d0 "<client hostname>",
>>     servicename=0x5589383b6d70 "procs",
>>         fromtime=<optimized out>, totime=1566273599,
>>     for_history=for_history@entry=0, warnlevel=97,
>>         greenlevel=99.995000000000005, warnstops=-1, reporttime=0x0)
>>     at availability.c:475
>>     #12 0x00005589375c38cc in init_state (filename=<optimized out>,
>>         filename@entry=0x7ffd5dacf210 "<client hostname>.procs",
>>     log=log@entry=0x7ffd5dacf120)
>>         at loaddata.c:275
>>     #13 0x00005589375c45ee in load_state
>>     (sumhead=sumhead@entry=0x558937809d48 <dispsums>) at loaddata.c:626
>>     #14 0x00005589375be6f4 in main (argc=5, argv=0x7ffd5dad4418) at
>>     xymongen.c:599
>>
>>
>>     -- 
>>     Matt Vander Werf
>>     _______________________________________________
>>     Xymon mailing list
>>     Xymon at xymon.com
>>     http://lists.xymon.com/mailman/listinfo/xymon

