[Xymon] xymongen crashes in 4.3.29

Matt Vander Werf matt1299 at gmail.com
Tue Aug 20 17:10:53 CEST 2019


Hi all,

Every day since we updated our Xymon server to 4.3.29 (from 4.3.28), I've
gotten an e-mail alert due to xymond turning red that reads:

red xymongen program crashed

Fatal signal caught!

The strange thing is that this has happened at 1:04 AM every day, like
clockwork. I have xymongen set to run every minute, and it has no problems
at any other time of day. We are using the Terabithia RPMs, and the Xymon
server is running RHEL 7.

I've scoured the system to find anything that is set to run at/around that
time via cron, etc. and haven't found anything. The system logs don't show
anything is happening around that time either.

I turned on debug logging for xymond and xymongen and haven't been able to
find anything unusual in either log around that time. But xymongen is
dumping a core file every time it crashes.

I used gdb to get the backtrace from all of the core files (so far), and
they all show the same thing. The same host appears in every backtrace too,
although I'm fairly confident the problem isn't specific to that host; it is
probably just the first host xymongen runs into trouble with when
processing.
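
For anyone wanting to repeat the comparison across several core files, here
is a minimal sketch in Python that runs gdb in batch mode on each core and
prints its backtrace. The binary path is the one from the gdb session in
this message; the "core.*" glob in the current directory and the helper
names are assumptions for illustration, not something Xymon provides.

```python
import glob
import subprocess

# Binary path as used in the gdb session in this message.
XYMONGEN = "/usr/libexec/xymon/xymongen"


def gdb_bt_command(binary, core):
    """Build a gdb command line that prints one backtrace and exits."""
    return ["gdb", "-q", "-batch", "-ex", "bt", binary, core]


def backtrace(binary, core):
    """Run gdb on a single core file and return whatever it printed."""
    result = subprocess.run(
        gdb_bt_command(binary, core),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


if __name__ == "__main__":
    # Assumption: the core files sit in the current directory as core.<pid>.
    for core in sorted(glob.glob("core.*")):
        print(f"== {core} ==")
        print(backtrace(XYMONGEN, core))
```

Comparing the printed traces by eye (or with diff) is enough to confirm
whether every crash goes through the same frames.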

I've included an example gdb output below (the most recent one) [1].

Is anyone else running into this by chance? Or any idea what might be the
cause?

Thanks!


[1]
# gdb -q /usr/libexec/xymon/xymongen core.16327
Reading symbols from /usr/libexec/xymon/xymongen...Reading symbols from /usr/lib/debug/usr/libexec/xymon/xymongen.debug...done.
done.
[New LWP 16327]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/libexec/xymon/xymongen --reportopts=1566187200:1566273599:0:nongr --recent'.
Program terminated with signal 6, Aborted.
#0  0x00007f4657c49377 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:55
55  return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb) bt
#0  0x00007f4657c49377 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:55
#1  0x00007f4657c4aa68 in __GI_abort () at abort.c:90
#2  0x00005589375dd455 in sigsegv_handler (signum=<optimized out>) at sig.c:57
#3  <signal handler called>
#4  strchrnul () at ../sysdeps/x86_64/strchrnul.S:33
#5  0x00007f4657c5b681 in __find_specmb (format=0xfce <Address 0xfce out of bounds>) at printf-parse.h:109
#6  _IO_vfprintf_internal (s=s@entry=0x7ffd5dabcc00, format=format@entry=0xfce <Address 0xfce out of bounds>, ap=ap@entry=0x7ffd5dabcd38) at vfprintf.c:1308
#7  0x00007f4657d28c78 in ___vsprintf_chk (s=0x7ffd5dabcf82 "", flags=1, slen=18446744073709551615, format=0xfce <Address 0xfce out of bounds>, args=args@entry=0x7ffd5dabcd38) at vsprintf_chk.c:83
#8  0x00007f4657d28bcd in ___sprintf_chk (s=<optimized out>, flags=flags@entry=1, slen=slen@entry=18446744073709551615, format=<optimized out>) at sprintf_chk.c:32
#9  0x00005589375ce8ca in sprintf (__fmt=<optimized out>, __s=<optimized out>) at /usr/include/bits/stdio2.h:33
#10 parse_histlogfile (starttime=1566187200, timespec=0x558937840f50 <timespec.7157> "Wed_Sep_2_19:34:55_2015", servicename=0x5589383b6d70 "procs", hostname=0x558938a335d0 "<client hostname>") at availability.c:174
#11 parse_historyfile (fd=fd@entry=0x558938a3aea0, repinfo=<optimized out>, hostname=0x558938a335d0 "<client hostname>", servicename=0x5589383b6d70 "procs", fromtime=<optimized out>, totime=1566273599, for_history=for_history@entry=0, warnlevel=97, greenlevel=99.995000000000005, warnstops=-1, reporttime=0x0) at availability.c:475
#12 0x00005589375c38cc in init_state (filename=<optimized out>, filename@entry=0x7ffd5dacf210 "<client hostname>.procs", log=log@entry=0x7ffd5dacf120) at loaddata.c:275
#13 0x00005589375c45ee in load_state (sumhead=sumhead@entry=0x558937809d48 <dispsums>) at loaddata.c:626
#14 0x00005589375be6f4 in main (argc=5, argv=0x7ffd5dad4418) at xymongen.c:599
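
A side note on the numbers in the trace: the two leading values in
--reportopts look like Unix epoch seconds, and the timespec string in frame
#10 looks like a ctime-style timestamp with underscores in place of spaces.
The Python sketch below decodes both under those assumptions. The helper
names are made up for illustration, the epoch values are shown in UTC
because the server's timezone isn't stated here, and the strptime pattern
assumes English day and month names.

```python
from datetime import datetime, timezone

REPORT_START = 1566187200  # first value in --reportopts
REPORT_END = 1566273599    # second value in --reportopts
HISTLOG_TIMESPEC = "Wed_Sep_2_19:34:55_2015"  # timespec argument in frame #10


def epoch_to_utc(seconds):
    """Interpret a value as Unix epoch seconds and return a UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def parse_timespec(text):
    """Parse an underscore-separated ctime-style string (naive datetime)."""
    return datetime.strptime(text, "%a_%b_%d_%H:%M:%S_%Y")


if __name__ == "__main__":
    print("report start (UTC):", epoch_to_utc(REPORT_START).isoformat())
    print("report end   (UTC):", epoch_to_utc(REPORT_END).isoformat())
    print("window in seconds :", REPORT_END - REPORT_START)
    print("history timespec  :", parse_timespec(HISTLOG_TIMESPEC).isoformat())
```

Read this way, the report window runs from 2019-08-19 04:00:00 UTC to
2019-08-20 03:59:59 UTC, i.e. one day minus one second, while the history
entry being parsed is dated September 2015. That window would be midnight
to midnight in a zone four hours behind UTC, but that is only a guess.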


-- 
Matt Vander Werf