<div dir="ltr"><div dir="ltr"><div>Hi JC,</div><div><br></div><div>Awesome! Thanks for putting this together!<br></div><div><br></div><div>I installed 4.3.30-0.6.el7.terabithia yesterday evening and there was no xymongen crash this morning (no e-mail and no core dump)! I'll keep an eye out the next couple days, but looks like these changes have fixed the crashes we've been seeing!</div><div><br></div><div>Thanks for your help as always!<br></div><div><br></div><div><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div>--</div><div>Matt Vander Werf</div></div></div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Aug 27, 2019 at 4:15 PM Japheth Cleaver <<a href="mailto:cleaver@terabithia.org">cleaver@terabithia.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
  
    
  
  <div bgcolor="#FFFFFF">
    <div class="gmail-m_-8704481539804250623moz-cite-prefix">Alright, I believe I have fixed the
      issue here. There were multiple issues within the availability code (fixed
      in <a class="gmail-m_-8704481539804250623moz-txt-link-freetext" href="https://sourceforge.net/p/xymon/code/8081/" target="_blank">https://sourceforge.net/p/xymon/code/8081/</a>), plus a typo in
      a Terabithia patch. Please try out 4.3.30-0.6 in the /testing/
      repo if possible. <br>
    </div>
    <div class="gmail-m_-8704481539804250623moz-cite-prefix"><br>
    </div>
    <div class="gmail-m_-8704481539804250623moz-cite-prefix">You can also manually perform a run of
      this by executing: `xymoncmd xymonreports.sh daily` as the xymon
      user. That should give a reproducible crash in 4.3.30-0.5 and be
      clean in 4.3.30-0.6.</div>
    <div class="gmail-m_-8704481539804250623moz-cite-prefix"><br>
    </div>
    <div class="gmail-m_-8704481539804250623moz-cite-prefix">I've also built the EL7 packages on a
      CentOS 7 box, which should provide proper compatibility while
      we're in a mixed 7.6/7.7 state in the ecosystem.</div>
    <div class="gmail-m_-8704481539804250623moz-cite-prefix"><br>
    </div>
    <div class="gmail-m_-8704481539804250623moz-cite-prefix">Regards,</div>
    <div class="gmail-m_-8704481539804250623moz-cite-prefix">-jc<br>
    </div>
    <div class="gmail-m_-8704481539804250623moz-cite-prefix"><br>
    </div>
    <br>
    <p>On 8/23/2019 9:00 AM, Matt Vander Werf wrote:</p>
    <blockquote type="cite">
      
      <div dir="ltr">
        <div dir="ltr">
          <div>Hi JC,<br>
          </div>
          <div><br>
          </div>
          <div>Unfortunately, this didn't seem to fix the crashes. Today
            I got the e-mail at 1:05 AM, though the core file has a
            timestamp of 1:04 AM. This time frame still matches up with
            it being the dailyreport task that is triggering the crashes
            (since there are no crashes any other time of the day).</div>
          <div><br>
          </div>
          <div>[root@<xymon server> ~]# xymoncmd --version<br>
            Xymon version 4.3.30-0.5.el7.terabithia<br>
            [root@<xymon server> ~]# cat /etc/redhat-release <br>
            Red Hat Enterprise Linux Server release 7.7 (Maipo)<br>
          </div>
          <div><br>
          </div>
          <div>The latest core backtrace looks to be the same as
            previously (same client and service too), but I'm including
            it here [1] just for completeness.</div>
          <div><br>
          </div>
          <div>Let me know if there's anything else I can provide to
            debug this.</div>
          <div><br>
          </div>
          <div>Thanks!</div>
          <div><br>
          </div>
          <div><br>
          </div>
          <div>[1]</div>
          <div>[root@<xymon server> ~]# gdb -q
            /usr/libexec/xymon/xymongen core.1312<br>
            Reading symbols from /usr/libexec/xymon/xymongen...Reading
            symbols from
            /usr/lib/debug/usr/libexec/xymon/xymongen.debug...done.<br>
            done.<br>
            [New LWP 1312]<br>
            [Thread debugging using libthread_db enabled]<br>
            Using host libthread_db library "/lib64/libthread_db.so.1".<br>
            Core was generated by `/usr/libexec/xymon/xymongen
            --reportopts=1566446400:1566532799:0:nongr --recent'.<br>
            Program terminated with signal 6, Aborted.<br>
            #0  0x00007fc746f7e377 in __GI_raise (sig=sig@entry=6) at
            ../nptl/sysdeps/unix/sysv/linux/raise.c:55<br>
            55  return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);<br>
            (gdb) bt<br>
            #0  0x00007fc746f7e377 in __GI_raise (sig=sig@entry=6) at
            ../nptl/sysdeps/unix/sysv/linux/raise.c:55<br>
            #1  0x00007fc746f7fa68 in __GI_abort () at abort.c:90<br>
            #2  0x00005559fc1ce4f5 in sigsegv_handler
            (signum=<optimized out>) at sig.c:57<br>
            #3  <signal handler called><br>
            #4  strchrnul () at ../sysdeps/x86_64/strchrnul.S:33<br>
            #5  0x00007fc746f90681 in __find_specmb (format=0xfce
            <Address 0xfce out of bounds>) at printf-parse.h:109<br>
            #6  _IO_vfprintf_internal (s=s@entry=0x7ffeb3e6d340, <br>
                format=format@entry=0xfce <Address 0xfce out of
            bounds>, ap=ap@entry=0x7ffeb3e6d478) at vfprintf.c:1308<br>
            #7  0x00007fc74705dc78 in ___vsprintf_chk (s=0x7ffeb3e6d6c2
            "", flags=1, slen=18446744073709551615, <br>
                format=0xfce <Address 0xfce out of bounds>,
            args=args@entry=0x7ffeb3e6d478) at vsprintf_chk.c:83<br>
            #8  0x00007fc74705dbcd in ___sprintf_chk (s=<optimized
            out>, flags=flags@entry=1, <br>
                slen=slen@entry=18446744073709551615,
            format=<optimized out>) at sprintf_chk.c:32<br>
            #9  0x00005559fc1bf96a in sprintf (__fmt=<optimized
            out>, __s=<optimized out>)<br>
                at /usr/include/bits/stdio2.h:33<br>
            #10 parse_histlogfile (starttime=1566446400, <br>
                timespec=0x5559fc431f50 <timespec.7157>
            "Wed_Sep_2_19:34:55_2015", servicename=0x5559fd61b2d0
            "procs", <br>
                hostname=0x5559fdc94520 "<client hostname>") at
            availability.c:174<br>
            #11 parse_historyfile (fd=fd@entry=0x5559fdc9be00,
            repinfo=<optimized out>, <br>
                hostname=0x5559fdc94520 "<client hostname>",
            servicename=0x5559fd61b2d0 "procs", <br>
                fromtime=<optimized out>, totime=1566532799,
            for_history=for_history@entry=0, warnlevel=97, <br>
                greenlevel=99.995000000000005, warnstops=-1,
            reporttime=0x0) at availability.c:475<br>
            #12 0x00005559fc1b496c in init_state (filename=<optimized
            out>, <br>
                filename@entry=0x7ffeb3e7f950 "<client
            hostname>.procs", log=log@entry=0x7ffeb3e7f860)<br>
                at loaddata.c:275<br>
            #13 0x00005559fc1b568e in load_state
            (sumhead=sumhead@entry=0x5559fc3fad48 <dispsums>) at
            loaddata.c:626<br>
            #14 0x00005559fc1af794 in main (argc=5, argv=0x7ffeb3e84b58)
            at xymongen.c:599</div>
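          <div>The first two fields of the --reportopts value in the core above match the starttime and totime arguments in the backtrace, so they can be read as the start and end of the report window in epoch seconds. A minimal sketch for decoding them follows; the helper name parse_report_window is hypothetical, and the output is shown in UTC because the server's time zone is not stated in this thread.</div>

```python
from datetime import datetime, timezone


def parse_report_window(reportopts):
    """Return (start, end) as UTC datetimes from a 'START:END:...' string.

    Only the first two colon-separated fields are interpreted; the
    remaining fields are ignored by this sketch.
    """
    start_s, end_s = reportopts.split(":")[:2]
    start = datetime.fromtimestamp(int(start_s), tz=timezone.utc)
    end = datetime.fromtimestamp(int(end_s), tz=timezone.utc)
    return start, end


# Value taken from the "Core was generated by" line of core.1312.
start, end = parse_report_window("1566446400:1566532799:0:nongr")
print(start.isoformat(), end.isoformat())
# Length of the window in seconds, counting both endpoints.
print(int((end - start).total_seconds()) + 1)
```

          <div>For this core the window runs from 2019-08-22 04:00:00 to 2019-08-23 03:59:59 UTC, exactly 86400 seconds. That is one full day ending shortly before the 1:04 AM crash, which fits a daily report covering the previous day if the server runs at UTC-4 (an assumption).</div>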
          <div><br>
          </div>
          <div><br>
          </div>
          <div>
            <div>
              <div dir="ltr" class="gmail-m_-8704481539804250623gmail_signature">
                <div>--</div>
                <div>Matt Vander Werf</div>
              </div>
            </div>
            <br>
          </div>
        </div>
        <br>
        <div class="gmail_quote">
          <div dir="ltr" class="gmail_attr">On Thu, Aug 22, 2019 at 5:29
            PM Matt Vander Werf <<a href="mailto:matt1299@gmail.com" target="_blank">matt1299@gmail.com</a>> wrote:<br>
          </div>
          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
            <div dir="ltr">
              <div>Hi JC,</div>
              <div><br>
              </div>
              <div>Ah ha! That is one place I did not look and the
                timing certainly matches up!</div>
              <div><br>
              </div>
              <div>I have installed that new version on my Xymon server
                (running actual RHEL 7) and we'll see how it fares
                tomorrow morning....</div>
              <div><br>
              </div>
              <div>Thanks!<br>
              </div>
              <div><br>
              </div>
              <div>
                <div>
                  <div dir="ltr" class="gmail-m_-8704481539804250623gmail-m_6220780008439530303m_4473488323533301526gmail_signature">
                    <div>--</div>
                    <div>Matt Vander Werf</div>
                  </div>
                </div>
                <br>
              </div>
            </div>
            <br>
            <div class="gmail_quote">
              <div dir="ltr" class="gmail_attr">On Thu, Aug 22, 2019 at
                5:12 PM Japheth Cleaver <<a href="mailto:cleaver@terabithia.org" target="_blank">cleaver@terabithia.org</a>>
                wrote:<br>
              </div>
              <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                <div bgcolor="#FFFFFF">
                  <div class="gmail-m_-8704481539804250623gmail-m_6220780008439530303gmail-m_4473488323533301526gmail-m_745574700413372489moz-cite-prefix">Hi,</div>
                  <div class="gmail-m_-8704481539804250623gmail-m_6220780008439530303gmail-m_4473488323533301526gmail-m_745574700413372489moz-cite-prefix"><br>
                  </div>
                  <div class="gmail-m_-8704481539804250623gmail-m_6220780008439530303gmail-m_4473488323533301526gmail-m_745574700413372489moz-cite-prefix">I
                    think this might be xymongen in report mode from the
                    "dailyreport" file in /tasks.d/; the timing would
                    check out.  I believe the problem here is one of the
                    Terabithia patches now doing the wrong thing after
                    some of the string-handling changes in 4.3.29 --
                    causing core dumps in certain situations.</div>
                  <div class="gmail-m_-8704481539804250623gmail-m_6220780008439530303gmail-m_4473488323533301526gmail-m_745574700413372489moz-cite-prefix"><br>
                  </div>
                  <div class="gmail-m_-8704481539804250623gmail-m_6220780008439530303gmail-m_4473488323533301526gmail-m_745574700413372489moz-cite-prefix">If
                    you're running actual RHEL7 on this (not CentOS,
                    which hasn't released 7.7 yet), would you mind
                    trying the xymon-4.3.30-0.5 package in the EL7
                    Terabithia testing repo to see if it helps?</div>
                  <div class="gmail-m_-8704481539804250623gmail-m_6220780008439530303gmail-m_4473488323533301526gmail-m_745574700413372489moz-cite-prefix"><a href="https://repo.terabithia.org/rpms/xymon/testing/el7/x86_64/" target="_blank">https://repo.terabithia.org/rpms/xymon/testing/el7/x86_64/</a></div>
                  <div class="gmail-m_-8704481539804250623gmail-m_6220780008439530303gmail-m_4473488323533301526gmail-m_745574700413372489moz-cite-prefix"><br>
                  </div>
                  <div class="gmail-m_-8704481539804250623gmail-m_6220780008439530303gmail-m_4473488323533301526gmail-m_745574700413372489moz-cite-prefix">Regards,</div>
                  <div class="gmail-m_-8704481539804250623gmail-m_6220780008439530303gmail-m_4473488323533301526gmail-m_745574700413372489moz-cite-prefix">-jc<br>
                  </div>
                  <br>
                  <div class="gmail-m_-8704481539804250623gmail-m_6220780008439530303gmail-m_4473488323533301526gmail-m_745574700413372489moz-cite-prefix">On
                    8/22/2019 11:34 AM, Matt Vander Werf wrote:<br>
                  </div>
                  <blockquote type="cite">
                    <div dir="ltr">
                      <div>Hi Torsten,</div>
                      <div><br>
                      </div>
                      <div>No, there wasn't anything running from cron
                        or anything else around that time, let alone
                        anything that restarts the network or Xymon.</div>
                      <div><br>
                      </div>
                      <div>Thanks.</div>
                      <div><br>
                      </div>
                      <div>
                        <div>
                          <div dir="ltr" class="gmail-m_-8704481539804250623gmail-m_6220780008439530303gmail-m_4473488323533301526gmail-m_745574700413372489m_-4652670879172286430gmail_signature">
                            <div>--</div>
                            <div>Matt Vander Werf</div>
                          </div>
                        </div>
                        <br>
                      </div>
                    </div>
                    <br>
                    <div class="gmail_quote">
                      <div dir="ltr" class="gmail_attr">On Wed, Aug 21,
                        2019 at 5:43 AM Torsten Richter <<a href="mailto:bb4@richter-it.net" target="_blank">bb4@richter-it.net</a>>
                        wrote:<br>
                      </div>
                      <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                        <div>
                          <p>Hi Matt,<br>
                          </p>
                          <p>dumb question: is there any cron job
                            running at this time that is restarting
                            Xymon or fiddling with the network, like
                            restarting the network for some reason?<br>
                          </p>
                          <p>Regards,<br>
                            Torsten<br>
                          </p>
                          <blockquote type="cite">Matt Vander Werf <<a href="mailto:matt1299@gmail.com" target="_blank">matt1299@gmail.com</a>>
                            wrote on August 20, 2019 at 5:10 PM:
                            <br>
                            <br>
                            <div dir="ltr">
                              <div>Hi all,</div>
                              <div><br>
                              </div>
                              <div>Every day since we updated our Xymon
                                server to 4.3.29 (from 4.3.28), I've
                                gotten an e-mail alert due to xymond
                                turning red that reads:</div>
                              <div><br>
                              </div>
                              <div style="margin-left:40px">red xymongen
                                program crashed <br>
                                <br>
                                Fatal signal caught!</div>
                              <div><br>
                              </div>
                              <div>The strange thing is that this has
                                happened at 1:04 AM every day...like
                                clockwork. I have xymongen set to run
                                every 1 minute and it has no problems
                                running any other time of the day. We
                                are using the Terabithia RPMs and the
                                Xymon server is running RHEL 7. <br>
                              </div>
                              <div><br>
                              </div>
                              <div>I've scoured the system to find
                                anything that is set to run at/around
                                that time via cron, etc. and haven't
                                found anything. The system logs don't
                                show anything is happening around that
                                time either.</div>
                              <div><br>
                              </div>
                              <div>I turned on debug logging for xymond
                                and xymongen and haven't been able to
                                find anything unusual in either log
                                around that time. But it is dumping core
                                files for xymongen every time it
                                crashes.</div>
                              <div><br>
                              </div>
                              <div>I used gdb to get the backtrace on
                                all of the core files (so far) and I've
                                found that they all show the same thing.
                                It shows the same host in the backtrace
                                too (although I'm fairly confident it
                                isn't specific or isolated to that host
                                but just the first one it runs into that
                                it has issues with when processing).</div>
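                              <div>Pulling a backtrace from every core file by hand gets repetitive, so a sketch like the following could collect them in one pass. It relies on gdb's standard -q, -batch and -ex options; the function names, the core.* glob pattern and the directory holding the core files are assumptions, since the thread does not say where the cores are written.</div>

```python
import subprocess
from pathlib import Path


def gdb_backtrace_command(binary, core):
    """Build a non-interactive gdb invocation that prints one backtrace."""
    return ["gdb", "-q", "-batch", "-ex", "bt", str(binary), str(core)]


def collect_backtraces(binary, core_dir, pattern="core.*"):
    """Run gdb on each matching core file; return {core file name: gdb output}.

    The glob pattern and directory layout are assumptions for this sketch.
    """
    results = {}
    for core in sorted(Path(core_dir).glob(pattern)):
        proc = subprocess.run(
            gdb_backtrace_command(binary, core),
            capture_output=True,
            text=True,
            check=False,  # keep going even if gdb fails on one core
        )
        results[core.name] = proc.stdout
    return results
```

                              <div>Calling collect_backtraces with /usr/libexec/xymon/xymongen and the core directory returns the gdb output keyed by core file name. Raw addresses will differ between runs, so it is the function names and source lines in each frame that are worth comparing.</div>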
                              <div><br>
                              </div>
                              <div>I've included an example gdb output
                                below (the most recent one) [1]. <br>
                              </div>
                              <div><br>
                              </div>
                              <div>Is anyone else running into this by
                                chance? Or any idea what might be the
                                cause? <br>
                              </div>
                              <div><br>
                              </div>
                              <div>Thanks!</div>
                              <div><br>
                              </div>
                              <div><br>
                              </div>
                              <div>[1]</div>
                              <div># gdb -q /usr/libexec/xymon/xymongen
                                core.16327 <br>
                                Reading symbols from
                                /usr/libexec/xymon/xymongen...Reading
                                symbols from
                                /usr/lib/debug/usr/libexec/xymon/xymongen.debug...done.
                                <br>
                                done. <br>
                                [New LWP 16327] <br>
                                [Thread debugging using libthread_db
                                enabled] <br>
                                Using host libthread_db library
                                "/lib64/libthread_db.so.1". <br>
                                Core was generated by
                                `/usr/libexec/xymon/xymongen
                                --reportopts=1566187200:1566273599:0:nongr
                                --recent'. <br>
                                Program terminated with signal 6,
                                Aborted. <br>
                                #0  0x00007f4657c49377 in __GI_raise
                                (sig=sig@entry=6) at
                                ../nptl/sysdeps/unix/sysv/linux/raise.c:55
                                <br>
                                55  return INLINE_SYSCALL (tgkill, 3,
                                pid, selftid, sig); <br>
                                (gdb) bt <br>
                                #0  0x00007f4657c49377 in __GI_raise
                                (sig=sig@entry=6) at
                                ../nptl/sysdeps/unix/sysv/linux/raise.c:55
                                <br>
                                #1  0x00007f4657c4aa68 in __GI_abort ()
                                at abort.c:90 <br>
                                #2  0x00005589375dd455 in
                                sigsegv_handler (signum=<optimized
                                out>) at sig.c:57 <br>
                                #3  <signal handler called> <br>
                                #4  strchrnul () at
                                ../sysdeps/x86_64/strchrnul.S:33 <br>
                                #5  0x00007f4657c5b681 in __find_specmb
                                (format=0xfce <Address 0xfce out of
                                bounds>) at printf-parse.h:109 <br>
                                #6  _IO_vfprintf_internal
                                (s=s@entry=0x7ffd5dabcc00, <br>
                                    format=format@entry=0xfce
                                <Address 0xfce out of bounds>,
                                ap=ap@entry=0x7ffd5dabcd38) at
                                vfprintf.c:1308 <br>
                                #7  0x00007f4657d28c78 in
                                ___vsprintf_chk (s=0x7ffd5dabcf82 "",
                                flags=1, slen=18446744073709551615, <br>
                                    format=0xfce <Address 0xfce out
                                of bounds>,
                                args=args@entry=0x7ffd5dabcd38) at
                                vsprintf_chk.c:83 <br>
                                #8  0x00007f4657d28bcd in ___sprintf_chk
                                (s=<optimized out>,
                                flags=flags@entry=1, <br>
                                   
                                slen=slen@entry=18446744073709551615,
                                format=<optimized out>) at
                                sprintf_chk.c:32 <br>
                                #9  0x00005589375ce8ca in sprintf
                                (__fmt=<optimized out>,
                                __s=<optimized out>) <br>
                                    at /usr/include/bits/stdio2.h:33 <br>
                                #10 parse_histlogfile
                                (starttime=1566187200, <br>
                                    timespec=0x558937840f50
                                <timespec.7157>
                                "Wed_Sep_2_19:34:55_2015",
                                servicename=0x5589383b6d70 "procs", <br>
                                    hostname=0x558938a335d0 "<client
                                hostname>") at availability.c:174 <br>
                                #11 parse_historyfile
                                (fd=fd@entry=0x558938a3aea0,
                                repinfo=<optimized out>, <br>
                                    hostname=0x558938a335d0 "<client
                                hostname>",
                                servicename=0x5589383b6d70 "procs", <br>
                                    fromtime=<optimized out>,
                                totime=1566273599,
                                for_history=for_history@entry=0,
                                warnlevel=97, <br>
                                    greenlevel=99.995000000000005,
                                warnstops=-1, reporttime=0x0) at
                                availability.c:475 <br>
                                #12 0x00005589375c38cc in init_state
                                (filename=<optimized out>, <br>
                                    filename@entry=0x7ffd5dacf210
                                "<client hostname>.procs",
                                log=log@entry=0x7ffd5dacf120) <br>
                                    at loaddata.c:275 <br>
                                #13 0x00005589375c45ee in load_state
                                (sumhead=sumhead@entry=0x558937809d48
                                <dispsums>) at loaddata.c:626 <br>
                                #14 0x00005589375be6f4 in main (argc=5,
                                argv=0x7ffd5dad4418) at xymongen.c:599</div>
                              <div><br>
                              </div>
                              <div><br>
                              </div>
                              <div>
                                <div dir="ltr" class="gmail-m_-8704481539804250623gmail-m_6220780008439530303gmail-m_4473488323533301526gmail-m_745574700413372489gmail-m_-4652670879172286430gmail-m_4039567731768702631ox-3601463be8-gmail_signature">
                                  <div>-- <br>
                                  </div>
                                  <div>Matt Vander Werf</div>
                                </div>
                              </div>
                            </div>
_______________________________________________ <br>
                            Xymon mailing list <br>
                            <a href="mailto:Xymon@xymon.com" target="_blank">Xymon@xymon.com</a>
                            <br>
                            <a href="http://lists.xymon.com/mailman/listinfo/xymon" target="_blank">http://lists.xymon.com/mailman/listinfo/xymon</a>
                            <br>
                          </blockquote>
                        </div>
                      </blockquote>
                    </div>
                    <br>
                  </blockquote>
                  <p><br>
                  </p>
                </div>
              </blockquote>
            </div>
          </blockquote>
        </div>
      </div>
    </blockquote>
    <p><br>
    </p>
  </div>

</blockquote></div></div>