<div dir="ltr"><div><div><div><div>Hi JC, Matt <br><br>Good news:<br><br>Last Friday I first upgraded to 4.3.28, but the spiky behavior showed up again immediately, so I think this is not specific to the Xymon version. <br><br>Then I did as JC suggested, toggling debug output and the cache. Since my Xymon server uses an SSD and is not running production, the impact is minimal.<br><br></div>This fixes both issues!<br><br></div>1) The devmon/xymon problem and the gaps in the graphs disappeared as <u>soon as I disabled the caching</u> (--no-cache). As you say, this is not something I want for long, but now we can look specifically at (A) why and (B) where caching is a problem. I think that is good news!<br><br><br></div></div>2) I expect the memory leak to be solved, as the release notes said, but that will only show over time (weeks).<br><div><div><div><div><div><br><br>3) Enabling debugging showed me another problem in the self-modified 
netapp.pl-script. I reverted my change and now there are no more 
spurious<div class="gmail_quote" style="font-family:arial,sans-serif"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="gmail_extra" style="font-family:arial,sans-serif"><div class="gmail_quote" style="font-family:arial,sans-serif"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div style="font-family:arial,sans-serif"><div class="gmail-m_6475655309710647818m_296021847781068685h5" style="font-family:arial,sans-serif">xstatvolume,____-rrd-files filling up my disk space. </div></div></blockquote></div></div></blockquote><div><br></div><div>This is an error I introduced myself and mailed to the list about in November 2016. Sorry for this. <br></div><br></div><div class="gmail_quote" style="font-family:arial,sans-serif">I am very happy now and hope we can tackle the cache problem so I can enable the launching of the rrd-daemons.<br></div><div class="gmail_quote" style="font-family:arial,sans-serif"><br>Peter<br></div></div></div></div></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">2017-03-21 16:36 GMT+01:00 Japheth Cleaver <span dir="ltr"><<a href="mailto:cleaver@terabithia.org" target="_blank">cleaver@terabithia.org</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
  
    
  
  <div bgcolor="#FFFFFF" text="#000000"><div><div class="h5">
    <br>
    <blockquote type="cite">
      <div class="gmail_extra">
        <div class="gmail_quote">
          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div dir="ltr">
              <div class="gmail_extra">
                <div class="gmail_quote">
                  <div>
                    <div class="m_991631201724033675h5">On Fri, Mar 17, 2017 at 8:56 AM,
                      Peter Welter <span dir="ltr"><<a href="mailto:peter.welter@gmail.com" target="_blank">peter.welter@gmail.com</a>></span>
                      wrote:<br>
                    </div>
                  </div>
                  <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                    <div>
                      <div class="m_991631201724033675h5">
                        <div dir="ltr">Hi JC,<br>
                          <br>
                          I'm still experiencing some difficulties with
                          the Xymon software (version 4.3.27-1.el6.terabithia)
                          that is deployed from <a href="http://terabithia.org/rpms/xymon/el6/i686/" target="_blank">http://terabithia.org/rpms/xymon/el6/i686/</a>.<br>
                          <br>
                          There are two different types of problems:<br>
                          <br>
                          1) Has to do with the integration of
                          Xymon/Devmon.<br>
                          <br>
                             Although Devmon gets valid SNMP data for
                          each poll, the values in the
                          if_load.Ethernet3_1.rrd-file (for example)
                          show gaps. The next value is so much larger
                          than the rest that the whole graph goes
                          berserk because of the spikes that are being
                          shown.<br>
                             <br>
                             ...[snip]<br>
                                      <!-- 2017-03-15 15:10:00 CET /
                          1489587000 -->
                          <row><v>5.7197560484e+01</v><v>5.7540255376e+01</v></row><br>
                                      <!-- 2017-03-15 15:15:00 CET /
                          1489587300 -->
                          <row><v>5.8052253788e+01</v><v>5.7062462121e+01</v></row><br>
                                      <!-- 2017-03-15 15:20:00 CET /
                          1489587600 -->
                          <row><v>5.8039204545e+01</v><v>5.7738579545e+01</v></row><br>
                                      <!-- 2017-03-15 15:25:00 CET /
                          1489587900 -->
                          <row><v>5.8352395833e+01</v><v>5.7912187500e+01</v></row><br>
                                      <!-- 2017-03-15 15:30:00 CET /
                          1489588200 -->
                          <row><v>5.7961458333e+01</v><v>5.8807500000e+01</v></row><br>
                                      <!-- 2017-03-15 15:35:00 CET /
                          1489588500 -->
                          <row><v>5.7040675403e+01</v><v>5.7108262769e+01</v></row><br>
                                      <!-- 2017-03-15 15:40:00 CET /
                          1489588800 -->
                          <row><v>5.7984999119e+01</v><v>5.8214662436e+01</v></row><br>
                                      <!-- 2017-03-15 15:45:00 CET /
                          1489589100 -->
                          <row><v>1.6832224569e+16</v><v>1.6832224569e+16</v></row><br>
                                      <!-- 2017-03-15 15:50:00 CET /
                          1489589400 -->
                          <row><v>4.4656922344e+16</v><v>4.4656922343e+16</v></row><br>
                                      <!-- 2017-03-15 15:55:00 CET /
                          1489589700 -->
                          <row><v>5.7648150173e+01</v><v>5.7687031165e+01</v></row><br>
                                      <!-- 2017-03-15 16:00:00 CET /
                          1489590000 -->
                          <row><v>5.9068884188e+01</v><v>5.9453689406e+01</v></row><br>
                                      <!-- 2017-03-15 16:05:00 CET /
                          1489590300 -->
                          <row><v>NaN</v><v>NaN</v></row><br>
                                      <!-- 2017-03-15 16:10:00 CET /
                          1489590600 -->
                          <row><v>NaN</v><v>NaN</v></row><br>
                                      <!-- 2017-03-15 16:15:00 CET /
                          1489590900 -->
                          <row><v>NaN</v><v>NaN</v></row><br>
                                      <!-- 2017-03-15 16:20:00 CET /
                          1489591200 -->
                          <row><v>NaN</v><v>NaN</v></row><br>
                                      <!-- 2017-03-15 16:25:00 CET /
                          1489591500 -->
                          <row><v>NaN</v><v>NaN</v></row><br>
                                      <!-- 2017-03-15 16:30:00 CET /
                          1489591800 -->
                          <row><v>1.9398478192e+07</v><v>1.8707899982e+07</v></row><br>
                                      <!-- 2017-03-15 16:35:00 CET /
                          1489592100 -->
                          <row><v>5.6938284153e+01</v><v>5.6770437158e+01</v></row><br>
                                      <!-- 2017-03-15 16:40:00 CET /
                          1489592400 -->
                          <row><v>NaN</v><v>NaN</v></row><br>
                                      <!-- 2017-03-15 16:45:00 CET /
                          1489592700 -->
                          <row><v>NaN</v><v>NaN</v></row><br>
                                      <!-- 2017-03-15 16:50:00 CET /
                          1489593000 -->
                          <row><v>NaN</v><v>NaN</v></row><br>
                                      <!-- 2017-03-15 16:55:00 CET /
                          1489593300 -->
                          <row><v>NaN</v><v>NaN</v></row><br>
                                      <!-- 2017-03-15 17:00:00 CET /
                          1489593600 -->
                          <row><v>NaN</v><v>NaN</v></row><br>
                                      <!-- 2017-03-15 17:05:00 CET /
                          1489593900 -->
                          <row><v>NaN</v><v>NaN</v></row><br>
                                      <!-- 2017-03-15 17:10:00 CET /
                          1489594200 -->
                          <row><v>NaN</v><v>NaN</v></row><br>
                                      <!-- 2017-03-15 17:15:00 CET /
                          1489594500 -->
                          <row><v>NaN</v><v>NaN</v></row><br>
                                      <!-- 2017-03-15 17:20:00 CET /
                          1489594800 -->
                          <row><v>NaN</v><v>NaN</v></row><br>
                                      <!-- 2017-03-15 17:25:00 CET /
                          1489595100 -->
                          <row><v>3.5775056887e+07</v><v>3.4501518955e+07</v></row><br>
                                      <!-- 2017-03-15 17:30:00 CET /
                          1489595400 -->
                          <row><v>5.7219344262e+01</v><v>5.7417704918e+01</v></row><br>
                                      <!-- 2017-03-15 17:35:00 CET /
                          1489595700 -->
                          <row><v>5.7166338798e+01</v><v>5.9383825137e+01</v></row><br>
                                      <!-- 2017-03-15 17:40:00 CET /
                          1489596000 -->
                          <row><v>5.6769617486e+01</v><v>5.6981202186e+01</v></row><br>
                                      <!-- 2017-03-15 17:45:00 CET /
                          1489596300 -->
                          <row><v>5.7549617486e+01</v><v>5.7382732240e+01</v></row><br>
                              ...[snip]<br>
                              This behaviour does NOT occur on my
                          current Xymon server (version 4.2.3) running
                          on SLES11 SP4.<br>
                              <br>
                              At first I thought this had to do with
                          VMware, but that is not the case. VM or bare
                          metal, the behaviour is the same.<br>
                              <br>
                              I also made sure that the devmon
                          module is not causing the problems. The same
                          devmon software works fine on SLES and RHEL.
                          The snmpwalk command does return valid SNMP data
                          when writing to a file. It just seems that
                          Xymon does not update the rrd-file
                          correctly!?<br>
                              <br>
                              Any suggestions how to proceed?</div>
                      </div>
                    </div>
                  </blockquote>
                </div>
              </div>
            </div>
          </blockquote>
        </div>
      </div>
    </blockquote>
    <br>
    <br></div></div>
    Assuming that the numeric values are correct for the time periods
    that are coming in, my first thought would be that there's something
    unusual going on with RRD caching. Are you seeing this issue with
    other trends graphs, either for other tests on this host, other
    hosts using this test/data, or any other graphs, period?<br>
    <br>
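    To check whether other graphs show the same pattern, a minimal
    Python sketch along these lines could scan the text produced by
    rrdtool dump for NaN gaps and outsized values. The function name
    scan_dump and the spike_factor threshold are assumptions chosen for
    illustration; they are not part of Xymon or RRDtool.<br>
    <br>

```python
import math
import re

# Matches one dumped row together with the epoch in its preceding comment,
# e.g. "<!-- 2017-03-15 15:45:00 CET / 1489589100 --> <row>...</row>".
ROW_RE = re.compile(r"<!--[^<>]*?/\s*(\d+)\s*-->\s*<row>(.*?)</row>", re.S)
VAL_RE = re.compile(r"<v>\s*([^<\s]+)\s*</v>")


def scan_dump(xml_text, spike_factor=1000.0):
    """Return a list of (epoch, kind) with kind 'gap' or 'spike'.

    spike_factor is an illustrative threshold: a row is reported as a
    spike when any value exceeds spike_factor times the median of all
    finite values found in the text.
    """
    rows = []
    for epoch, body in ROW_RE.findall(xml_text):
        values = [float(v) for v in VAL_RE.findall(body)]  # float("NaN") is nan
        rows.append((int(epoch), values))

    finite = sorted(v for _, vals in rows for v in vals if not math.isnan(v))
    median = finite[len(finite) // 2] if finite else 0.0
    limit = spike_factor * max(abs(median), 1e-9)

    findings = []
    for epoch, values in rows:
        if any(math.isnan(v) for v in values):
            findings.append((epoch, "gap"))
        elif any(abs(v) > limit for v in values):
            findings.append((epoch, "spike"))
    return findings
```

    The input would be the saved output of rrdtool dump for one
    rrd-file. A full dump contains several archives, so feeding in only
    the rows of one archive keeps the median meaningful.<br>
    <br>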
    If it's unique to this, then that speaks to a problem with this
    specific data transmission. If not, there could be a larger issue
    with xymond_rrd (I/O performance, for example). I'd start with
    enabling debug output and examining the logs for when it's receiving
    data for this test. (Not sure if this is being sent via 'data' or
    'status' messages, but you'll want to make sure you're enabling
    debug for the right copy of xymond_rrd.)<br>
    <br>
    If nothing there, then you might try disabling the cache, which will
    force xymond_rrd to write things out as received (but will also
    increase I/O load a lot).<br>
    <br>
    If neither of those fixes it, there could actually be an issue with
    the data coming in. At about that point I would set up a channel
    listener looking specifically for the host.svc messages related to
    this source so I could physically see the contents of each one
    coming in and look for any anomalies.<br>
    <br>
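    One way such a listener might look is sketched below in Python. It
    is a rough sketch under stated assumptions, not a tested Xymon
    worker: it assumes that each message on a channel arrives on stdin
    as a header line starting with "@@" whose fields are separated by
    "|", followed by the message text and a closing line containing only
    "@@", and that the hostname and test name are the fifth and sixth
    header fields. The names iter_messages and matching, and the
    host_idx and test_idx defaults, are hypothetical and should be
    checked against the xymond documentation.<br>
    <br>

```python
import sys


def iter_messages(lines):
    """Yield (header_fields, body_lines) for each channel message.

    Assumption: a message starts with a header line beginning with "@@"
    and ends with a line that is exactly "@@".
    """
    header, body = None, []
    for raw in lines:
        line = raw.rstrip("\n")
        if line == "@@":
            # End-of-message marker: emit whatever has been collected.
            if header is not None:
                yield header, body
            header, body = None, []
        elif line.startswith("@@"):
            # A new header; any unfinished message is dropped.
            header, body = line.split("|"), []
        elif header is not None:
            body.append(line)


def matching(lines, host, test, host_idx=4, test_idx=5):
    """Yield only the messages for one host.svc pair.

    host_idx and test_idx are assumed header positions, not confirmed ones.
    """
    for header, body in iter_messages(lines):
        if len(header) > max(host_idx, test_idx):
            if header[host_idx] == host and header[test_idx] == test:
                yield header, body


if __name__ == "__main__" and len(sys.argv) == 3:
    # Print every matching message in full so anomalies are visible.
    for header, body in matching(sys.stdin, sys.argv[1], sys.argv[2]):
        print("|".join(header))
        print("\n".join(body))
        print("-" * 40)
```

    A script like this would be started as the worker command of
    xymond_channel with --channel=status or --channel=data, depending
    on how the messages are sent, and given the host and test name as
    arguments.<br>
    <br>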
    HTH,<br>
    -jc<br>
  </div>

</blockquote></div><br></div>