[Xymon] rrd logs and graphs

Jeremy Laidman jlaidman at rebel-it.com.au
Fri Mar 20 09:43:59 CET 2015


Vernon

The power status page must refer to a different graph name in graphs.cfg
with a different FNPATTERN.

Click on the graph images for each version to get the 4-graph view and
compare the URLs.
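
To see why FNPATTERN matters here, the following is a minimal sketch of how
the pattern from the [power] definition quoted below selects RRD files. It
assumes the pattern is applied unanchored to each file name in the host's
RRD directory, and it uses Python's re module as a stand-in for the regex
engine Xymon uses. The file names are a subset of the listing quoted below.
The narrower pattern is hypothetical, shown only to illustrate the
difference, and is not something taken from any existing graphs.cfg.

```python
import re

# Subset of the file names from the listing quoted in this thread.
rrd_files = [
    "power,tcpListenDrop.rrd",
    "power,tcpOutAck.rrd",
    "power,Tray0_PSU0.rrd",
    "power,Tray0_PSU1.rrd",
    "power,Tray99_PSU0.rrd",
    "power,Tray99_PSU1.rrd",
    "power,udpInCksumErrs.rrd",
    "power,vnet.rrd",
]

# The FNPATTERN from the quoted [power] section of graphs.cfg.
broad = re.compile(r"power,(.*).rrd")

# Hypothetical narrower pattern (an assumption, for illustration only).
narrow = re.compile(r"power,(Tray.*_PSU.*)\.rrd")


def matching_params(pattern, names):
    """Return the captured group for every file name the pattern matches."""
    hits = []
    for name in names:
        m = pattern.search(name)  # unanchored match, as assumed above
        if m:
            hits.append(m.group(1))
    return hits


broad_hits = matching_params(broad, rrd_files)
narrow_hits = matching_params(narrow, rrd_files)

print(len(broad_hits))   # every file is selected, stale ones included
print(narrow_hits)       # only the four Tray/PSU data sets
```

Under those assumptions the broad pattern picks up every power,*.rrd file,
which would explain the extra NaN lines, while a graph definition with a
tighter FNPATTERN would plot only the four PSU files.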

J

On Fri, 20 Mar 2015 19:35 Vernon Everett <everett.vernon at gmail.com> wrote:

> Hi all
>
> I was only back at the client today, and unfortunately have not managed to
> get that patch in yet.
> (As I mentioned before, it's a production system)
>
> However, I did notice something really odd.
> I have focused my attention on the trends graphs, where I get all the
> extra values, but it's not happening in the test itself, despite the
> existence of the additional rrd files.
>
> Example.
> I have something that plots the power usage of the PSUs on a NetApp
> e-series.
> There are 4 PSUs, output looks like this.
>
> Total power drawn- 487 Watts
> Number of trays- 2
> Tray power input details-
>
>    TRAY ID  POWER SUPPLY SERIAL NUMBER   INPUT POWER
>    99       0                            145 Watts
>    99       1                            151 Watts
>    0        0                            99 Watts
>    0        1                            92 Watts
>
> All good. And I have a graph with 4 lines. Min, Max, Curr and Avg values
> are all there. It looks beautiful.
> But go look at the power graph in trends, and it's ugly.
> Heaps of additional data lines with no entries. All values are NaN
> And mixed in amongst the additional empty graphs, are the 4 valid lines.
>
> I look at the rrd files, and they are all there, even the bad ones.
> Here's a few of them.
> power,tcpListenDrop.rrd
> power,tcpOutAck.rrd
> power,tcpOutDataSegs.rrd
> power,tcpOutRsts.rrd
> power,tcpOutUrg.rrd
> power,tcpOutWinProbe.rrd
> power,tcpRetransSegs.rrd
> power,tcpRtoMax.rrd
> power,tcpRttUpdate.rrd
> power,tcpTimKeepaliveProbe.rrd
> power,tcpTimRetransDrop.rrd
> power,Tray0_PSU0.rrd                  <--- Valid
> power,Tray0_PSU1.rrd                  <--- Valid
> power,Tray99_PSU0.rrd                 <--- Valid
> power,Tray99_PSU1.rrd                 <--- Valid
> power,trlogpool.rrd
> power,UDP_udpInDatagrams.rrd
> power,udpInCksumErrs.rrd
> power,udpOutDatagrams.rrd
> power,vnet.rrd
>
> So I thought I would check my configs.
> In xymonserver
> From TEST2RRD= ,power=ncv,
> From GRAPHS=  ,power::9,
> And further down
> SPLITNCV_power="*:GAUGE"
>
> And in graphs.cfg
> [power]
>     FNPATTERN power,(.*).rrd
>     TITLE Database Power Consumption Per Tray PSU
>     YAXIS Watts
>     -l 0
>     DEF:p@RRDIDX@=@RRDFN@:lambda:AVERAGE
>     LINE2:p@RRDIDX@#@COLOR@:@RRDPARAM@
>     GPRINT:p@RRDIDX@:LAST: \: %5.1lf (cur)
>     GPRINT:p@RRDIDX@:MAX: \: %5.1lf (max)
>     GPRINT:p@RRDIDX@:MIN: \: %5.1lf (min)
>     GPRINT:p@RRDIDX@:AVERAGE: \: %5.1lf (avg)\n
>
> With luck I will get approval to recompile with the debugging bug-fix, and
> we can get more info, but I thought it was interesting that the extra
> entries show up in trends but not in the test.
>
> Regards
> Vernon
>
> On 13 March 2015 at 15:24, J.C. Cleaver <cleaver at terabithia.org> wrote:
>
>> On Wed, March 11, 2015 5:51 pm, Jeremy Laidman wrote:
>> > On 11 March 2015 at 14:18, Vernon Everett <everett.vernon at gmail.com>
>> > wrote:
>> >
>> >> About now, I am getting a little nervous adding send and expect,
>> because
>> >> unlike telnet and telnets, we are doing ldap and ldaps testing.
>> >>
>> >
>> > That's understandable.  A read through the code suggests that at least
>> in
>> > some places, an empty string is equivalent to an undefined string, as
>> the
>> > string length (shown in Sendlen in the debug output) is zero in both
>> > cases.  So until a patch is in place, a work-around might be to define
>> > empty "send" and "expect" strings for those that have none.
>> >
>> > Any suggestions?
>> >> I think we have some debug code update recommendations for JC though.
>> >> :-)
>> >>
>> >
>> >  Here's my patch.  I'll push this into the dev list for proposed
>> inclusion
>> > in a future release.
>> >
>> > --- lib/netservices.c.orig      2012-07-25 01:48:41.000000000 +1000
>> > +++ lib/netservices.c   2015-03-12 11:18:18.000000000 +1100
>> > @@ -328,9 +328,9 @@
>> >         dbgprintf("Service list dump\n");
>> >         for (i=0; (svcinfo[i].svcname); i++) {
>> >                 dbgprintf(" Name      : %s\n", svcinfo[i].svcname);
>> > -               dbgprintf("   Sendtext: %s\n",
>> binview(svcinfo[i].sendtxt,
>> > svcinfo[i].sendlen));
>> > +               dbgprintf("   Sendtext: %s\n",
>> > svcinfo[i].sendtxt!=NULL?binview(svcinfo[i].sendtxt,
>> > svcinfo[i].sendlen):"[null]");
>> >                 dbgprintf("   Sendlen : %d\n", svcinfo[i].sendlen);
>> > -               dbgprintf("   Exp.text: %s\n",
>> binview(svcinfo[i].exptext,
>> > svcinfo[i].explen));
>> > +               dbgprintf("   Exp.text: %s\n",
>> > svcinfo[i].exptext!=NULL?binview(svcinfo[i].exptext,
>> > svcinfo[i].explen):"[null]");
>> >                 dbgprintf("   Exp.len : %d\n", svcinfo[i].explen);
>> >                 dbgprintf("   Exp.ofs : %d\n", svcinfo[i].expofs);
>> >                 dbgprintf("   Flags   : %d\n", svcinfo[i].flags);
>> >
>> > This produces "[null]" where we would have seen "(null)" on a GNU-based
>> > OS,
>> > to differentiate between the two situations.
>> >
>> > In the meantime, you could compile a special version of xymond_rrd, and
>> > run it manually on the same data channel as the real one, but have it
>> make
>> > RRD files and log file to a different location.  This shouldn't
>> interfere
>> > with your production Xymon.  Here's one I prepared earlier that works
>> for
>> > me:
>> >
>> > sudo -u xymon mkdir /tmp/my-rrd-data/
>> > sudo -u xymon xymoncmd /bin/sh -c 'XYMONTMP=/tmp;
>> > /usr/lib/xymon/server/bin/xymond_channel --channel=data
>> > --log=/tmp/my-rrd-data.log /path/to/xymond_rrd_debug_patch
>> > --rrddir=/tmp/my-rrd-data/ --debug'
>> >
>> > This seems to show some really useful stuff that's relevant to solving
>> > your
>> > problem.  Some sample debug lines:
>> >
>> > 15306 2015-03-12 11:36:28 xymond_rrd_debug_patch: Got message 165619
>> >
>> @@data#165619/servername|1426120588.401891|172.16.0.1||servername|vmstat|sunos|ABC
>> > ...
>> > 15306 2015-03-12 11:36:28 Creating rrd
>> > /tmp/my-rrd-data//servername/vmstat.rrd
>> > 15306 2015-03-12 11:36:28 RRD create param 00: 'rrdcreate'
>> > 15306 2015-03-12 11:36:28 RRD create param 01:
>> > '/tmp/my-rrd-data//servername/vmstat.rrd'
>> > 15306 2015-03-12 11:36:28 RRD create param 02: '-s'
>> > 15306 2015-03-12 11:36:28 RRD create param 03: '300'
>> > 15306 2015-03-12 11:36:28 RRD create param 04: 'DS:cpu_r:GAUGE:600:0:U'
>> > 15306 2015-03-12 11:36:28 RRD create param 05: 'DS:cpu_b:GAUGE:600:0:U'
>> > 15306 2015-03-12 11:36:28 RRD create param 06: 'DS:cpu_w:GAUGE:600:0:U'
>> > ...
>> > 15306 2015-03-12 11:39:42 Got 265 bytes
>> > 15306 2015-03-12 11:39:42 xymond_rrd_debug_patch: Got message 165737
>> >
>> @@data#165737/servername|1426120782.080244|172.16.0.2||servername|trends||DEF
>> > 15306 2015-03-12 11:39:42 startpos 216644, fillpos 216644, endpos -1
>> > 15306 2015-03-12 11:39:42 Flushing
>> > '/servername/tcp.xopiy90404.parameter.rrd' with 1 updates pending,
>> > template
>> > 'sec'
>> > 15306 2015-03-12 11:39:42 Want msg 165738, startpos 216644, fillpos
>> > 216644,
>> > endpos -1, usedbytes=0, bufleft=1884603
>> >
>> > J
>> >
>>
>>
>> This is some excellent sleuthing! :)
>>
>> As I was poring over the thread (sorry, I've been out the last few
>> days), I failed to take note of the SPARC-Enterprise-T2000 in the output.
>>
>>
>> The patch below should fix the immediate issue triggered by debug mode...
>> letting us move on to the larger oddness. Unfortunately, I have a feeling
>> there are other occasions where we're relying on GNU's printf(NULL)
>> printing that out and thus might be caught by this. As I find them, I go
>> ahead and work to put fixes in.
>>
>> In the meantime, this will be in 4.3.19 and can be patched directly from
>> below.
>>
>>
>> HTH,
>>
>> -jc
>>
>>
>> --- lib/netservices.c   (revision 7598)
>> +++ lib/netservices.c   (working copy)
>> @@ -81,9 +81,9 @@
>>         unsigned char *inp, *outp;
>>         int i;
>>
>> -       if (!buf) return NULL;
>> +       if (result) xfree(result);
>> +       if (!buf) { result = strdup("[null]"); return result; }
>>
>> -       if (result) xfree(result);
>>         if (buf && (buflen == 0)) buflen = strlen(buf);
>>         result = (char *)malloc(4*buflen + 1);  /* Worst case: All binary
>> */
>>
> _______________________________________________
>> Xymon mailing list
>> Xymon at xymon.com
>> http://lists.xymon.com/mailman/listinfo/xymon
>>
>
> --
> "Accept the challenges so that you can feel the exhilaration of victory"
> - General George Patton
>