[Xymon] rrd logs and graphs
Jeremy Laidman
jlaidman at rebel-it.com.au
Sat Mar 21 10:48:08 CET 2015
So the URLs are different? But both have service=power in the URLs?
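
If it helps, the two query strings can be diffed mechanically rather than by eye. A minimal sketch in Python; the host name, CGI path and parameters in the two URLs are placeholders made up for illustration, so paste in the real URLs from the status page and the trends page:

```python
from urllib.parse import urlsplit, parse_qs


def diff_query(url_a, url_b):
    """Return {param: (values_in_a, values_in_b)} for each query parameter that differs."""
    qa = parse_qs(urlsplit(url_a).query, keep_blank_values=True)
    qb = parse_qs(urlsplit(url_b).query, keep_blank_values=True)
    return {
        name: (qa.get(name), qb.get(name))
        for name in sorted(set(qa) | set(qb))
        if qa.get(name) != qb.get(name)
    }


# Placeholder URLs (hypothetical host, path and parameters).
status_url = ("http://xymon.example.com/xymon-cgi/showgraph.sh"
              "?host=netapp1&service=power&action=menu")
trends_url = ("http://xymon.example.com/xymon-cgi/showgraph.sh"
              "?host=netapp1&service=power&first=1&count=4&action=menu")

for name, (a, b) in diff_query(status_url, trends_url).items():
    print(f"{name}: status page={a} trends page={b}")
```

Anything it prints is a parameter that is present, or has a different value, on only one of the two pages.
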
On Sat, 21 Mar 2015 10:16 Vernon Everett <everett.vernon at gmail.com> wrote:
> Hi Jeremy
>
> That thought occurred to me, but I checked.
> There is only one [power] entry in the graphs.cfg file.
> And I put it there for this particular test.
>
> Would have made this one too easy if it was that. :-)
>
> Regards
> Vernon
>
>
> On 20 March 2015 at 16:43, Jeremy Laidman <jlaidman at rebel-it.com.au>
> wrote:
>
>> Vernon
>>
>> The power status page must refer to a different graph name in graphs.cfg
>> with a different FNPATTERN.
>>
>> Click on the graphs images for each version to get the 4-graph view and
>> compare the URLs.
>>
>> J
>>
>> On Fri, 20 Mar 2015 19:35 Vernon Everett <everett.vernon at gmail.com>
>> wrote:
>>
>>> Hi all
>>>
>>> I was only back at the client today, and unfortunately have not managed
>>> to get that patch in yet.
>>> (As I mentioned before, it's a production system)
>>>
>>> However, I did notice something really odd.
>>> I have focused my attention on the trends graphs, where I get all the
>>> extra values, but it's not happening in the test itself, despite the
>>> existence of the additional rrd files.
>>>
>>> Example.
>>> I have something that plots the power usage of the PSUs on a NetApp
>>> e-series.
>>> There are 4 PSUs, output looks like this.
>>>
>>> Total power drawn- 487 Watts
>>> Number of trays- 2
>>> Tray power input details-
>>>
>>> TRAY ID  POWER SUPPLY  SERIAL NUMBER  INPUT POWER
>>> 99       0                            145 Watts
>>> 99       1                            151 Watts
>>> 0        0                            99 Watts
>>> 0        1                            92 Watts
>>>
>>> All good. And I have a graph with 4 lines. Min, Max, Curr and Avg values
>>> are all there. It looks beautiful.
>>> But go look at the power graph in trends, and it's ugly.
>>> Heaps of additional data lines with no entries. All values are NaN.
>>> And mixed in amongst the additional empty graphs are the 4 valid lines.
>>>
>>> I look at the rrd files, and they are all there, even the bad ones.
>>> Here's a few of them.
>>> power,tcpListenDrop.rrd
>>> power,tcpOutAck.rrd
>>> power,tcpOutDataSegs.rrd
>>> power,tcpOutRsts.rrd
>>> power,tcpOutUrg.rrd
>>> power,tcpOutWinProbe.rrd
>>> power,tcpRetransSegs.rrd
>>> power,tcpRtoMax.rrd
>>> power,tcpRttUpdate.rrd
>>> power,tcpTimKeepaliveProbe.rrd
>>> power,tcpTimRetransDrop.rrd
>>> power,Tray0_PSU0.rrd <--- Valid
>>> power,Tray0_PSU1.rrd <--- Valid
>>> power,Tray99_PSU0.rrd <--- Valid
>>> power,Tray99_PSU1.rrd <--- Valid
>>> power,trlogpool.rrd
>>> power,UDP_udpInDatagrams.rrd
>>> power,udpInCksumErrs.rrd
>>> power,udpOutDatagrams.rrd
>>> power,vnet.rrd
>>>
>>> So I thought I would check my configs.
>>> In xymonserver.cfg
>>> From TEST2RRD= ,power=ncv,
>>> From GRAPHS= ,power::9,
>>> And further down
>>> SPLITNCV_power="*:GAUGE"
>>>
>>> And in graphs.cfg
>>> [power]
>>> FNPATTERN power,(.*).rrd
>>> TITLE Database Power Consumption Per Tray PSU
>>> YAXIS Watts
>>> -l 0
>>> DEF:p@RRDIDX@=@RRDFN@:lambda:AVERAGE
>>> LINE2:p@RRDIDX@#@COLOR@:@RRDPARAM@
>>> GPRINT:p@RRDIDX@:LAST: \: %5.1lf (cur)
>>> GPRINT:p@RRDIDX@:MAX: \: %5.1lf (max)
>>> GPRINT:p@RRDIDX@:MIN: \: %5.1lf (min)
>>> GPRINT:p@RRDIDX@:AVERAGE: \: %5.1lf (avg)\n
>>>
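
For what it's worth, that FNPATTERN is broad enough to pick up every file in the listing above, not just the four tray files. A minimal sketch, assuming FNPATTERN is applied as an unanchored regular expression to the file names in the host's RRD directory (my reading of graphs.cfg, not something confirmed in this thread). The file names are copied from the listing; the tighter pattern at the end is hypothetical and only meant as a possible stop-gap:

```python
import re


def captured_names(pattern, filenames):
    """Return the first capture group for every file name the pattern matches."""
    regex = re.compile(pattern)
    names = []
    for filename in filenames:
        match = regex.search(filename)
        if match:
            names.append(match.group(1))
    return names


# A subset of the RRD files from the listing in this thread.
rrd_files = [
    "power,tcpListenDrop.rrd",
    "power,Tray0_PSU0.rrd",
    "power,Tray0_PSU1.rrd",
    "power,Tray99_PSU0.rrd",
    "power,Tray99_PSU1.rrd",
    "power,udpInCksumErrs.rrd",
    "power,vnet.rrd",
]

# The pattern as it appears in the [power] section of graphs.cfg.
matched = captured_names(r"power,(.*).rrd", rrd_files)
print(matched)

# Hypothetical tighter pattern that only accepts the TrayN_PSUN files.
valid = captured_names(r"^power,(Tray\d+_PSU\d+)\.rrd$", rrd_files)
print(valid)
```

This does not explain why the status page draws only four lines from the same definition, or where the stray files come from; it only shows that the trends view is consistent with the pattern as written.
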
>>> With luck I will get approval to recompile with the debugging bug-fix,
>>> and we can get more info, but I thought it was interesting that the extra
>>> entries show up in trends but not in the test.
>>>
>>> Regards
>>> Vernon
>>>
>>> On 13 March 2015 at 15:24, J.C. Cleaver <cleaver at terabithia.org> wrote:
>>>
>>>> On Wed, March 11, 2015 5:51 pm, Jeremy Laidman wrote:
>>>> > On 11 March 2015 at 14:18, Vernon Everett <everett.vernon at gmail.com>
>>>> > wrote:
>>>> >
>>>> >> About now, I am getting a little nervous adding send and expect, because
>>>> >> unlike telnet and telnets, we are doing ldap and ldaps testing.
>>>> >>
>>>> >
>>>> > That's understandable. A read through the code suggests that, at least in
>>>> > some places, an empty string is equivalent to an undefined string, as the
>>>> > string length (shown in Sendlen in the debug output) is zero in both
>>>> > cases. So until a patch is in place, a workaround might be to define
>>>> > empty "send" and "expect" strings for those that have none.
>>>> >
>>>> >> Any suggestions?
>>>> >> I think we have some debug code update recommendations for JC though.
>>>> >> :-)
>>>> >>
>>>> >
>>>> > Here's my patch. I'll push this into the dev list for proposed inclusion
>>>> > in a future release.
>>>> >
>>>> > --- lib/netservices.c.orig 2012-07-25 01:48:41.000000000 +1000
>>>> > +++ lib/netservices.c 2015-03-12 11:18:18.000000000 +1100
>>>> > @@ -328,9 +328,9 @@
>>>> > dbgprintf("Service list dump\n");
>>>> > for (i=0; (svcinfo[i].svcname); i++) {
>>>> > dbgprintf(" Name : %s\n", svcinfo[i].svcname);
>>>> > - dbgprintf(" Sendtext: %s\n", binview(svcinfo[i].sendtxt, svcinfo[i].sendlen));
>>>> > + dbgprintf(" Sendtext: %s\n", svcinfo[i].sendtxt!=NULL?binview(svcinfo[i].sendtxt, svcinfo[i].sendlen):"[null]");
>>>> > dbgprintf(" Sendlen : %d\n", svcinfo[i].sendlen);
>>>> > - dbgprintf(" Exp.text: %s\n", binview(svcinfo[i].exptext, svcinfo[i].explen));
>>>> > + dbgprintf(" Exp.text: %s\n", svcinfo[i].exptext!=NULL?binview(svcinfo[i].exptext, svcinfo[i].explen):"[null]");
>>>> > dbgprintf(" Exp.len : %d\n", svcinfo[i].explen);
>>>> > dbgprintf(" Exp.ofs : %d\n", svcinfo[i].expofs);
>>>> > dbgprintf(" Flags : %d\n", svcinfo[i].flags);
>>>> >
>>>> > This produces "[null]" where we would have seen "(null)" on a GNU-based
>>>> > OS, to differentiate between the two situations.
>>>> >
>>>> > In the meantime, you could compile a special version of xymond_rrd, and
>>>> > run it manually on the same data channel as the real one, but have it
>>>> > write its RRD files and log file to a different location. This shouldn't
>>>> > interfere with your production Xymon. Here's one I prepared earlier that
>>>> > works for me:
>>>> >
>>>> > sudo -u xymon mkdir /tmp/my-rrd-data/
>>>> > sudo -u xymon xymoncmd /bin/sh -c 'XYMONTMP=/tmp; \
>>>> >   /usr/lib/xymon/server/bin/xymond_channel --channel=data \
>>>> >   --log=/tmp/my-rrd-data.log /path/to/xymond_rrd_debug_patch \
>>>> >   --rrddir=/tmp/my-rrd-data/ --debug'
>>>> >
>>>> > This seems to show some really useful stuff that's relevant to solving
>>>> > your problem. Some sample debug lines:
>>>> >
>>>> > 15306 2015-03-12 11:36:28 xymond_rrd_debug_patch: Got message 165619 @@data#165619/servername|1426120588.401891|172.16.0.1||servername|vmstat|sunos|ABC
>>>> > ...
>>>> > 15306 2015-03-12 11:36:28 Creating rrd /tmp/my-rrd-data//servername/vmstat.rrd
>>>> > 15306 2015-03-12 11:36:28 RRD create param 00: 'rrdcreate'
>>>> > 15306 2015-03-12 11:36:28 RRD create param 01: '/tmp/my-rrd-data//servername/vmstat.rrd'
>>>> > 15306 2015-03-12 11:36:28 RRD create param 02: '-s'
>>>> > 15306 2015-03-12 11:36:28 RRD create param 03: '300'
>>>> > 15306 2015-03-12 11:36:28 RRD create param 04: 'DS:cpu_r:GAUGE:600:0:U'
>>>> > 15306 2015-03-12 11:36:28 RRD create param 05: 'DS:cpu_b:GAUGE:600:0:U'
>>>> > 15306 2015-03-12 11:36:28 RRD create param 06: 'DS:cpu_w:GAUGE:600:0:U'
>>>> > ...
>>>> > 15306 2015-03-12 11:39:42 Got 265 bytes
>>>> > 15306 2015-03-12 11:39:42 xymond_rrd_debug_patch: Got message 165737 @@data#165737/servername|1426120782.080244|172.16.0.2||servername|trends||DEF
>>>> > 15306 2015-03-12 11:39:42 startpos 216644, fillpos 216644, endpos -1
>>>> > 15306 2015-03-12 11:39:42 Flushing '/servername/tcp.xopiy90404.parameter.rrd' with 1 updates pending, template 'sec'
>>>> > 15306 2015-03-12 11:39:42 Want msg 165738, startpos 216644, fillpos 216644, endpos -1, usedbytes=0, bufleft=1884603
>>>> >
>>>> > J
>>>> >
>>>>
>>>>
>>>> This is some excellent sleuthing! :)
>>>>
>>>> As I was poring through the thread (sorry, I've been out the last few
>>>> days), I failed to take note of the SPARC-Enterprise-T2000 in the output.
>>>>
>>>>
>>>> The patch below should fix the immediate issue triggered by debug mode,
>>>> letting us move on to the larger oddness. Unfortunately, I have a feeling
>>>> there are other occasions where we're relying on GNU's printf printing
>>>> "(null)" for a NULL string, and which might thus be caught by this. As I
>>>> find them, I go ahead and work to put fixes in.
>>>>
>>>> In the meantime, this will be in 4.3.19 and can be patched directly from
>>>> below.
>>>>
>>>>
>>>> HTH,
>>>>
>>>> -jc
>>>>
>>>>
>>>> --- lib/netservices.c (revision 7598)
>>>> +++ lib/netservices.c (working copy)
>>>> @@ -81,9 +81,9 @@
>>>> unsigned char *inp, *outp;
>>>> int i;
>>>>
>>>> - if (!buf) return NULL;
>>>> + if (result) xfree(result);
>>>> + if (!buf) { result = strdup("[null]"); return result; }
>>>>
>>>> - if (result) xfree(result);
>>>> if (buf && (buflen == 0)) buflen = strlen(buf);
>>>> result = (char *)malloc(4*buflen + 1); /* Worst case: All binary */
>>>>
>>>
>>> --
>>> "Accept the challenges so that you can feel the exhilaration of victory"
>>> - General George Patton
>>>
>>
>