Adding a custom RRD to graphs and monitoring
Matthew Moldvan
mmoldvan at csc.com
Thu Aug 19 17:26:08 CEST 2010
All,
This is stressing me out, so hopefully someone will take the time to go
through my ramblings below and help me out. There is a lot of information,
so please bear with me.
For the past few days I've been trying to add a custom script (iostat
information) and have the data graphed, but I'm not having any luck
(mostly due to not understanding the RRD definitions in hobbitgraph.cfg).
I've read through a ton of the how-tos on the subject, but all of them
seem to vary a bit on the details. My resulting graphs look like this:
http://imgur.com/4Nwrp.jpg
So far I've got a script running on two systems reporting data back to the
main page. This brings up my first question: When sending information to
be graphed, is the data passed in as a bb status message or a bb data
message?
I thought I had it working at one point by sending data like the sample
below through a status message, but I'd like to pass only the status text
and HTML through the "bb status" command and keep the actual data for the
RRD in the "bb data" command, if that works. I also tried wrapping the
data in HTML comments, as shown here, but had no luck.
"<!---
data like below (note newlines between HTML comment tags)
--->"
Sample data:
c1t50060E80104AAE50d1 : 0.82
c1t50060E80104AAE50d2 : 0.07
c1t50060E80104AAE50d3 : 1.71
c1t50060E80104AAE50d4 : 0.46
c3t50060E80104AAE52d0 : 1.31
c3t50060E80104AAE52d1 : 1.53
c3t50060E80104AAE52d2 : 0.09
c3t50060E80104AAE52d3 : 3.14
c3t50060E80104AAE52d4 : 0.61
c3t50060E80104AAE52d12 : 0.06
c0t0d0 : 11.70
c1t50060E80104AAE50d0 : 0.87
I've seen it both ways in the examples. I tried sending both, but that
doesn't seem to be working. From what I understand, if I specify a test as
NCV in the TEST2RRD section, one of the running processes (hobbitd or
hobbitrrd) will read in the "name : value" pairs and pass them to an RRD
update/create command? Does that require integer values, or are floating
point values up to a certain precision acceptable? Currently I'm passing
.2f from the nawk script and getting a bunch of NaNs in the RRD output
(could be for various reasons, though).
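To make the "name : value" idea concrete, here is a minimal Python sketch of that kind of parsing. It is only an illustration, not Xymon's actual NCV code: the function name parse_ncv is hypothetical, and it assumes one pair per line, a colon as the separator, and that any value Python can read as a float is acceptable.

```python
def parse_ncv(body):
    """Split 'name : value' lines into (name, float) pairs.

    Lines without a colon, without a name, or with a non-numeric
    value are skipped. This is an assumed behaviour for illustration.
    """
    pairs = []
    for line in body.splitlines():
        name, sep, value = line.partition(":")
        name = name.strip()
        if not sep or not name:
            continue  # not a name/value line
        try:
            pairs.append((name, float(value.strip())))
        except ValueError:
            continue  # e.g. a status header line containing a time stamp
    return pairs


sample = "c1t50060E80104AAE50d1 : 0.82\nc0t0d0 : 11.70\n"
print(parse_ncv(sample))
```

In this sketch, two-decimal values such as 11.70 parse without trouble, so the formatting alone would not produce NaNs here; whether the real NCV module behaves the same way is the open question.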
Here go the details (NOTE: All host names and IP addresses have been
scrubbed to protect the innocent):
Script output:
+ /opt/xymon/client/bin/bb <xymon.server.ip> 'data <client.fqdn>.trends
c1t50060E80104AAE50d1 : 0.82
c1t50060E80104AAE50d2 : 0.07
c1t50060E80104AAE50d3 : 1.71
c1t50060E80104AAE50d4 : 0.46
c3t50060E80104AAE52d0 : 1.31
c3t50060E80104AAE52d1 : 1.53
c3t50060E80104AAE52d2 : 0.09
c3t50060E80104AAE52d3 : 3.14
c3t50060E80104AAE52d4 : 0.61
c3t50060E80104AAE52d12 : 0.06
c0t0d0 : 11.70
c1t50060E80104AAE50d0 : 0.87
'
+ /opt/xymon/client/bin/bb <xymon.server.ip> 'status <client.fqdn>.iostat
green Thu Aug 19 10:47:28 EDT 2010
c1t50060E80104AAE50d1 : 0.82
c1t50060E80104AAE50d2 : 0.07
c1t50060E80104AAE50d3 : 1.71
c1t50060E80104AAE50d4 : 0.46
c3t50060E80104AAE52d0 : 1.31
c3t50060E80104AAE52d1 : 1.53
c3t50060E80104AAE52d2 : 0.09
c3t50060E80104AAE52d3 : 3.14
c3t50060E80104AAE52d4 : 0.61
c3t50060E80104AAE52d12 : 0.06
c0t0d0 : 11.70
c1t50060E80104AAE50d0 : 0.87
'
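For comparison, a minimal Python sketch of building the same kind of data message and handing it to the bb client. The helper names build_data_message and send are hypothetical, the host name is a placeholder, and the "trends" name simply mirrors the script output above; it is not a statement about which name Xymon expects.

```python
import subprocess


def build_data_message(host, name, pairs):
    """Return a 'data <host>.<name>' message with one 'disk : value' line per pair."""
    lines = ["data {}.{}".format(host, name)]
    lines += ["{} : {:.2f}".format(disk, value) for disk, value in pairs]
    return "\n".join(lines) + "\n"


def send(bb_path, server, message):
    """Invoke the bb client as 'bb <server> <message>', as in the trace above."""
    subprocess.run([bb_path, server, message], check=True)


# Build and print a message; nothing is sent here.
msg = build_data_message("client.example.com", "trends",
                         [("c0t0d0", 11.70), ("c1t50060E80104AAE50d0", 0.87)])
print(msg)
```

Calling send("/opt/xymon/client/bin/bb", "<xymon.server.ip>", msg) would then reproduce the first command in the trace.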
Another question: I've seen some examples sending as "bb data
<client.fqdn>.trends". Is that correct, or if I'm using the "bb data"
command do I have to specify the test name as above?
RRD files are being created for every disk, as follows:
-rw-r--r-- 1 xymon 495 19648 Aug 19 11:06 iostat,c0t0d0.rrd
-rw-r--r-- 1 xymon 495 19648 Aug 18 23:22 iostat,c0t1d0.rrd
-rw-r--r-- 1 xymon 495 19648 Aug 19 11:06 iostat,c1t50060E80104AAE50d0.rrd
...snip...
-rw-r--r-- 1 xymon 495 19648 Aug 18 23:22 iostat,c3t50060E80104AAE52d8.rrd
-rw-r--r-- 1 xymon 495 19648 Aug 18 23:22 iostat,c3t50060E80104AAE52d9.rrd
An rrdtool dump <whatever>.rrd does confirm that some values are making it
into the RRDs (assuming so by "last_ds" in dump output below):
[root@<hostname> <fqdn.rrd.dir>]# rrdtool dump iostat,c3t50060E80104AAE52d9.rrd | more
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE rrd SYSTEM "http://oss.oetiker.ch/rrdtool/rrdtool.dtd">
<!-- Round Robin Database Dump -->
<rrd>
<version>0003</version>
<step>300</step> <!-- Seconds -->
<lastupdate>1282188121</lastupdate> <!-- 2010-08-18 23:22:01 EDT -->
<ds>
<name> lambda </name>
<type> GAUGE </type>
<minimal_heartbeat>0</minimal_heartbeat>
<min>6.0000000000e+02</min>
<max>NaN</max>
<!-- PDP Status -->
<last_ds>2.05</last_ds>
<value>NaN</value>
<unknown_sec> 121 </unknown_sec>
</ds>
<!-- Round Robin Archives -->
<rra>
<cf>AVERAGE</cf>
<pdp_per_row>1</pdp_per_row> <!-- 300 seconds -->
<params>
<xff>5.0000000000e-01</xff>
</params>
<cdp_prep>
<ds>
<primary_value>NaN</primary_value>
<secondary_value>0.0000000000e+00</secondary_value>
<value>NaN</value>
<unknown_datapoints>0</unknown_datapoints>
</ds>
</cdp_prep>
<database>
<!-- 2010-08-16 23:25:00 EDT / 1282015500 -->
<row><v>NaN</v></row>
...snip, all others are NaN also...
<!-- 2010-08-18 23:20:00 EDT / 1282188000 -->
<row><v>NaN</v></row>
</database>
</rra>
<rra>
<cf>AVERAGE</cf>
<pdp_per_row>6</pdp_per_row> <!-- 1800 seconds -->
<params>
<xff>5.0000000000e-01</xff>
</params>
<cdp_prep>
<ds>
<primary_value>0.0000000000e+00</primary_value>
<secondary_value>0.0000000000e+00</secondary_value>
<value>NaN</value>
<unknown_datapoints>4</unknown_datapoints>
</ds>
</cdp_prep>
<database>
<!-- 2010-08-06 23:30:00 EDT / 1281151800 -->
<row><v>NaN</v></row>
...snip, all NaNs til the end...
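One thing that may be worth checking in a dump like this: minimal_heartbeat is 0 and min is 600, while last_ds is 2.05. RRDtool records a value as unknown when it falls outside the data source's min/max range, so settings like these are one possible (not confirmed) source of the NaNs. Below is a minimal Python sketch that flags such cases in "rrdtool dump" output. The function names are hypothetical, and the embedded XML is a trimmed copy of the dump above.

```python
import math
import xml.etree.ElementTree as ET


def to_float(text):
    """Parse a dump field as float; missing or non-numeric fields become NaN."""
    try:
        return float(text.strip())
    except (AttributeError, ValueError):
        return math.nan


def check_ds(dump_xml):
    """Return warnings for top-level <ds> entries with suspicious settings."""
    warnings = []
    root = ET.fromstring(dump_xml)
    for ds in root.findall("ds"):  # direct children only, not the cdp_prep entries
        name = ds.findtext("name", "").strip()
        heartbeat = to_float(ds.findtext("minimal_heartbeat"))
        lo = to_float(ds.findtext("min"))
        hi = to_float(ds.findtext("max"))
        last = to_float(ds.findtext("last_ds"))
        if heartbeat == 0:
            warnings.append(f"{name}: heartbeat is 0")
        # Comparisons against NaN are False, so an unset bound never triggers.
        if last < lo or last > hi:
            warnings.append(f"{name}: last_ds {last} outside [{lo}, {hi}]")
    return warnings


SAMPLE = """<rrd><ds><name> lambda </name><type> GAUGE </type>
<minimal_heartbeat>0</minimal_heartbeat><min>6.0000000000e+02</min>
<max>NaN</max><last_ds>2.05</last_ds></ds></rrd>"""

for warning in check_ds(SAMPLE):
    print(warning)
```

If the warnings fire, comparing the DS definition against the arguments that were passed to rrdtool create would be a reasonable next step.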
Relevant lines from /etc/xymon/hobbitserver.cfg:
[root@<hostname> ~]# egrep 'TEST2RRD|GRAPHS' /etc/xymon/hobbitserver.cfg
# TEST2RRD defines the status- and data-messages you want to collect RRD data
TEST2RRD="cpu=la,disk,inode,qtree,memory,$PINGCOLUMN=tcp,http=tcp,dns=tcp,dig=tcp,time=ntpstat,vmstat,vmio=ncv,iostat=ncv,netstat,temperature,apache,bind,sendmail,mailq,nmailq=mailq,socks,bea,iishealth,citrix,bbgen,bbtest,bbproxy,hobbitd,files,procs=processes,ports,clock,lines,ops,stats,cifs,JVM,JMS,HitCache,Session,JDBCConn,ExecQueue,JTA,TblSpace,RollBack,MemReq,InvObj,snapmirr,snaplist,snapshot"
GRAPHS="la,disk,inode,qtree,files,processes,memory,users,vmstat:vmstat0|vmstat1|vmstat2|vmstat3|vmstat4|vmstat5|vmstat6|vmstat7|vmstat8|vmstat9,iostat,vmio,tcp.http,tcp,netstat,ifstat,mrtg::1,ports,temperature,ntpstat,apache,bind,sendmail,mailq,socks,bea,iishealth,citrix,bbgen,bbtest,bbproxy,hobbitd,clock,lines,ops,stats,cifs,JVM,JMS,HitCache,Session,JDBCConn,ExecQueue,JTA,TblSpace,RollBack,MemReq,InvObj,snapmirr,snaplist,snapshot,devmon::1,if_load::1,temp,ncv"
- (a tip from the web said "ncv" had to be in the GRAPHS portion and said
"not sure why just trust me" ...)
Relevant lines from /etc/xymon/hobbitgraph.cfg:
[iostat]
TITLE I/O Utilization - Overall
FNPATTERN iostat(.*).rrd
YAXIS Stats
DEF:p@RRDIDX@=@RRDFN@:lambda:AVERAGE
LINE1.5:p@RRDIDX@#@COLOR@:@RRDPARAM@
GPRINT:p@RRDIDX@:AVERAGE: \: %5.1lf (avg)\n
Does anyone know of a link that explains some of the terminology above? I
checked the rrdcreate man page, but didn't see the parts about "@RRDIDX@"
and "@RRDFN@" and the other stuff. p@RRDIDX@ seems to be in a lot of
examples I've seen, and all my data is making it in with those variables
(is that what they are?) without having multiple DEF statements.
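As an illustration of how such placeholders can stand in for multiple DEF statements, here is a minimal Python sketch that repeats a template once per RRD file matching FNPATTERN. It is not Xymon's implementation: the expand function, the colour palette, the index starting at 0, and the handling of the captured text (kept exactly as matched, leading comma included) are all assumptions made for the example.

```python
import re

# Two of the template lines from the [iostat] definition above.
TEMPLATE = [
    "DEF:p@RRDIDX@=@RRDFN@:lambda:AVERAGE",
    "LINE1.5:p@RRDIDX@#@COLOR@:@RRDPARAM@",
]
COLORS = ["0000FF", "FF0000", "00CC00"]  # hypothetical palette


def expand(files, pattern=r"iostat(.*).rrd"):
    """Emit one copy of TEMPLATE per file name matching the pattern."""
    out = []
    idx = 0
    for fn in files:
        match = re.search(pattern, fn)
        if not match:
            continue  # file does not belong to this graph
        param = match.group(1)             # text captured by the (.*) group
        color = COLORS[idx % len(COLORS)]  # cycle through the palette
        for line in TEMPLATE:
            out.append(line.replace("@RRDIDX@", str(idx))
                           .replace("@RRDFN@", fn)
                           .replace("@COLOR@", color)
                           .replace("@RRDPARAM@", param))
        idx += 1
    return out


for line in expand(["iostat,c0t0d0.rrd", "iostat,c0t1d0.rrd", "vmstat.rrd"]):
    print(line)
```

Read this way, @RRDIDX@ keeps the variable names unique per file, @RRDFN@ is the file being read, and @RRDPARAM@ supplies the legend text taken from the file name.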
The above is generating the image I linked to earlier
(http://imgur.com/4Nwrp.jpg).
Thanks again to anyone that can help out ... I've been pulling my hair out
about this for a few days.
Regards,
Matt.
Unix System Administrator
Computer Science Corporation