[hobbit] Graphs are missing data, but it's there!

Hubbard, Greg L greg.hubbard at eds.com
Fri Jul 25 16:28:19 CEST 2008


I have not been able to get the Hobbit graph thing to work with negative
numbers.  If you are using the "manual" script method for parsing the
return, you should be able to save the output from the failing server in
a file, then run the processing script by hand to see what it spits out
(should be commands for the Hobbit RRD support to obey).  I have spent
many an hour debugging my own stuff this way...
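
A rough Python sketch of that kind of by-hand check: strip the HTML from a saved status message and list the first number each "Name: value" line carries, flagging negatives. This is not Hobbit's own NCV parser; `inspect_status` and `SAMPLE` are names made up for illustration, and the sample lines are copied from the non-working host's log quoted below.

```python
import re

TAG_RE = re.compile(r"<[^>]+>")            # HTML tags such as <b>, <pre>, <br>
NUM_RE = re.compile(r"-?\d+(?:\.\d+)?")    # optional minus sign, digits, optional fraction

SAMPLE = """red Friday July 25 12:45:04 UTC 2008
<pre>
ActiveStatus: &green
ActiveQueue: 39
ActiveTrend: tendency <b>falling</b> with <b>973</b> mails.
IncomingQueue: 206927
IncomingTrend: tendency <b>rising</b> with <b>-206926</b> mails.
</pre>
"""

def inspect_status(text):
    """Return (name, first_number_or_None, is_negative) per 'Name: value' line."""
    rows = []
    for line in text.splitlines():
        line = TAG_RE.sub("", line).strip()
        if ":" not in line:
            continue
        name, _, rest = line.partition(":")
        name = name.strip()
        # Skip lines whose "name" is not a single word (e.g. the timestamp line).
        if not name or " " in name:
            continue
        match = NUM_RE.search(rest)
        value = float(match.group()) if match else None
        rows.append((name, value, value is not None and value < 0))
    return rows

for name, value, negative in inspect_status(SAMPLE):
    print(name, value, "NEGATIVE" if negative else "")
```

Any line flagged NEGATIVE is a candidate for the negative-number problem described above.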
 
GLH


________________________________

	From: Ward, Martin [mailto:Martin.Ward at colt.net] 
	Sent: Friday, July 25, 2008 8:55 AM
	To: hobbit at hswn.dk
	Subject: RE: [hobbit] Graphs are missing data, but it's there!
	
	
	OK, with everyone's help I have made progress. After trying all
the different suggestions it came down to: Why can't I get an "rrdtool
dump" output? The reason was that sometime in the past someone (probably
me) managed to replace the rrdtool binary with an empty file (stop
sniggering at the back please).
	 
	Having done this before I know how it happens... you type your
command line:
	/opt/rrdtool/bin/rrdtool dump postfixqueue.rrd
	 
	but when using bash command line editing you manage to put a >
at the start, making the command line:
	>/opt/rrdtool/bin/rrdtool dump postfixqueue.rrd
	 
	The file still keeps its execute permission, but executing an
empty file returns nothing...
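
A small sanity check for this failure mode, sketched in Python: a file can keep its execute bit and still be zero bytes long, so test both. `looks_runnable` is a hypothetical helper name; the rrdtool path is the one quoted above.

```python
import os

def looks_runnable(path):
    """True if path is a regular, non-empty file with execute permission."""
    return (
        os.path.isfile(path)
        and os.path.getsize(path) > 0
        and os.access(path, os.X_OK)
    )

if __name__ == "__main__":
    # Path taken from the command line quoted above; adjust for your install.
    print(looks_runnable("/opt/rrdtool/bin/rrdtool"))
```

A truncated binary fails the size test even though its permissions look fine.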
	 
	So, having got a real, working copy of the rrdtool program and
running it on the dodgy data file I can see that data is indeed being
stored there, and a vast number of lines look like this:
	<!-- 2008-07-25 12:45:00 UTC / 1216989900 --> <row><v> NaN </v><v> NaN </v><v> NaN </v><v> NaN </v><v> NaN </v></row>
	 
	a copy from one of the working ones shows:
	
	<!-- 2008-07-25 13:00:00 UTC / 1216990800 --> <row><v> 5.5427200000e+03 </v><v> 2.1861333333e+02 </v><v> 1.4601324333e+05 </v><v> 0.0000000000e+00 </v><v> 4.0939317667e+05 </v></row>
	 
	So it seems to be a problem with translating the output from the
client program into data that RRD can understand.
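
To get a quick picture of how much of a file is empty, the rows in "rrdtool dump" output can be counted. The Python sketch below assumes the `<row><v> ... </v></row>` layout shown above and nothing more; `summarize_rows` and `SAMPLE_DUMP` are illustrative names, and the two sample rows are the ones quoted in this message.

```python
import re

ROW_RE = re.compile(r"<row>(.*?)</row>", re.S)
VAL_RE = re.compile(r"<v>\s*([^<\s]+)\s*</v>")

SAMPLE_DUMP = """
<!-- 2008-07-25 12:45:00 UTC / 1216989900 --> <row><v> NaN </v><v> NaN </v><v> NaN </v><v> NaN </v><v> NaN </v></row>
<!-- 2008-07-25 13:00:00 UTC / 1216990800 --> <row><v> 5.5427200000e+03 </v><v> 2.1861333333e+02 </v><v> 1.4601324333e+05 </v><v> 0.0000000000e+00 </v><v> 4.0939317667e+05 </v></row>
"""

def summarize_rows(dump_text):
    """Return (rows that are all NaN, rows holding at least one real value)."""
    all_nan = with_data = 0
    for row in ROW_RE.findall(dump_text):
        values = VAL_RE.findall(row)
        if not values:
            continue
        if all(v.lower() == "nan" for v in values):
            all_nan += 1
        else:
            with_data += 1
    return all_nan, with_data

print(summarize_rows(SAMPLE_DUMP))
```

Feeding it the full dump of a suspect file shows at a glance whether any real values were ever stored.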
	 
	Now, here are the contents of the hostlogs file for the working
server; this should tie up with the data entry above:
	----
	red Friday July 25 12:59:31 UTC 2008
	 
	<br><br>
	 
	<pre>
	 
	ActiveStatus: &red
	ActiveQueue: 5494
	ActiveTrend: tendency <b>rising</b> with <b>-81</b> mails.
	BounceStatus: &green
	BounceQueue: 219
	BounceTrend: tendency <b>rising</b> <b>-2</b> mails.
	DeferStatus: &green
	DeferQueue: 145971
	DeferTrend: amount equal to last measure.
	CorruptStatus: &green
	CorruptQueue: 0
	CorruptTrend: amount equal to last measure.
	IncomingStatus: &red
	IncomingQueue: 409494
	IncomingTrend: tendency <b>falling</b> with <b>858</b> mails.
	 
	 
	 
	</pre>
	Status unchanged in 0.00 minutes
	Message received from 10.44.107.107
	Client data ID 1216990657
	----
	and here are the contents of the non-working one:
	----
	red Friday July 25 12:45:04 UTC 2008
	 
	<br><br>
	 
	<pre>
	 
	ActiveStatus: &green
	ActiveQueue: 39
	ActiveTrend: tendency <b>falling</b> with <b>973</b> mails.
	BounceStatus: &green
	BounceQueue: 58
	BounceTrend: amount equal to last measure.
	DeferStatus: &red
	DeferQueue: 154348
	DeferTrend: tendency <b>falling</b> with <b>865</b> mails.
	CorruptStatus: &green
	CorruptQueue: 0
	CorruptTrend: amount equal to last measure.
	IncomingStatus: &red
	IncomingQueue: 206927
	IncomingTrend: tendency <b>rising</b> with <b>-206926</b> mails.
	 
	 Deferred Queue is too high but is decreasing already.<br>
	 
	</pre>
	Status unchanged in 0.00 minutes
	Message received from 10.44.107.105
	Client data ID 1216989837
	----
	 
	As mentioned previously all these servers use the same scripts
to send the data to the server and the same scripts to process it once
it arrives, indeed as you can see above the two different entries look
identical in format. I checked the scripts on the remote servers to see
if there were any differences between them and found a few minor
differences but nothing huge. Still, just to be sure I copied the
postfixqueue.sh script from a working server to the broken one and
waited for it to run. Alas, although the script transmits sensible data
back to the Hobbit server:
	----
	
	
	ActiveStatus: &green
	ActiveQueue: 448
	ActiveTrend: tendency falling with 9 mails.
	BounceStatus: &green
	BounceQueue: 59
	BounceTrend: tendency rising -1 mails.
	DeferStatus: &green
	DeferQueue: 149697
	DeferTrend: amount equal to last measure.
	CorruptStatus: &green
	CorruptQueue: 0
	CorruptTrend: amount equal to last measure.
	IncomingStatus: &red
	IncomingQueue: 213848
	IncomingTrend: amount equal to last measure.
	----
	The rrd file STILL contains:
	<!-- 2008-07-25 13:45:00 UTC / 1216993500 --> <row><v> NaN </v><v> NaN </v><v> NaN </v><v> NaN </v><v> NaN </v></row>
	 
	Any RRD experts got any ideas?
	 
	|\/|artin
	 

		-----Original Message-----
		From: Phil Wild [mailto:philwild at gmail.com] 
		Sent: 24 July 2008 17:42
		To: hobbit at hswn.dk
		Subject: Re: [hobbit] Graphs are missing data, but it's
there!
		
		
		The rrd version should be okay, after all it is graphing
data from other hosts with no problem.
		
		It would appear that your NCV and graph configurations
are correct, as you say they are working for other hosts. This would
indicate a problem with this host's configuration, so where to
look...
		
		Just out of interest, can you take the rrd file for this test
from a host that works, and copy it into the .../data/rrd/hostname
directory of the host that does not?
		
		I would expect after doing this that you will have a
graph for this host. Can you confirm this works? After doing this and
leaving it for 10 minutes, do you see any new data in the graph?
		
		Can you dump the data from this rrd file?
		
		
		2008/7/25 Ward, Martin <Martin.Ward at colt.net>:
		

			> Are you saying that you run the same tests on
multiple hosts and only one host is not showing data?
			Yes.
			 
			> Does this mean they all share the same NCV
configuration in hobbitserver.cfg and the same graph definition in
hobbitgraph.cfg?
			Yes.
			 
			> What if you remove the rrd file and let hobbit
create a new one, does that help? 
			I did this and as you'd expect initially the web
page showed no graph although it did show data (stored from the previous
run I presume).
			 
			After an interval the file appeared again but
running "rrdtool dump" on it STILL failed to produce any data.
			 
			I'm starting to wonder about the versions of
RRD, but they ought to be data-compatible; I'm using rrdtool v1.2.15.
			 
			The histlogs show no errors, the hist/mc25,...
data file contains valid data. I DO get a few RRD errors like this:
			 
			rrd-status.log:2008-07-21 09:46:19 RRD error updating /opt/hobbit/data/rrd/mc25.lon.server.colt.net/tcp.smtp.rrd from 10.44.107.48: illegal attempt to update using time 1216633579 when last update time is 1216633579 (minimum one second step)
			
			which makes it look like Hobbit is actually
updating the RRD file... I just can't get any data out!
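
The rule behind that error message can be written down in a couple of lines. This Python sketch only mirrors what the message itself says (each update must be at least one second later than the last one); `accepts_update` is a made-up name, not an rrdtool function.

```python
def accepts_update(last_update, new_time):
    """Mirror the rule in the error text: the new timestamp must be strictly later."""
    return new_time > last_update

# The timestamps from the log line above: same second twice, so the update is refused.
print(accepts_update(1216633579, 1216633579))
```

So this error means two updates arrived within the same second, which suggests the file is being written to rather than ignored.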
			 
			|\/|
			 

				
				-----Original Message-----
				From: Phil Wild
[mailto:philwild at gmail.com] 
				Sent: 24 July 2008 16:31
				To: hobbit at hswn.dk
				
				Subject: Re: [hobbit] Graphs are missing
data, but it's there!
				
				
				Are you saying that you run the same
tests on multiple hosts and only one host is not showing data? Does this
mean they all share the same NCV configuration in hobbitserver.cfg and
the same graph definition in hobbitgraph.cfg?
				
				If this is correct, then it really
points to something not getting into the rrd file. As previously
suggested, rrd dump is your best bet at finding the problem here. What
if you remove the rrd file and let hobbit create a new one, does that
help? 
				
				Cheers
				
				Phil
				
				
				2008/7/24 Hubbard, Greg L
<greg.hubbard at eds.com>:
				

				You know the data exists because you
used the rrd dump tool to display it?
				 
				Is the graph simply not shown at all, or
is there a "hole" in the Web page where it normally would go?  ("show
page source" might have a clue).
				 
				Some ideas/shots in the dark:
				 
				a) check the logs
				 
				b) meticulously compare a "working"
system to the non-working system, and make sure that they really are
identical.
				 
				c) look at the trends page for this host
to see if the graph is okay there...
				 
				Etc.  I am sure you know the drill -- a
big pain to look under every rock, but it has to be done...
				 
				GLH


________________________________

				
				From: Ward, Martin
[mailto:Martin.Ward at colt.net] 
				
				Sent: Thursday, July 24, 2008 8:21 AM 

				To: hobbit at hswn.dk
				Subject: RE: [hobbit] Graphs are missing
data, but it's there!
				

				Thanks for the suggestion but that
didn't work (I guess you meant rrd).
				 
				Any other ideas?
				 
				|\/|

				-----Original Message-----
				From: Roberts, James
[mailto:James.Roberts at hants.gov.uk] 
				Sent: 24 July 2008 12:47
				To: hobbit at hswn.dk
				Subject: RE: [hobbit] Graphs are missing
data, but it's there!
				
				
				you need to touch all the rdd.
				
				
________________________________

				From: Ward, Martin
[mailto:Martin.Ward at colt.net] 
				Sent: 24 July 2008 12:43
				To: hobbit at hswn.dk
				Subject: [hobbit] Graphs are missing
data, but it's there!
				
				

				All, 

				I have a problem with one machine where
its data is not being shown in the graphs even though the data exists. 

				The machine in question's Hobbit client
sends five pieces of numeric data (email queues) and these are displayed
on the web page for this service:

				==== 
				Thursday July 24 11:29:11 UTC 2008 
				
				ActiveStatus: green
				ActiveQueue: 106
				ActiveTrend: tendency rising with -60 mails.
				BounceStatus: green
				BounceQueue: 58
				BounceTrend: tendency falling with 3 mails.
				DeferStatus: red
				DeferQueue: 150464
				DeferTrend: tendency falling with 95 mails.
				CorruptStatus: green
				CorruptQueue: 0
				CorruptTrend: amount equal to last measure.
				IncomingStatus: red
				IncomingQueue: 247049
				IncomingTrend: amount equal to last measure.
				
				 Deferred Queue is too high but is decreasing already.

				==== 
				These numbers change over time and the
values are accurate. 

				However, the graph that is displayed
below this data is blank. I have historic data, the files exist, and
what is more I have other machines that are configured identically to
this one where the data IS graphed correctly.

				Hobbit graphs are a bit of a black hole
to me, can anyone suggest where I might look? 

				|\/|artin 

				
				
	
				




				-- 
				Tel: 0400 466 952
				Fax: 0433 123 226
				email: philwild AT gmail.com 


			
			




