[hobbit] strange graph behavior - random machines & graphs

Gary Baluha gumby3203 at gmail.com
Fri Nov 30 21:07:23 CET 2007


On Nov 30, 2007 2:14 PM, Gary Baluha <gumby3203 at gmail.com> wrote:

> Hmm, now this is interesting.  I have the Hobbit server (Hobbit A, from a
> previous post) monitoring my work laptop (mostly so I can test out
> client-side external scripts).  I have been taking my laptop home with me
> this week, and I noticed that the time period while I'm *at* work, the
> graphs are plotting valid data.  However, during the time that I turn my
> laptop off and bring it home, to the time that I bring my laptop in the next
> day and power it on, the graphs are showing the same invalid bogus data that
> the other bad graphs are showing.
>
> In other words, the rrd graphs are getting bogus data for a machine that
> isn't even reporting to the Hobbit server!  Interesting, isn't it?
>

I'm definitely on to something with this.  I intentionally stopped the
Hobbit client process on one of the machines that has the bad RRD graphs for
about 20 minutes, and then started it back up.  Once the client reported the
latest data back, the RRD graph had another spike in it!

The other interesting thing is that the hobbitd-rrd --debug logging
(rrd-status.log) does *not* show any abnormal data.  It appears that Hobbit
is logging valid data to "rrdupdate", so the bogus data appears to be
introduced downstream of this.

So it seems these data spikes *do* correspond to something: they correspond
to a lack of data reported back from the clients.  Furthermore, when I do an
rrd dump, I can see the bogus data in the "secondary_value" field:

-----Start of RRD dump-----
<!-- Round Robin Archives -->   <rra>
                <cf> AVERAGE </cf>
                <pdp_per_row> 1 </pdp_per_row> <!-- 300 seconds -->

                <params>
                <xff> 5.0000000000e-01 </xff>
                </params>
                <cdp_prep>
                        <ds>
                        <primary_value> 2.6110000000e+01 </primary_value>
                        <secondary_value> 5.1776682516e+170</secondary_value>
                        <value> 5.1776682516e+170 </value>
                        <unknown_datapoints> 0 </unknown_datapoints>
                        </ds>
                </cdp_prep>
                <database>
                        <!-- 2007-11-28 15:05:00 EST / 1196280300 -->
<row><v> 5.1776682516e+170 </v></row>
-----SNIP-----
-----End of RRD dump-----

The number 5.1776682516e+170 corresponds to the "517768..." large number
that the GPRINT portion of the rrd graphs is displaying.
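
For anyone who wants to check their own RRD files for the same thing, here is
a rough sketch in Python that scans "rrdtool dump" XML output and lists any
stored values that are implausibly large.  The function name, the sample
text, and the 1e15 cutoff are my own assumptions for illustration; nothing
here is part of Hobbit or rrdtool, and the cutoff should be adjusted to
whatever is sane for the data source being checked.

```python
import xml.etree.ElementTree as ET

# Tags that hold numeric values in the dump excerpt shown above.
VALUE_TAGS = ("primary_value", "secondary_value", "value", "v")

# Trimmed-down, well-formed version of the dump excerpt (illustrative only).
SAMPLE = """<rrd><rra>
<cdp_prep><ds>
<primary_value> 2.6110000000e+01 </primary_value>
<secondary_value> 5.1776682516e+170</secondary_value>
<value> 5.1776682516e+170 </value>
</ds></cdp_prep>
<database>
<row><v> 5.1776682516e+170 </v></row>
</database>
</rra></rrd>"""


def find_suspect_values(xml_text, limit=1e15):
    """Return (tag, number) pairs whose magnitude exceeds `limit`.

    `limit` is an assumed sanity bound, not anything rrdtool defines.
    """
    root = ET.fromstring(xml_text)
    suspects = []
    for elem in root.iter():
        if elem.tag not in VALUE_TAGS or elem.text is None:
            continue
        try:
            number = float(elem.text.strip())
        except ValueError:
            continue  # not a number, so nothing to check
        # NaN (unknown) entries never compare greater, so they are skipped.
        if abs(number) > limit:
            suspects.append((elem.tag, number))
    return suspects
```

Calling find_suspect_values(SAMPLE) flags the secondary_value, value, and v
entries and leaves the valid primary_value (26.11) alone.  To check a real
file, pass in the full text produced by "rrdtool dump" for that file instead
of SAMPLE.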

Does anyone have any ideas about what else I should turn logging up on?