Re: [hobbit] strange graph behavior - random machines & graphs

To: hobbit (at) hswn.dk
Subject: Re: [hobbit] strange graph behavior - random machines & graphs
From: "Gary Baluha" <gumby3203 (at) gmail.com>
Date: Fri, 30 Nov 2007 13:27:03 -0500

On Nov 30, 2007 12:18 PM, Ralph Mitchell <ralphmitchell (at) gmail.com> wrote:

> On Nov 30, 2007 10:55 AM, Gary Baluha <gumby3203 (at) gmail.com> wrote:
>
> > Hmm, this is getting curiouser and curiouser.  Apparently at least
> > _some_ of the graphs that appear corrupted still have some valid data.  If I
> > use the graph zoom feature (clicking on the magnifying glass) and select
> > certain portions of the graph, the graph data shows up as normal.  It
> > appears that the problem is related to periodic data artifacts (the huge
> > numbers) that cause the scale of the graph to resize to show it within
> > bounds, and this causes the valid data to essentially disappear.
> >
> > I realized this when I looked at the graph, and saw that the (curr) and
> > (min) data points were showing normal values.  It's just the (max) and (avg)
> > values that are way off, which causes the rest of the graph to be incorrect.
> >
> >
>
>
> Have you tried running hobbitd_rrd with the "--debug" option??  Add it to
> the various hobbitd_rrd entries in server/etc/hobbitlaunch.cfg.  I haven't
> tried it myself, so I don't know how verbose it gets.  I seem to recall
> Henrik saying it's OK to just kill hobbitd_rrd processes because they get
> respawned.
>
> I guess the debug output shows up in the rrd-status.log in your Hobbit
> logs directory.  Is there anything interesting in that log already??  Or any
> other log??

There wasn't anything useful in any of the logs, besides the usual stuff.  I
turned on the --debug option, and here is a sample of the data for one of
the affected machines:

2007-11-30 13:14:07 hobbitd_rrd: Got message 562165
@@status#562165|1196446447.724393|192.168.232.110||danno|disk|1196448247|yellow||yellow|1196053505|0||0||1196446447
2007-11-30 13:14:07 startpos 343968, fillpos 343968, endpos -1
2007-11-30 13:14:07 RRD update param 00: 'rrdupdate'
2007-11-30 13:14:07 RRD update param 01: '/var/hobbit/data/rrd/danno/disk,dev,odm.rrd'
2007-11-30 13:14:07 RRD update param 02: '-t'
2007-11-30 13:14:07 RRD update param 03: 'pct:used'
2007-11-30 13:14:07 RRD update param 04: '1196446447:0:0'

I'm afraid I don't know how to interpret all of this.  I get that "param 03"
means the graph is showing "percentage [disk space] used", and that
"param 01" means it is updating that specific rrd file.  And I remember that
"-t" in "param 02" is some rrdtool flag.  But I don't know what the numbers
in "param 04" mean.  I assume the first number is the number of seconds
since 1970, and the second number is the current value, but I don't know
what the last number means.  Also, I'm not sure how to interpret all of the
data in the "@@status" line.
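
For what it's worth, here is a rough sketch of how the "param 03" and
"param 04" strings could be lined up.  It relies on the rrdtool update
syntax, where "-t" takes a colon-separated template of data source names and
the value string is a timestamp followed by one value per name.  Read that
way, "pct:used" names two data sources ("pct" and "used") rather than one,
which would account for the extra number.  The function name
decode_rrd_update is made up for illustration, and what "used" actually
measures in this rrd file is an assumption I have not checked.

```python
from datetime import datetime, timezone


def decode_rrd_update(template, update):
    """Pair each data source name in an rrdupdate template with its value.

    template: the argument after '-t', e.g. 'pct:used'
    update:   the value string, e.g. '1196446447:0:0'
    (Hypothetical helper; it only handles a numeric epoch timestamp.)
    """
    names = template.split(":")
    fields = update.split(":")
    timestamp, values = fields[0], fields[1:]

    # rrdtool expects exactly one value per name in the template.
    if len(names) != len(values):
        raise ValueError(
            f"{len(names)} names in template but {len(values)} values"
        )

    # The first field is seconds since 1970-01-01 UTC.
    when = datetime.fromtimestamp(int(timestamp), tz=timezone.utc)

    # Values stay as strings, since rrdtool also accepts 'U' for unknown.
    return when, dict(zip(names, values))


if __name__ == "__main__":
    when, values = decode_rrd_update("pct:used", "1196446447:0:0")
    print(when.isoformat(), values)
```

For the sample above this gives 18:14:07 UTC on 2007-11-30, which matches the
13:14:07 local time in the log, with both "pct" and "used" set to 0.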

By the way, this excerpt is from a machine that is having the graph display
problems.  In this case, the data it is receiving is normal and correct.
I'm waiting for another update when the data is incorrect.
