Re: [hobbit] strange graph behavior - random machines & graphs
- To: hobbit (at) hswn.dk
- Subject: Re: [hobbit] strange graph behavior - random machines & graphs
- From: "Gary Baluha" <gumby3203 (at) gmail.com>
- Date: Fri, 30 Nov 2007 11:14:41 -0500
On Nov 30, 2007 10:53 AM, Hubbard, Greg L <greg.hubbard (at) eds.com> wrote:
> Gary,
>
> This is pretty hard to decipher from "afar".
>
> I think I remember you saying that when you dump the data it is always
> okay?
>
Actually, it turns out this is not true. The rrd file does indeed have the
bad data. I just didn't notice it before, but now that the problem appears to
be getting worse, the bad data is easy to spot.
> Some wild thoughts:
>
> a) could there be two different processes updating the same RRD files?
>
I don't believe so. The strange thing is, all of the graphs that become
corrupted have the exact same large number that is being input into the rrd
data files.
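
One way to check for a repeated bogus value like this is to scan the XML that
`rrdtool dump` produces and count every stored value above some sanity limit.
The following is only a minimal sketch: the function name, the threshold, and
the sample data (including the large number in it) are made up for
illustration and are not taken from the affected files.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter


def suspicious_values(dump_xml, threshold=1e12):
    """Count archived values in `rrdtool dump` XML whose magnitude exceeds threshold.

    `threshold` is an arbitrary sanity limit chosen for illustration.
    """
    root = ET.fromstring(dump_xml)
    counts = Counter()
    # Each archived data point appears as a <v> element inside a <row>.
    for v in root.iter("v"):
        try:
            value = float(v.text.strip())
        except (ValueError, AttributeError):
            continue  # empty or unparsable cell
        if math.isnan(value):
            continue  # unknown data points are stored as NaN
        if abs(value) > threshold:
            counts[value] += 1
    return counts


# Invented sample, shaped like a fragment of `rrdtool dump` output.
SAMPLE = """<rrd><rra><database>
<row><v> 1.5000000000e+01 </v></row>
<row><v> NaN </v></row>
<row><v> 4.2949672960e+09 </v></row>
<row><v> 4.2949672960e+09 </v></row>
</database></rra></rrd>"""

print(suspicious_values(SAMPLE, threshold=1e9))
```

Running this over the dump of each affected file, and comparing the values
that come back, would show whether the same number really appears everywhere.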
> b) are all servers using the same version of rrdtool?
>
No. One is running 1.2.23, the other 1.2.26. Both have the problem.
> c) are the hobbitgraph files okay? I have proven to my satisfaction that
> hobbitgraph definition errors can make the graphs act funny.
>
They haven't changed since before the graphs were having this problem.
> d) if this stuff is on a SAN, can it be moved to local storage?
>
It is on the SAN on one of the machines, and locally on the other. I was
thinking of temporarily moving the data directory and having Hobbit regenerate
all the data from scratch. I'm trying to avoid this, since that would mean
losing a year's worth of trend data that has proven itself very useful.
Still, if it helps me narrow down the problem, I'll consider this (and move
the data back once I get my answer).
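
Before moving anything, a copy of the data directory could be taken so the
trend data can be restored afterwards. The following is a rough sketch only:
the function name and the paths passed to it are hypothetical, and it assumes
Hobbit is not writing to the files while the copy runs.

```python
import shutil
import time
from pathlib import Path


def snapshot_rrd_dir(data_dir, backup_root):
    """Copy data_dir and everything in it into a timestamped folder under backup_root."""
    src = Path(data_dir)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / f"{src.name}-{stamp}"
    # copytree creates dest (and missing parents) and fails if dest already exists,
    # so an earlier snapshot is never overwritten.
    shutil.copytree(src, dest)
    return dest
```

The returned path is where the copy ended up, which is what would be moved
back into place once the test is finished.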
> I am just "fishing." Sometimes, when I am at my wit's end, I just change
> SOMETHING to see if it makes a difference. Even WORSE can help get me
> started.
>
> GLH
>
> ------------------------------
> *From:* Gary Baluha [mailto:gumby3203 (at) gmail.com]
> *Sent:* Friday, November 30, 2007 9:25 AM
> *To:* hobbit (at) hswn.dk
> *Subject:* Re: [hobbit] strange graph behavior - random machines & graphs
>
> Now it appears this is becoming a more serious problem. It seems more and
> more graphs are being affected, and I still have no explanation for
> what is going on here. It also seems that almost any new graph that is
> created (for example, after I delete/rename/move an existing .rrd file)
> starts off corrupted immediately. :-(
>
> On Nov 28, 2007 10:08 AM, Gary Baluha <gumby3203 (at) gmail.com> wrote:
>
> > I have recently noticed a strange thing happening with some of the rrd
> > graphs generated by Hobbit. When you look at the graph, it looks as though
> > the rrd data is in one format (gauge), but the graph is generating it in a
> > different format (derive). I can't seem to find any pattern to the hosts or
> > tests that are exhibiting this strange behavior, and it is only happening on
> > a handful of graphs. I have attached a picture of one of these graphs,
> > since I'm not really sure how to describe it. Note the huge numbers
> > displayed on the curr/min/avg/max line.
> >
> > Any idea what's going on here? When I dump the RRD file manually,
> > everything looks okay. I'm running Hobbit 4.2.0 with the 2007-02-09
> > allinone patch (I believe the latest). This has only happened in the past
> > few weeks, though when exactly it started, I don't know. Any ideas?
> >
>
>