[hobbit] [feature request] trackmax for all parameters

Henrik Stoerner henrik at hswn.dk
Fri Jan 19 13:23:57 CET 2007


On Fri, Jan 19, 2007 at 10:55:05AM +0100, Charles Goyard wrote:

> Gildas Le Nadan wrote:
> > 
> > The TRACKMAX feature is really interesting as the max values get 
> > "diluted" in the week/month/year views.
> > 
> > Alas, it seems to only work for NCV values at the moment (and I wish I 
> > could use it for cpu/memory/network/disk).
> > 
> > Charles, Henrik, do you think this is feasible?
> 
> I agree it would be nice. 

It would. The only problem is that it would increase the size of the RRD
files (perhaps not a big issue), and it would only take effect for newly
created RRD files, not any existing ones. So to add this to an existing
setup, you would have to dump all of your RRD files, create new ones,
and import the data from the dumps. I'm not sure how much work would be
involved in automating this.

However, there have also been some requests to increase the granularity 
of the stored data, e.g. to keep 7 days' worth of 5-minute averages as 
opposed to the current 2 days. And likewise for the other RRAs.

(For those not familiar with how RRDtool works, the RRAs define how
 each measurement gets averaged over time. Hobbit uses 4 RRAs in each
 RRD file: one RRA tracks the data (almost) without averaging it, at
 the 5-minute interval; the next averages across 6 measurements 
 - 30 minutes; the third averages across 24 measurements - 2 hours;
 and the last averages across 288 measurements - 1 day. For each of
 these 4 groups of averages, the RRD file holds 576 values. So the 
 first RRA covers 576*5 minutes = 2 days, the second covers 12 days,
 the third 48 days and the last 576 days. When Hobbit generates a graph 
 from the RRD data, RRDtool automatically picks the best-resolution RRA
 that covers the requested period.)

So if we're going to add MAX/MIN tracking to the RRD files, we might as
well do it at the same time as we change the granularity. The numbers 
I've been thinking of are to keep
   * 30 days of 5-minute averages
   * 90 days of 15-minute averages
   * 360 days of 1-hour averages
   * 1080 days of 3-hour averages

That alone would increase the size of the RRD files 15-fold, and adding
MIN+MAX tracking would triple it again. An "average" host in my setup
uses ~400 KB of diskspace for RRD files, so growing that 15x3 = 45
times means roughly 18 MB per host. It's a significant increase (I'd
have to get more disk space for my production systems to handle it),
but disks are getting bigger and cheaper - and I'd still be storing
data for 4000 hosts in about 70 GB.

A simple RRD file would grow from ~19 KB to ~855 KB.

I don't think there would be much of a performance hit from this. The
RRD update spends most of its time opening and locking the file, whereas
the actual data-update doesn't take long.


> I have a question for Henrik before implementing it. The RRAs are 
> added individually in each rrd/*.c backend,
> and they get calculated for every status message coming in

Actually, they don't. All of the rrd/*.c files use logic like this:

  static char *la_params[] = { "rrdcreate", rrdfn, "DS:la:GAUGE:600:0:U", 
                               rra1, rra2, rra3, rra4, NULL };
  static char *la_tpl = NULL;

  if (la_tpl == NULL) la_tpl = setup_template(la_params);

The keyword here is the declaration of the variables as "static".

"la_params[]" is a static array, so it is initialized once, at program
start, with these values. "la_tpl" is the RRD "template", which is
essentially a list of the dataset names in the order in which Hobbit
feeds in the data values. This template is only computed the first
time this type of RRD file is updated - that's what the
"if (la_tpl == NULL) ... " test does. 

Inside the create_and_update_rrd() function, the RRA definitions in
"la_params" are only used when creating a new RRD file.



Regards,
Henrik




More information about the Xymon mailing list