
Re: [hobbit] Users / Procs Graphing problem



On Wed, Feb 01, 2006 at 12:27:48PM -0000, Rowell, Mike wrote:
> I'm building some monitoring solutions at a place I'm working and during
> this have noticed an issue with one of the rrd graphs that I can't
> figure out. Every day at a specific time (different for each system)
> there is a 10-minute gap in the graph. I've done an rrd dump on the data
> and it's appearing as a NaN entry.

This means that no data was being fed into the RRD file for 10-15
minutes.

> This is only affecting the users / procs graph, anyone got any ideas?

Could it be that you are rebooting these servers once a day?
(I know, Unix folks rarely do that - but just in case).

Is the data being reported by the Hobbit client or the BB client?

Since it always happens at the same time for a given server, it would be
interesting to see what messages are fed into Hobbit around that time.
If you are running the Hobbit client, could you set up a cron job to
fetch the client data around that time? It should run something like

   wget "http://hobbitserver/cgi-bin/bb-hostsvc.sh?CLIENT=bad.client.name"

and store the output in a file where you can look at it later. Ideally,
run this every minute for 15 minutes around the time the problem
occurs.
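
As a rough sketch of such a job (not a tested recipe), the script below
fetches the URL once a minute and saves each response under a timestamped
name. The output directory, the function names and the default count of 15
are assumptions made for illustration; only the URL comes from the command
above.

```shell
#!/bin/sh
# Sketch only: OUTDIR and the function names are hypothetical.
HOBBIT_URL="http://hobbitserver/cgi-bin/bb-hostsvc.sh"
OUTDIR="${OUTDIR:-/tmp/hobbit-capture}"

# Build the client-data URL for one host (Hobbit client case).
client_url() {
    printf '%s?CLIENT=%s\n' "$HOBBIT_URL" "$1"
}

# Fetch a URL COUNT times, one minute apart, saving each response
# to its own timestamped file in OUTDIR.
capture() {
    url=$1
    count=${2:-15}
    mkdir -p "$OUTDIR" || return 1
    i=0
    while [ "$i" -lt "$count" ]; do
        wget -q -O "$OUTDIR/$(date +%Y%m%d-%H%M%S).html" "$url"
        i=$((i + 1))
        # No need to wait after the last fetch.
        if [ "$i" -lt "$count" ]; then
            sleep 60
        fi
    done
}
```

To use it, add a final line such as

   capture "$(client_url bad.client.name)" 15

and have cron start the script a few minutes before the gap usually
begins, so the 15 samples cover the whole window.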

If you are running the BB client, the interesting part is the "cpu"
column data that is sent around that time. So something similar,
except that the URL you should fetch is

http://hobbitserver/cgi-bin/bb-hostsvc.sh?HOSTSVC=bad,client,name.cpu
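
Note that in this form of the URL the dots in the hostname are written as
commas. A small helper along these lines could build it from the normal
hostname; the function name is made up for this example, and the server
name is the same placeholder as above.

```shell
#!/bin/sh
# Sketch only: hostsvc_url is a hypothetical helper name.
HOBBIT_URL="http://hobbitserver/cgi-bin/bb-hostsvc.sh"

# Build the HOSTSVC URL for a host and column, replacing the dots
# in the hostname with commas as in the URL shown above.
hostsvc_url() {
    host=$(printf '%s' "$1" | tr '.' ',')
    printf '%s?HOSTSVC=%s.%s\n' "$HOBBIT_URL" "$host" "$2"
}
```

For example, wget -q -O cpu.html "$(hostsvc_url bad.client.name cpu)"
would fetch the "cpu" column for that host.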


Regards,
Henrik