[hobbit] Users / Procs Graphing problem

Rowell, Mike Mike.Rowell at viatel.com
Wed Feb 1 14:12:32 CET 2006


Henrik,

Thanks for this. The data is actually coming from my own modified client,
and before anyone asks, the data is correct. We have extensive Nagios
monitoring, so I took the Nagios client, added a couple of new checks to it,
and wrote an ext script that connects to the specified servers, runs the new
checks and then feeds the data into Hobbit.

We have multiple environments, some behind firewalls and some not, so it's
the best all-round solution I could come up with.

I'll pull the data at the client side and see if I can tell what's going
wrong.

Regards

Mike Rowell 

-----Original Message-----
From: Henrik Stoerner [mailto:henrik at hswn.dk] 
Sent: 01 February 2006 13:02
To: hobbit at hswn.dk
Subject: Re: [hobbit] Users / Procs Graphing problem

On Wed, Feb 01, 2006 at 12:27:48PM -0000, Rowell, Mike wrote:
> I'm building some monitoring solutions at a place I'm working and
> during this have noticed an issue with one of the RRD graphs that I
> can't figure out. Every day at a specific time (different for each
> system) there is a 10-minute gap in the graph. I've done an rrd dump on
> the data and the gap shows up as NaN entries.

This means that no data was being fed into the RRD file for 10-15 minutes.
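
To pin down exactly when the gap starts and ends, the NaN rows can be pulled out of the dump. The following is only a sketch: it assumes the usual "rrdtool dump" layout, where each row line starts with an XML comment holding the timestamp, and the function name nan_rows is made up for illustration.

```shell
#!/bin/sh
# Sketch: read `rrdtool dump` output on stdin and print the timestamp
# of every archive row that contains a NaN value.
# Assumption: each <row> line begins with a "<!-- timestamp -->" comment.
nan_rows() {
    awk '/<row>/ && /NaN/ {
        # Pull out the text between "<!--" and "-->" on this row line.
        if (match($0, /<!--[^>]*-->/)) {
            ts = substr($0, RSTART + 4, RLENGTH - 7)
            gsub(/^ +| +$/, "", ts)   # trim surrounding spaces
            print ts
        }
    }'
}
```

It would be used as "rrdtool dump some-file.rrd | nan_rows", where some-file.rrd stands for whichever RRD file backs the affected graph.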

> This is only affecting the users / procs graph. Has anyone got any ideas?

Could it be that you are rebooting these servers once a day ?
(I know, Unix folks rarely do that - but just in case).

Is this with the Hobbit client, or the BB client reporting data ?

Since it always happens at the same time for a given server, it would be
interesting to see what messages are fed into Hobbit around that time.
If you are running the Hobbit client, could you set up a cron job to fetch
the client data around that time ? It should run something like

   wget 'http://hobbitserver/cgi-bin/bb-hostsvc.sh?CLIENT=bad.client.name'

and store the output in a file where you can look at it later. Best thing
would be if you could run this every minute for 15 minutes around the time
this problem occurs.

If you are running the BB client, the interesting part is the "cpu"
column data that is sent around that time. So something similar, except that
the URL you should fetch is

http://hobbitserver/cgi-bin/bb-hostsvc.sh?HOSTSVC=bad,client,name.cpu
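
Both fetches can be wrapped in one small script for cron to call. This is a rough sketch, not a tested tool: the script layout, the function names, the "run" argument and the /tmp/hobbit-capture directory are all assumptions, and the rule that dots in the host name become commas in the HOSTSVC form is inferred from the example URL above.

```shell
#!/bin/sh
# Sketch only: names, paths and defaults here are assumptions.
BASE_URL="${BASE_URL:-http://hobbitserver/cgi-bin/bb-hostsvc.sh}"
OUTDIR="${OUTDIR:-/tmp/hobbit-capture}"

# Build the URL to fetch for one host.
#   client : the Hobbit client data (CLIENT=host.name)
#   cpu    : the "cpu" column status (HOSTSVC=host,name.cpu)
build_url() {
    mode=$1
    host=$2
    case "$mode" in
        client) printf '%s?CLIENT=%s\n' "$BASE_URL" "$host" ;;
        cpu)    printf '%s?HOSTSVC=%s.cpu\n' "$BASE_URL" \
                    "$(printf '%s' "$host" | tr . ,)" ;;
        *)      echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}

# Fetch one sample into a timestamped file so samples never overwrite
# each other.
capture_once() {
    url=$(build_url "$1" "$2") || return 1
    mkdir -p "$OUTDIR" || return 1
    wget -q -O "$OUTDIR/$2-$(date +%Y%m%d-%H%M%S).txt" "$url"
}

# Only fetch when invoked as: capture.sh run <client|cpu> <hostname>
if [ "${1:-}" = "run" ]; then
    capture_once "$2" "$3"
fi
```

If the gap shows up around 03:00, for example, two crontab lines would cover the minutes on either side of it (the install path is hypothetical):

   55-59 2 * * * /usr/local/bin/capture.sh run client bad.client.name
   0-9   3 * * * /usr/local/bin/capture.sh run client bad.client.name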


Regards,
Henrik








