[hobbit] RE: CPU Utilization Part 2 -- HELP!!!

James Wade jkwade at futurefrontiers.com
Fri Mar 9 20:41:00 CET 2007


Hi Greg,

 

Yes, it does come from the vmstat. Hobbit does a 5 minute average

using vmstat. I think this is too long and perhaps doesn't work as well

on Solaris. I suspect that when you get a system that gets overloaded,

the 5 minute average is taking a while to complete, almost hanging.

 

I believe that a better method of CPU utilization would be to take 15 second

averages over 2 minutes, i.e. 8 data samples, and put that in hobbit.

I've tried to do this, but I have not had much luck. I can't seem to get the

rrd correct.
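
A minimal sketch of the averaging step described above, in Python. It assumes the input is the text printed by something like "vmstat 15 9": header lines, then one row of since-boot figures, then eight 15-second interval rows, with us/sy/id as the last three columns. The function name average_cpu is hypothetical and not part of Hobbit, and the sketch does not cover getting the result into the rrd.

```python
def average_cpu(vmstat_text, skip_first=True):
    """Average the us/sy/id columns over the data rows of vmstat output."""
    rows = []
    for line in vmstat_text.splitlines():
        fields = line.split()
        # Data rows are all-numeric; header rows contain names such as "swap".
        if len(fields) < 3 or not all(f.isdigit() for f in fields):
            continue
        # Assumption: the last three columns are us, sy, id (as in the thread).
        rows.append([int(f) for f in fields[-3:]])
    if skip_first and len(rows) > 1:
        # Assumption: the first data row is the since-boot summary, so drop it.
        rows = rows[1:]
    if not rows:
        raise ValueError("no vmstat data rows found")
    n = len(rows)
    us, sy, idle = (sum(r[i] for r in rows) / n for i in range(3))
    return {"us": us, "sy": sy, "id": idle, "busy": 100.0 - idle}
```

Feeding it the captured output of one 2-minute run gives a single set of averaged figures per client cycle.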

 

What I did do as a test was to change the 300 second average in

the hobbitclient-sunos.sh file to a 15 second sample, and I was able

to see better indications of CPU utilization. I know we can debate a

15 second CPU utilization average versus a 5 minute, but what I've

seen is the 5 minute average just isn't working on about 100 Solaris

boxes. It flatlines at a low percentage even when the CPU is maxed out.

 

However, having a 15 second average every 5 minutes isn't going to

do it either because you miss more than 4 minutes until hobbit

runs again and takes another 15 second average.
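
To put a number on that gap, here is a small Python sketch. It assumes the client cycle is 300 seconds, as implied above; the function name sampling_coverage is made up for illustration.

```python
def sampling_coverage(sample_seconds, period_seconds):
    """Return (fraction of each cycle observed, seconds left unobserved)."""
    covered = sample_seconds / period_seconds
    unobserved = period_seconds - sample_seconds
    return covered, unobserved


covered, unobserved = sampling_coverage(15, 300)
# One 15-second sample per 300-second cycle sees 5% of the time
# and leaves 285 seconds (4 min 45 s) unobserved.
print(f"{covered:.0%} of each cycle observed, {unobserved} s unobserved")
```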

 

This has become a major issue here because they want to compare

CPU Utilization with the number of transactions in the Application

log file.

 

I could really use a workaround.

 

Thanks.James

  _____  

From: Hubbard, Greg L [mailto:greg.hubbard at eds.com] 
Sent: Friday, March 09, 2007 1:15 PM
To: hobbit at hswn.dk
Subject: RE: [hobbit] RE: CPU Utilization Part 2 -- HELP!!!

 

James, I suspect that the CPU utilization numbers come from "vmstat" output.
There should be a section in the client data labeled [vmstat] and the
numbers that are graphed are listed in the far right hand columns.  I guess
the [iostatcpu] section could be used as well -- that adds the "wt" column
to "us, sy, and id".

 

On one of my Solaris systems, I have something that looks like this:

 

[vmstat]
kthr      memory            page            disk          faults      cpu
r b w   swap  free  re  mf pi po fr de sr s0 s1 s3 --   in   sy   cs us sy id
0 0 0 22683696 6595936 41 257 1 0 0  0  0  3  2  0  0  573 1703  631  1  2 97
0 0 0 22077552 5982992 38 240 0 0 0  0  0  3  3  0  0  608 3125  839  2  2 97
[iostatcpu]
     cpu
us sy wt id
  1  2  0 97
  2  2  0 97

It is possible that your vmstat output is not formatted the way that Hobbit
expects (extra columns?). Your graph looks different: usually the CPU
utilization graph for Solaris is a stacked area chart, not a line graph.
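
One way to check what the client is actually sending is to pull the [vmstat] section out of a saved client data message and look at the last row. The Python below is only a sketch: it assumes sections start with a line like "[vmstat]" as in the sample above, and that us/sy/id are the last three columns. The helper names extract_section and last_cpu_sample are hypothetical and this is not how Hobbit itself parses the data.

```python
def extract_section(client_data, name):
    """Return the lines of the section labelled [name] in a client message."""
    lines, inside = [], False
    for line in client_data.splitlines():
        stripped = line.strip()
        # A line of the form [something] starts a new section.
        if stripped.startswith("[") and stripped.endswith("]"):
            inside = (stripped == f"[{name}]")
            continue
        if inside:
            lines.append(line)
    return lines


def last_cpu_sample(section_lines):
    """Return the last three columns of the last all-numeric row, or None."""
    for line in reversed(section_lines):
        fields = line.split()
        if len(fields) >= 3 and all(f.isdigit() for f in fields):
            # Assumption: for vmstat these columns are us, sy, id.
            return tuple(int(f) for f in fields[-3:])
    return None
```

If the tuple that comes back does not look like plausible us/sy/id percentages, the column layout is a likely suspect.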

 

Others may be able to chime in with more things to consider.  Good luck!

 

GLH

 


  _____  


From: James Wade [mailto:jkwade at futurefrontiers.com] 
Sent: Friday, March 09, 2007 1:05 PM
To: hobbit at hswn.dk
Subject: RE: [hobbit] RE: CPU Utilization Part 2 -- HELP!!!

The Operating System is Solaris.

The graph definitions have not been changed.

 

What should I look for with the client data?
I'm not familiar with rrd.

 

Although if I look at all the clients overall, I see

the same thing, the averaging of CPU Utilization

is a flat-line on everything, even during peak loads.

 

Thanks.James

 


  _____  


From: Hubbard, Greg L [mailto:greg.hubbard at eds.com] 
Sent: Friday, March 09, 2007 12:53 PM
To: hobbit at hswn.dk
Subject: RE: [hobbit] RE: CPU Utilization Part 2 -- HELP!!!

 

What OS, version, etc.  And did you look into your client data file to see
what is in there?  And has anyone monkeyed with the graph definitions?

 

You have the smoking gun, but that's about all we have to work on...

 


  _____  


From: James Wade [mailto:jkwade at futurefrontiers.com] 
Sent: Friday, March 09, 2007 12:49 PM
To: hobbit at hswn.dk
Subject: RE: [hobbit] RE: CPU Utilization Part 2 -- HELP!!!

I could not send all the graphs in the same email.

This is the same system with the other two graphs,

but this is a different monitoring tool another group

uses. It shows CPU Utilization at 100% which is

correct with a load of 120+. However, Hobbit showed

a flatline at 13%.

 

I'm seeing this across the board on all the systems,

and the tool below is being used by the other group

showing the discrepancies. Any suggestions?

 

 

 

Thanks..James


  _____  


From: James Wade [mailto:jkwade at futurefrontiers.com] 
Sent: Friday, March 09, 2007 12:42 PM
To: hobbit at hswn.dk
Subject: [hobbit] RE: CPU Utilization -- HELP!!!

 

I really need some help on the CPU Utilization graphs.

They just don't look correct.

 

As an example, CPU Load on this box went to 120+

for an hour, but the CPU Utilization Graph for the same

time period shows only 13% busy.

 

James

 
