[hobbit] CPU utilisation alerts

vernon.everett at westernpower.com.au vernon.everett at westernpower.com.au
Thu Sep 13 07:39:37 CEST 2007


Hi Henrik

I have been thinking about this problem, and was wondering how difficult 
it would be to incorporate a CPUU (CPU Utilisation) test as a standard for 
the bb-hosts config.
The graph already exists (la1) so the data is already being collected.
What would be needed to make it a standard test? 

Regards
     Vernon

Vernon Everett/PER/Western_Power at Western_Power wrote on 13/09/2007 01:05:46 PM:

> 
> So we both want a basic CPU utilisation alert. 
> Cool. 
> Hopefully somebody on the list has done this before. 
> If not, it's time to do a bit of scripting. 
> If I have to do it myself, I will post the results, if you are interested.
> 
> Regards 
>      Vernon 
> 
> "Kern, Thomas" <Thomas.Kern at hq.doe.gov> wrote on 13/09/2007 12:56:08 PM:
> 
> > On the mainframes, we are used to lots of tasks waiting for various 
> > resources. Main memory, virtual memory, I/O, etc all are important 
> > and can be tuned fairly well. When that tuning isn't quite right, 
> > throughput can be degraded but the pain is not as bad as when the 
> > CPU gets overloaded. CPU is one resource that is hard to change. I 
> > have run systems that needed to be less than 50% busy, and others 
> > that the boss wanted at 100% and wished could be 110% busy.
> > 
> > The CPU value can also be the first indication of a runaway user or 
> > bad database query (this is our most common problem). Once we know 
> > from the CPU utilization that something is wrong, we can look for 
> > the cause and maybe the other problems are there too.
> > 
> > 
> > Thomas Kern
> > 301-903-2211
> > 
> > 
> > ----- Original Message -----
> > From: vernon.everett at westernpower.com.au <vernon.everett at westernpower.com.au>
> > To: hobbit at hswn.dk <hobbit at hswn.dk>
> > Sent: Thu Sep 13 00:23:43 2007
> > Subject: Re: [hobbit] CPU utilisation alerts
> > 
> > 
> > It might be different in the mainframe world, but in the Unix world you 
> > need to look at the run queue length, the IO stats and the CPU 
> > utilisation together to get an idea of what's happening.
> > If your CPU is at 100% and your run queue is still small, it's 
> > probably just a hefty process chugging along, like a compile.
> > If your run queue is huge, and growing, and your CPU isn't yet at 
> > 100% you need to look at your IO. Disk, memory, swap, any resource 
> > that could be generating contention and IO wait.
> > If there is major contention for these resources you need to look at
> > adding more, or utilising them differently - spread data across 
> > multiple disks or mirror the disk to increase read throughput, that 
> > sort of thing.
> > If your run queue is huge, and growing, and CPU is at 100%, while IO
> > is low, it's probably time to move to a new server, or find the 
> > developer and tell him to fix his bugs. :-)
> > 
> > So absolute CPU utilisation on its own isn't particularly 
> > meaningful, but if that's what the PHBs want, let's give it to them.
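
A rough sketch of the rules of thumb above, in Python. The function name, the returned labels and every threshold (run queue longer than twice the CPU count, 95% busy, 20% IO wait) are made up for illustration and are not from Hobbit. The "and growing" part of the rule needs history, so this sketch only looks at a single sample.

```python
def triage(cpu_busy, run_queue, io_wait, cpus=1, busy_at=95, io_high=20):
    """Classify one sample using the run queue / CPU / IO rules of thumb.

    cpu_busy and io_wait are percentages, run_queue is a process count.
    All thresholds are hypothetical defaults, not recommended values.
    """
    queue_long = run_queue > 2 * cpus   # assumed meaning of a "huge" run queue
    cpu_pegged = cpu_busy >= busy_at
    io_heavy = io_wait >= io_high

    if cpu_pegged and not queue_long:
        # CPU flat out but nothing queued: a hefty process such as a compile.
        return "busy-process"
    if queue_long and not cpu_pegged:
        # Work is queued while the CPU has headroom: look at disk, memory, swap.
        return "io-contention"
    if queue_long and cpu_pegged and not io_heavy:
        # Everything is waiting for CPU: bigger server, or fix the code.
        return "cpu-bound"
    # Cases the rules of thumb do not cover.
    return "inconclusive"


if __name__ == "__main__":
    print(triage(cpu_busy=100, run_queue=12, io_wait=2, cpus=2))
```

A CPU-only alert corresponds to looking at cpu_pegged alone, which is why it cannot tell the first and third cases apart.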
> > 
> > Regards
> >      Vernon
> > 
> > 
> > "Kern, Thomas" <Thomas.Kern at hq.doe.gov> wrote on 13/09/2007 12:07:37 PM:
> > 
> > > I would prefer that the cpu test use the data from the vmstat command
> > > instead of the load values. I am used to a mainframe system, and cpu
> > > utilization is more useful than queue length. All of my Linux
> > > systems are guests on a mainframe system, so their individual cpu
> > > utilizations are not as important as the values from my first level
> > > system, and I am working on a client side test for that.
> > >
> > >
> > > Thomas Kern
> > > 301-903-2211
> > >
> > >
> > > ----- Original Message -----
> > > From: vernon.everett at westernpower.com.au <vernon.everett at westernpower.com.au>
> > > To: hobbit at hswn.dk <hobbit at hswn.dk>
> > > Sent: Wed Sep 12 23:56:43 2007
> > > Subject: Re: [hobbit] CPU utilisation alerts
> > >
> > >
> > > Hi Thomas
> > >
> > > Thanks for your quick response.
> > >
> > > A client side script would work, but I was thinking I cannot be the
> > > first person to need this, and that somebody else has already
> > > invented the wheel.
> > > (I hate reinventing stuff)
> > > Alternatively, I was hoping that Henrik has some magic switch or
> > > config setting that will make it work.
> > >
> > > Regards
> > >        Vernon
> > >
> > >
> > > "Kern, Thomas" <Thomas.Kern at hq.doe.gov> wrote on 13/09/2007 11:46:07 AM:
> > >
> > > > I don't know if you can alert off one of the values in one of the
> > > > trends graphs. That might take some back-end modifications.
> > > >
> > > > But you could write a simple client-side script to do the same
> > > > command that is parsed for the trends graphs (vmstat, I think),
> > > > totaling the cpu utilization values and sending a simple status
> > > > message with the appropriate g/y/r color. The hobbit can do the alert.
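
A rough sketch of the client-side script described above, in Python. Everything in it is an assumption for illustration rather than anything shipped with Hobbit: the function names, the `cpuu` test name, the 80/95 thresholds, and the Linux-style vmstat layout with `us` and `sy` columns. It only builds the status line; handing that line to the Hobbit client for delivery is left out.

```python
# Sample output in the Linux procps vmstat layout (assumed; other Unixes differ).
SAMPLE_VMSTAT = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 123456  12345 234567    0    0     1     2  100  200  5  1 94  0  0
 2  0      0 123400  12345 234600    0    0     0     8  150  300 12  3 84  1  0
"""


def _last_index(cols, name):
    # Use the rightmost match, because some vmstat variants repeat a
    # column name (for example "sy") earlier in the header.
    return len(cols) - 1 - cols[::-1].index(name)


def cpu_busy_percent(vmstat_text):
    """Return user + system CPU percent from the last sample line."""
    rows = [ln.split() for ln in vmstat_text.strip().splitlines() if ln.strip()]
    header = next(cols for cols in rows if "us" in cols and "sy" in cols)
    # The first vmstat sample is an average since boot, so take the last one.
    sample = rows[-1]
    return (int(sample[_last_index(header, "us")])
            + int(sample[_last_index(header, "sy")]))


def colour_for(busy, warn=80, panic=95):
    """Map a busy percentage to a colour; thresholds are hypothetical."""
    if busy >= panic:
        return "red"
    if busy >= warn:
        return "yellow"
    return "green"


def build_status(host, busy, colour, test="cpuu"):
    """Build a status line; dots in the hostname are sent as commas (assumed)."""
    return "status %s.%s %s CPU utilisation is %d%%" % (
        host.replace(".", ","), test, colour, busy)


if __name__ == "__main__":
    busy = cpu_busy_percent(SAMPLE_VMSTAT)
    print(build_status("www.example.com", busy, colour_for(busy)))
```

In a real script the text would come from running something like `vmstat 1 2` instead of the canned sample.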
> > > >
> > > >
> > > > Thomas Kern
> > > > 301-903-2211
> > > >
> > > >
> > > > ----- Original Message -----
> > > > From: vernon.everett at westernpower.com.au <vernon.everett at westernpower.com.au>
> > > > To: hobbit at hswn.dk <hobbit at hswn.dk>
> > > > Sent: Wed Sep 12 23:36:07 2007
> > > > Subject: [hobbit] CPU utilisation alerts
> > > >
> > > >
> > > > Hi all
> > > >
> > > > I'm baaaack :-)
> > > > > For those who might have missed me, I spent a few months contracting
> > > > > for a company that standardised on BMC Patrol. Wouldn't even look at
> > > > > Hobbit.
> > > > BMC is a horrible package, expensive, not very extensible, with a
> > > > huge client footprint and overhead, and is very prone to crashing.
> > > > Sad product.
> > > >
> > > > > But no matter, I am now trying to satisfy my new company that Hobbit
> > > > is the one monitor to rule them all, and my new colleagues have
> > > > identified a "deficiency".
> > > >
> > > > > This has probably been asked and answered before, but here is what
> > > > > they want.
> > > > > I have been asked to generate a yellow/red status when absolute CPU
> > > > > utilisation reaches predetermined thresholds.
> > > > Yes, I know, without looking at the run-queue this figure is not
> > > > very meaningful, but this is what they want.
> > > >
> > > > > The la1 graph in the trends column does an excellent job of graphing
> > > > > the CPU utilisation, but how do I configure an alert based on that
> > > > > figure?
> > > >
> > > > Regards
> > > >        Vernon
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > 
========================================================================
> > > > Electricity Networks Corporation, trading as Western Power
> > > > ABN: 18 540 492 861
> > > >
> > > > > TO THE ADDRESSEE - this email is for the intended addressee only and
> > > > may contain information that is confidential.
> > > > If you have received this email in error, please notify us
> > > > immediately by return email or by telephone.
> > > > Please also destroy this message and any electronic or hard copies
> > > > of this message.
> > > >
> > > > Any claim to confidentiality is not waived or lost by reason of
> > > > mistaken transmission of this email.
> > > >
> > > > Unencrypted email is not secure and may not be authentic.  Western
> > > > Power cannot guarantee the accuracy, reliability,
> > > > completeness or confidentiality of this email and any attachments.
> > > >
> > > > VIRUSES - Western Power scans all outgoing emails and attachments
> > > > for viruses, however it is the recipient's responsibility
> > > > to ensure this email is free of viruses.
> > > > 
> > 
======================================================================== 
> > >
> > >
> > >
> > > 
========================================================================
> > > Electricity Networks Corporation, trading as Western Power
> > > ABN: 18 540 492 861
> > >
> > > TO THE ADDRESSEE - this email is for the intended addressee only and
> > > may contain information that is confidential.
> > > If you have received this email in error, please notify us
> > > immediately by return email or by telephone.
> > > Please also destroy this message and any electronic or hard copies
> > > of this message.
> > >
> > > Any claim to confidentiality is not waived or lost by reason of
> > > mistaken transmission of this email.
> > >
> > > Unencrypted email is not secure and may not be authentic.  Western
> > > Power cannot guarantee the accuracy, reliability,
> > > completeness or confidentiality of this email and any attachments.
> > >
> > > VIRUSES - Western Power scans all outgoing emails and attachments
> > > for viruses, however it is the recipient's responsibility
> > > to ensure this email is free of viruses.
> > > 
> ======================================================================== 
 
> > 
> > 
> > 
> > 
========================================================================
> > Electricity Networks Corporation, trading as Western Power
> > ABN: 18 540 492 861
> > 
> > TO THE ADDRESSEE - this email is for the intended addressee only and
> > may contain information that is confidential.
> > If you have received this email in error, please notify us 
> > immediately by return email or by telephone.
> > Please also destroy this message and any electronic or hard copies 
> > of this message.
> > 
> > Any claim to confidentiality is not waived or lost by reason of 
> > mistaken transmission of this email.
> > 
> > Unencrypted email is not secure and may not be authentic.  Western 
> > Power cannot guarantee the accuracy, reliability,
> > completeness or confidentiality of this email and any attachments.
> > 
> > VIRUSES - Western Power scans all outgoing emails and attachments 
> > for viruses, however it is the recipient's responsibility
> > to ensure this email is free of viruses.
> > 
> ======================================================================== 
 

> 
> 
> ========================================================================
> Electricity Networks Corporation, trading as Western Power
> ABN: 18 540 492 861
> 
> TO THE ADDRESSEE - this email is for the intended addressee only and
> may contain information that is confidential. 
> If you have received this email in error, please notify us 
> immediately by return email or by telephone. 
> Please also destroy this message and any electronic or hard copies 
> of this message.
> 
> Any claim to confidentiality is not waived or lost by reason of 
> mistaken transmission of this email.
> 
> Unencrypted email is not secure and may not be authentic.  Western 
> Power cannot guarantee the accuracy, reliability,
> completeness or confidentiality of this email and any attachments.
> 
> VIRUSES - Western Power scans all outgoing emails and attachments 
> for viruses, however it is the recipient's responsibility 
> to ensure this email is free of viruses.
> ========================================================================
========================================================================
Electricity Networks Corporation, trading as Western Power
ABN: 18 540 492 861

TO THE ADDRESSEE - this email is for the intended addressee only and may contain information that is confidential. 
If you have received this email in error, please notify us immediately by return email or by telephone. 
Please also destroy this message and any electronic or hard copies of this message.

Any claim to confidentiality is not waived or lost by reason of mistaken transmission of this email.

Unencrypted email is not secure and may not be authentic.  Western Power cannot guarantee the accuracy, reliability,
completeness or confidentiality of this email and any attachments.

VIRUSES - Western Power scans all outgoing emails and attachments for viruses, however it is the recipient's responsibility 
to ensure this email is free of viruses.
========================================================================