[hobbit] CPU utilisation alerts
Kern, Thomas
Thomas.Kern at hq.doe.gov
Thu Sep 13 06:56:08 CEST 2007
On the mainframes, we are used to lots of tasks waiting for various resources. Main memory, virtual memory, I/O, etc. are all important and can be tuned fairly well. When that tuning isn't quite right, throughput can be degraded, but the pain is not as bad as when the CPU gets overloaded. CPU is one resource that is hard to change. I have run systems that needed to be less than 50% busy, and others where the boss wanted 100% and wished for 110% busy.
The CPU value can also be the first indication of a runaway user or bad database query (this is our most common problem). Once we know from the CPU utilization that something is wrong, we can look for the cause and maybe the other problems are there too.
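For anyone who wants to try the client-side script idea from earlier in this thread (quoted below), here is a minimal Python sketch of the parsing and colouring step. It is not Hobbit's own code. The sample vmstat output, the test name "cpuutil", the host name and the 80/95 thresholds are all made up for illustration, and "CPU utilisation" is taken to mean us + sy. On Linux the first vmstat line is an average since boot, so a real script would probably run something like "vmstat 5 2" and use the last line, which is what the parser below does.

```python
# Illustrative vmstat output (Linux layout); the numbers are invented.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 123456   7890 456789    0    0     5    10  100  200 12  3 84  1  0
 3  0      0 123400   7890 456800    0    0     0    40  350  900 71 22  5  2  0
"""


def parse_vmstat(text):
    """Map the vmstat column names to the integer values of the last sample."""
    rows = [line.split() for line in text.strip().splitlines()]
    # The column header is the row that names the us/sy columns.
    header = next(cols for cols in rows if "us" in cols and "sy" in cols)
    return dict(zip(header, (int(v) for v in rows[-1])))


def cpu_busy(fields):
    """Total CPU utilisation, assumed here to be user + system time."""
    return fields["us"] + fields["sy"]


def colour_for(busy, yellow=80, red=95):
    """Pick a status colour; the thresholds are hypothetical defaults."""
    if busy >= red:
        return "red"
    if busy >= yellow:
        return "yellow"
    return "green"


def status_message(host, busy, colour, test="cpuutil"):
    """Build a 'status host.test colour text' line for the server."""
    return f"status {host}.{test} {colour} CPU utilisation {busy}%"


if __name__ == "__main__":
    busy = cpu_busy(parse_vmstat(SAMPLE))
    print(status_message("myhost", busy, colour_for(busy)))
```

The resulting line would still have to be handed to the Hobbit client's usual status-sending mechanism, which is left out here. Once it arrives as an ordinary status column, the normal alert rules apply to it.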
Thomas Kern
301-903-2211
----- Original Message -----
From: vernon.everett at westernpower.com.au <vernon.everett at westernpower.com.au>
To: hobbit at hswn.dk <hobbit at hswn.dk>
Sent: Thu Sep 13 00:23:43 2007
Subject: Re: [hobbit] CPU utilisation alerts
It might be different in the mainframe world, but in the Unix world you need to look at the run queue length, the IO stats and the CPU utilisation together to get an idea of what's happening.
If your CPU is at 100% and your run queue is still small, it's probably just a hefty process chugging along, like a compile.
If your run queue is huge, and growing, and your CPU isn't yet at 100% you need to look at your IO. Disk, memory, swap, any resource that could be generating contention and IO wait.
If there is major contention for these resources you need to look at adding more, or utilising them differently - spread data across multiple disks or mirror the disk to increase read throughput, that sort of thing.
If your run queue is huge, and growing, and CPU is at 100%, while IO is low, it's probably time to move to a new server, or find the developer and tell him to fix his bugs. :-)
So absolute CPU utilisation on its own isn't particularly meaningful, but if that's what the PHBs want, let's give it to them.
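The rules of thumb above can be written down as a small Python sketch. This is only an illustration of the reasoning, not anything Hobbit does. The cut-offs (95% for "CPU at 100%", more than two runnable tasks per CPU for a "huge" run queue, under 10% IO wait for "IO is low") are assumptions picked for the example, and the sketch looks at a single sample, so it cannot tell whether the queue is still growing.

```python
# Hypothetical cut-offs; tune them for your own systems.
CPU_FULL = 95        # percent busy treated as "CPU at 100%"
QUEUE_PER_CPU = 2    # runnable tasks per CPU before the queue counts as huge
IOWAIT_LOW = 10      # percent IO wait below which IO counts as low


def diagnose(cpu_pct, run_queue, ncpus, iowait_pct):
    """Classify one sample using the run queue, CPU and IO together."""
    cpu_full = cpu_pct >= CPU_FULL
    queue_long = run_queue > QUEUE_PER_CPU * ncpus
    io_low = iowait_pct < IOWAIT_LOW

    if cpu_full and not queue_long:
        # Busy CPU, short queue: a hefty process chugging along.
        return "heavy-process"
    if queue_long and not cpu_full:
        # Long queue, CPU not saturated: look at disk, memory, swap.
        return "io-contention"
    if queue_long and cpu_full and io_low:
        # Long queue, saturated CPU, little IO: bigger server or fix the code.
        return "cpu-bound"
    if queue_long and cpu_full:
        # Everything is busy at once; the rules above do not separate this case.
        return "mixed"
    return "ok"


if __name__ == "__main__":
    print(diagnose(cpu_pct=100, run_queue=1, ncpus=4, iowait_pct=0))
    print(diagnose(cpu_pct=60, run_queue=20, ncpus=4, iowait_pct=30))
    print(diagnose(cpu_pct=100, run_queue=20, ncpus=4, iowait_pct=2))
```

The same CPU figure lands in different buckets depending on the queue and the IO, which is why a bare CPU threshold says so little.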
Regards
Vernon
"Kern, Thomas" <Thomas.Kern at hq.doe.gov> wrote on 13/09/2007 12:07:37 PM:
> I would prefer that the cpu test be the data from the vmstat command
> instead of the load values. I am used to a mainframe system and cpu
> utilization is more useful than queue length. All of my Linux
> systems are guests on a mainframe system so their individual cpu
> utilizations are not as important as the values from my first level
> system and I am working on a client side test for that.
>
>
> Thomas Kern
> 301-903-2211
>
>
> ----- Original Message -----
> From: vernon.everett at westernpower.com.au <vernon.everett at westernpower.com.au>
> To: hobbit at hswn.dk <hobbit at hswn.dk>
> Sent: Wed Sep 12 23:56:43 2007
> Subject: Re: [hobbit] CPU utilisation alerts
>
>
> Hi Thomas
>
> Thanks for your quick response.
>
> A client side script would work, but I was thinking I cannot be the
> first person to need this, and that somebody else has already
> invented the wheel.
> (I hate reinventing stuff)
> Alternatively, I was hoping that Henrik has some magic switch or
> config setting that will make it work.
>
> Regards
> Vernon
>
>
> "Kern, Thomas" <Thomas.Kern at hq.doe.gov> wrote on 13/09/2007 11:46:07 AM:
>
> > I don't know if you can alert off one of the values in one of the
> > trends graphs. That might take some back-end modifications.
> >
> > But you could write a simple client-side script to do the same
> > command that is parsed for the trends graphs (vmstat, I think),
> > totaling the cpu utilization values and sending a simple status
> > message with the appropriate g/y/r color. The hobbit can do the alert.
> >
> >
> > Thomas Kern
> > 301-903-2211
> >
> >
> > ----- Original Message -----
> > From: vernon.everett at westernpower.com.au <vernon.everett at westernpower.com.au>
> > To: hobbit at hswn.dk <hobbit at hswn.dk>
> > Sent: Wed Sep 12 23:36:07 2007
> > Subject: [hobbit] CPU utilisation alerts
> >
> >
> > Hi all
> >
> > I'm baaaack :-)
> > For those who might have missed me, I spent a few months contracting
> > for a company that standardised on BMC Patrol. Wouldn't even look at Hobbit.
> > BMC is a horrible package, expensive, not very extensible, with a
> > huge client footprint and overhead, and is very prone to crashing.
> > Sad product.
> >
> > But no matter, I am now trying to satisfy my new company that Hobbit
> > is the one monitor to rule them all, and my new colleagues have
> > identified a "deficiency".
> >
> > This has probably been asked and answered before, but here is what they want.
> > I have been asked to generate a yellow/red status when absolute CPU
> > utilisation reaches predetermined thresholds.
> > Yes, I know, without looking at the run-queue this figure is not
> > very meaningful, but this is what they want.
> >
> > The la1 graph in the trends column does an excellent job of graphing
> > the CPU utilisation, but how do I configure an alert based on that figure?
> >
> > Regards
> > Vernon
> >
> >
> >
> >
> >
> >
> >
> > ========================================================================
> > Electricity Networks Corporation, trading as Western Power
> > ABN: 18 540 492 861
> >
> > TO THE ADDRESSEE - this email is for the intended addressee only and
> > may contain information that is confidential.
> > If you have received this email in error, please notify us
> > immediately by return email or by telephone.
> > Please also destroy this message and any electronic or hard copies
> > of this message.
> >
> > Any claim to confidentiality is not waived or lost by reason of
> > mistaken transmission of this email.
> >
> > Unencrypted email is not secure and may not be authentic. Western
> > Power cannot guarantee the accuracy, reliability,
> > completeness or confidentiality of this email and any attachments.
> >
> > VIRUSES - Western Power scans all outgoing emails and attachments
> > for viruses, however it is the recipient's responsibility
> > to ensure this email is free of viruses.
> > ========================================================================