
Re: [hobbit] Out of range memory report



Henrik Stoerner wrote:

On Thu, Aug 03, 2006 at 11:37:02AM +0100, Colin Spargo wrote:


I got this in the "memory" column for a Solaris 8 host this morning, which caused it to go red (even though I have the threshold set to 101).

Thu Aug 3 09:11:17 BST 2006 - Memory CRITICAL
  Memory              Used       Total  Percentage
red Physical     4294955003M     131072M 4294967287%
green Swap              40973M     144024M         28%


That physical memory calculation is obviously incorrect!



Yep, but the data it got were weird. Colin sent me some additional data from the client message. The interesting bits are here:

The Solaris prtconf command is used to determine the amount of RAM
in the box. Here it says:

[prtconf]
System Configuration:  Sun Microsystems  sun4u
Memory size: 131072 Megabytes

So this box has 131072 MB. (128 GB - a lot, I might add. Is this really
true?)

The command "vmstat 1 2|tail -1" is used to grab the current memory
usage:

[memory]
0 0 0 211046168 146806184 744 6249 0 0 0 0 0 0 0 0  0 2692 436955 11454 8 6 86

Column 5 is the "free memory" column in KB, here: 146806184 KB. Divide
by 1024 to get MB, and it gives 143365 MB free.

Now ... how can a box with 131072 MB RAM end up with 143365 MB free?
That's about 12 GB more than what is physically installed in the box.
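
For anyone who wants to check the arithmetic, here is a small Python sketch of it. It is not Hobbit's code; the function name and constants are made up for illustration, and it only assumes that field 5 of the vmstat line is free memory in KB, as described above.

```python
# Illustration only: names here are hypothetical, not taken from Hobbit.
VMSTAT_LINE = ("0 0 0 211046168 146806184 744 6249 0 0 0 0 0 0 0 0  "
               "0 2692 436955 11454 8 6 86")
PRTCONF_MB = 131072  # "Memory size" reported by prtconf, in MB


def free_mb_from_vmstat(line):
    """Return free memory in MB from one line of 'vmstat 1 2 | tail -1'."""
    fields = line.split()
    free_kb = int(fields[4])   # column 5 (index 4) is free memory in KB
    return free_kb // 1024     # KB -> MB, dropping the fraction


free_mb = free_mb_from_vmstat(VMSTAT_LINE)
excess_mb = free_mb - PRTCONF_MB   # how much "free" exceeds installed RAM

print(free_mb, "MB free")
print(excess_mb, "MB more than prtconf reports")
```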

Hobbit then gets a negative value for the amount of memory used, and
because it is then used in a calculation with some unsigned variables
it blows up and comes up with this hilarious value of the amount of
memory used.
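
The two odd numbers in the report can be reproduced with a short Python sketch. It assumes the values end up in 32-bit unsigned variables and that the percentage is computed with C-style integer division; the actual order of operations inside Hobbit is not shown in this thread, so treat this as a plausible reconstruction and nothing more.

```python
# Illustration only: emulates C arithmetic in Python, assuming 32-bit unsigned
# storage. The helper names are hypothetical.
MASK32 = 0xFFFFFFFF


def as_uint32(value):
    """Reinterpret a (possibly negative) integer as a 32-bit unsigned value."""
    return value & MASK32


def c_div(a, b):
    """Integer division that truncates toward zero, as C does (Python's // floors)."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


total_mb = 131072                  # from prtconf
free_mb = 143365                   # from vmstat, converted to MB
used_mb = total_mb - free_mb       # negative: more "free" than installed
pct = c_div(100 * used_mb, total_mb)

print(used_mb, pct)                        # the signed values
print(as_uint32(used_mb), as_uint32(pct))  # what an unsigned variable would hold
```

Under those assumptions the unsigned values come out the same as the "Used" and "Percentage" figures in the red Physical line of the report.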


Now, I'll admit that Hobbit should probably do a sanity check on the data so it doesn't trigger alerts in these circumstances. But the core problem is that your box is reporting some weird data.
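
A sanity check of the kind mentioned above could look roughly like this in Python. This is only a sketch of the idea, with a hypothetical function name; it is not what Hobbit does or will do, and how a rejected reading should be displayed is left open.

```python
# Illustration only: a hypothetical guard, not Hobbit's implementation.
def physical_memory_usage(total_mb, free_mb):
    """Return (used_mb, percent_used), or None if the inputs are implausible."""
    if total_mb <= 0 or free_mb < 0 or free_mb > total_mb:
        # More free memory than installed RAM (or nonsense input):
        # refuse to compute rather than report a bogus value.
        return None
    used_mb = total_mb - free_mb
    return used_mb, (100 * used_mb) // total_mb


print(physical_memory_usage(131072, 143365))  # the data from this host
print(physical_memory_usage(131072, 32768))   # a plausible reading
```

With a check like this, the reading from this host is rejected instead of being turned into a red status.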


Regards, Henrik



It sounds to me like vmstat is reporting the total memory as "real + disk", which I believe is what the 4th column shows (I don't have a Solaris server where I am at the moment to confirm). So, while Hobbit keys on the physical (real) memory to determine the total, it's keying on a computed value to do the differential. I've seen this on occasion under HPUX and Linux with the memory tests for BB. I've usually just gone in and tweaked the scripts when that would happen.

=G=