[hobbit] Problems running on a Solaris Zone
Dominique Frise
dominique.frise at unil.ch
Wed Feb 24 07:24:39 CET 2010
Agree with this.
We use NOCOLUMNS:cpu,disk,memory for all our sparse zone systems.
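
For example (addresses and hostnames below are just placeholders), the
bb-hosts entries look something like this:

    10.10.1.5    solglobal       # conn ssh
    10.10.1.6    sparsezone1     # conn ssh NOCOLUMNS:cpu,disk,memory
    10.10.1.7    sparsezone2     # conn ssh NOCOLUMNS:cpu,disk,memory
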
Dominique
Vernon Everett wrote:
> Hi James
>
> I put a lot of effort into this recently, and there does not appear to
> be any real practical solution to the problem.
> The problem is caused by how zones use memory and kernel space.
>
> In sparse zones, every zone shares the same kernel. There is only one
> instance of the kernel running, and as a result, only one pool of
> memory visible to it.
>
> When you set a memory cap in your zone definition, and do a prtconf in
> the zone, it reports the value of the memory cap as the available memory.
> So far, so good.
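>
> (For what it's worth, a cap like that is normally set with zonecfg's
> capped-memory resource - roughly along these lines; treat it as a
> sketch and check the resource/property names against your Solaris 10
> release. "myzone" and the 2g figure are just placeholders:
>
>     # zonecfg -z myzone
>     zonecfg:myzone> add capped-memory
>     zonecfg:myzone:capped-memory> set physical=2g
>     zonecfg:myzone:capped-memory> end
>     zonecfg:myzone> commit
> )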
>
> However, to determine free memory, we have to interrogate the kernel.
> This can be done in a number of different ways; Xymon, by default, uses vmstat.
> You can also use kstat -p unix:0:system_pages:freemem, and I am sure
> there are others.
> However, the kernel in question is the kernel running in the global zone!
> It's all one kernel.
> So the reported free memory is the free memory available to that kernel,
> and it should be the same value in all the zones too.
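>
> (To illustrate, a monitor that reads freemem programmatically would go
> through libkstat along these lines - a minimal sketch, and note that
> inside a sparse zone it still returns the global kernel's figure:
>
>     /* sketch: read the kernel's free memory via libkstat (cc ... -lkstat) */
>     #include <stdio.h>
>     #include <unistd.h>
>     #include <kstat.h>
>
>     int main(void)
>     {
>         kstat_ctl_t *kc = kstat_open();
>         if (kc == NULL)
>             return 1;
>
>         /* same module:instance:name as "kstat -p unix:0:system_pages:freemem" */
>         kstat_t *ksp = kstat_lookup(kc, "unix", 0, "system_pages");
>         if (ksp == NULL || kstat_read(kc, ksp, NULL) == -1)
>             return 1;
>
>         /* freemem is a page count; the exact value type may vary by release */
>         kstat_named_t *kn = kstat_data_lookup(ksp, "freemem");
>         if (kn == NULL)
>             return 1;
>
>         long pagesz = sysconf(_SC_PAGESIZE);
>         printf("free: %llu MB\n",
>                (unsigned long long)kn->value.ul * pagesz / (1024 * 1024));
>
>         kstat_close(kc);
>         return 0;
>     }
>
> Whether you run this in the global zone or in a sparse zone, it is the
> same kernel answering.)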
>
> The error you are seeing occurs when the free memory available to the global
> kernel is more than the memory cap you have placed on the zone.
> In C (and many other programming languages), if you subtract a bigger number
> from a smaller one, you can get strange results depending on how your
> variables are defined: an unsigned variable wraps around instead of going
> negative. I think that's where your multi-petabyte memory is coming from.
> Any programmers out there who can confirm this?
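>
> (Here is a minimal sketch of that wraparound; the free figure is made
> up, chosen only so that it exceeds the 26624M total in your report:
>
>     #include <stdio.h>
>
>     int main(void)
>     {
>         /* hypothetical figures: the zone's cap is reported as "total",
>          * while the shared kernel reports more memory free than the cap */
>         unsigned long total_mb = 26624;   /* cap, as seen by prtconf in the zone */
>         unsigned long free_mb  = 27734;   /* global kernel's free memory (assumed) */
>
>         /* an unsigned variable cannot go negative, so subtracting the
>          * larger value from the smaller one wraps around instead */
>         unsigned int used_mb = (unsigned int)(total_mb - free_mb);
>
>         printf("used: %uM\n", used_mb);   /* prints: used: 4294966186M */
>         return 0;
>     }
>
> which is exactly the kind of number showing up in your memory column.)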
>
> The other problem this creates is that any sane-looking zone memory
> percentages are meaningless. They do not represent the true memory
> utilisation within the zone. Your zone memory utilisation could be 100%,
> and you would not realise it, because the kernel is still seeing heaps
> of free memory and reporting lots free.
> Imagine a 2GB cap, with the apps in the zone using all 2GB, while the
> kernel can still see 1.8GB free.
> Do the maths: 2GB total minus 1.8GB "free" looks like roughly 0.2GB used,
> so Xymon tells us your zone is only using about 10% of its memory,
> which is far from the truth.
>
> The only real way around it might not fit with your policies and methods:
> remove all memory caps.
> This lets all memory float, meaning that the memory "seen" in each zone is
> the same as what the kernel sees, and Solaris does the management of
> memory itself, ensuring all zones get enough.
> It also means that all of the zones will show identical memory graphs.
>
> The other way, which I haven't had time to do yet, is to use prstat -Z
> in the global zone.
> This gives a summary of what the zones are using, which might be worth
> tracking.
>
> As a short-term workaround, because we need memory caps for certain
> apps, we have skipped memory monitoring on the zones. (It's pretty
> meaningless anyway - see above.)
> We have the global zone and, below it, all the zones with the
> NOCOLUMNS:memory bb-hosts tag.
>
> It's not really ideal, but I hope to find time to revisit this in the
> near future.
>
> It would be nice to be able to disable just the memory test on these,
> and only keep an eye on swap. Swap is local to the zone, and if you
> start using heaps of it in the zone, or are doing lots of paging,
> chances are you are maxing out your memory allocation.
> So swap is probably a good indicator.
>
> Sorry I could not be of any more help.
>
> Regards
> Vernon
>
>
>
>
> On Wed, Feb 24, 2010 at 1:35 AM, James Wade <jkwade at futurefrontiers.com> wrote:
>
>
>     Has anyone seen this problem? I’ve just compiled 4.3.0.0.beta2 on a
>     Solaris 10 system. I’m running on a Sun T5120 series in a Solaris
>     sparse zone.
>
>
>     When I run the server, I get the following on the memory test.
>     FYI, I don’t have 4.2 petabytes of memory :)
>
>
>     Has anyone seen similar problems? Running the client in the
>     global zone works fine.
>
>
>     Tue Feb 23 10:52:43 CST 2010 - Memory CRITICAL
>
>              Memory            Used     Total   Percentage
>     red      Physical   4294966186M    26624M  4294967292%
>     green    Swap              148M    26623M           0%
>
> Thanks…James
>
>