[hobbit] Problems running on a Solaris Zone

Dominique Frise dominique.frise at unil.ch
Wed Feb 24 07:24:39 CET 2010


Agree with this.
We use NOCOLUMNS:cpu,disk,memory for all our sparse zone systems.

Dominique



Vernon Everett wrote:
> Hi James
> 
> I put a lot of effort into this recently, and there does not appear to 
> be any real practical solution to the problem.
> The problem is caused by how zones use memory and kernel space.
> 
> In sparse zones, all zones share the same kernel. There is only one 
> instance of the kernel running, and as a result, only one chunk of 
> memory visible to the kernel.
> 
> When you set a memory cap in your zone definition, and do a prtconf in 
> the zone, it reports the value of the memory cap as the available memory.
> So far, so good.
> 
> However, to determine free memory, we have to interrogate the kernel. 
> This can be done a number of different ways. Xymon, by default, uses vmstat.
> You can also use kstat -p unix:0:system_pages:freemem and I am sure 
> there are others.
> However, the kernel in question is the kernel running in the global zone!
> It's all one kernel.
> So the reported free memory is the free memory available to that kernel. 
> It should be the same value in all the zones too.
> 
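As a rough illustration of the kstat route mentioned above, here is a small Python sketch that turns one line of kstat -p output into megabytes. The freemem counter is in pages, so a page size is needed; the 8 KB default is an assumption (typical for SPARC, check with the pagesize command), the counter value in the example is made up, and the function name is hypothetical.

```python
def freemem_mb(kstat_line, page_size=8192):
    """Convert a 'kstat -p unix:0:system_pages:freemem' line to megabytes.

    kstat -p prints 'module:instance:name:statistic<TAB>value'.
    page_size is an ASSUMPTION (8 KB); verify it with pagesize(1).
    """
    name, value = kstat_line.split()
    if name != "unix:0:system_pages:freemem":
        raise ValueError("unexpected kstat statistic: " + name)
    return int(value) * page_size / (1024 * 1024)

# Made-up sample value, for illustration only
print(freemem_mb("unix:0:system_pages:freemem\t3549952"))  # 27734.0
```

Run inside any zone, this would report the global kernel's free memory, not the zone's.
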
> The error you are seeing occurs when free memory available to the global 
> kernel is more than the memory cap you have placed on the zone.
> In C (and many other programming languages), if you subtract a bigger 
> number from a smaller one, the negative result can show up as a huge 
> positive value, depending on how your variables are defined (unsigned, 
> for example). I think that's where your multi-petabyte memory is coming 
> from. Any programmers out there who can confirm this?
> 
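A minimal sketch of how the reported figures could arise, written in Python with 32-bit unsigned behaviour emulated by masking. The free-memory value of 27734M is an assumption, back-calculated so that the numbers match the report; it does not appear in the original mail. The helper names are hypothetical and this is not taken from the Xymon source.

```python
MASK32 = 0xFFFFFFFF  # emulate a 32-bit unsigned integer


def as_u32(value):
    """Reinterpret a (possibly negative) integer as a 32-bit unsigned value."""
    return value & MASK32


def memory_report(total_mb, free_mb):
    """Return (used, percentage) as a 32-bit unsigned display would show them."""
    used = total_mb - free_mb          # negative when free memory exceeds the cap
    pct = int(100 * used / total_mb)   # int() truncates toward zero, like C division
    return as_u32(used), as_u32(pct)


# total is the zone's cap from the report; free is an ASSUMED global-zone value
used, pct = memory_report(total_mb=26624, free_mb=27734)
print(f"Physical {used}M 26624M {pct}%")
# prints: Physical 4294966186M 26624M 4294967292%
```

With those inputs, used is -1110 and the percentage is -4 before being shown as unsigned, which matches the red Physical row in the original report.
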
> The other problem this creates is that any sane-looking zone memory 
> percentages are meaningless. They do not represent the true memory 
> utilisation within the zone. Your zone memory utilisation could be 100%, 
> and you would not realise it, because the kernel is still seeing heaps 
> of free memory, and reporting lots free.
> Imagine a 2GB cap, and the apps in the zone are using all 2GB.
> However, the kernel can see 1.8GB free.
> Do the maths. Xymon tells us your zone is only using 10% of memory, 
> which is far from the truth.
> 
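The arithmetic in that example, spelled out as a short Python sketch. The function names are hypothetical, and the formula is simply the one the paragraph above implies (cap minus kernel-reported free memory, divided by the cap), not something lifted from Xymon itself.

```python
def apparent_used_pct(cap_gb, kernel_free_gb):
    """Utilisation derived from the cap and the global kernel's free memory."""
    return 100 * (cap_gb - kernel_free_gb) / cap_gb


def true_used_pct(cap_gb, zone_used_gb):
    """Utilisation based on what the zone's own processes actually use."""
    return 100 * zone_used_gb / cap_gb


# 2GB cap, zone apps using all 2GB, kernel reporting 1.8GB free
print(round(apparent_used_pct(2.0, 1.8)))  # 10
print(round(true_used_pct(2.0, 2.0)))      # 100
```
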
> The only real way round it might not fit with your policies and methods.
> You need to remove all memory caps.
> This floats all memory, meaning that the memory "seen" in the zone is 
> the same as that seen by the kernel, and Solaris does the management of 
> memory, ensuring all zones get enough.
> It also means that all of the zones will show identical memory graphs.
> 
> The other way, which I haven't had time to do yet, is to use prstat -Z 
> in the global zone.
> This gives a summary of what the zones are using, which might be worth 
> tracking.
> 
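A possible starting point for the prstat -Z idea, as a Python sketch that pulls per-zone RSS out of captured prstat -Z output. The column layout (a header line starting with ZONEID and containing RSS and ZONE columns, followed by a Total: line) is an assumption based on typical Solaris 10 output, the sample text is invented for illustration, and the function names are hypothetical.

```python
def parse_size_mb(field):
    """Convert a prstat size such as '512K', '201M' or '2G' to megabytes."""
    units = {"K": 1 / 1024, "M": 1, "G": 1024, "T": 1024 * 1024}
    suffix = field[-1].upper()
    if suffix not in units:
        raise ValueError("unrecognised size: " + field)
    return float(field[:-1]) * units[suffix]


def parse_zone_summary(text):
    """Return {zone name: RSS in MB} from the zone summary of 'prstat -Z'."""
    zones = {}
    header = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "ZONEID":      # start of the per-zone summary
            header = fields
            continue
        if header is None:             # still in the per-process listing
            continue
        if fields[0] == "Total:":      # end of the summary
            break
        if len(fields) != len(header):
            continue
        row = dict(zip(header, fields))
        zones[row["ZONE"]] = parse_size_mb(row["RSS"])
    return zones


# Invented sample output, ASSUMED layout; real output will differ
SAMPLE = """\
ZONEID    NPROC  SWAP   RSS MEMORY      TIME  CPU ZONE
     0       52  310M  402M   1.5%   1:12:04 0.4% global
     3       38 1900M 1843M   6.9%   0:45:10 2.1% webzone
Total: 90 processes, 412 lwps, load averages: 0.31, 0.28, 0.25
"""

print(parse_zone_summary(SAMPLE))
```

The RSS per zone could then be compared against each zone's cap from the global zone, which avoids asking the shared kernel from inside the zone.
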
> As a short-term workaround, because we need memory caps for certain 
> apps, we have skipped memory monitoring on the zones. (It's pretty 
> meaningless anyway - see above)
> We have the global zone, and below it, all the zones, with the 
> NOCOLUMNS:memory bb-hosts tag.
> 
> It's not really ideal, but I hope to find time to revisit this in the 
> near future.
> 
> It would be nice to be able to disable just the memory test on these, 
> and only keep an eye on swap. Swap is local to the zone, and if you 
> start using heaps of it in the zone, or are doing lots of paging, 
> chances are you are maxing out your memory allocation.
> So swap is probably a good indicator.
> 
> Sorry I could not be of any more help.
> 
> Regards
>      Vernon
> 
> 
> 
> 
> On Wed, Feb 24, 2010 at 1:35 AM, James Wade <jkwade at futurefrontiers.com> wrote:
> 
> 
>     Has anyone seen this problem? I’ve just compiled 4.3.0.0.beta2 on a
>     Solaris 10 system. I’m running on a Sun T5120 series in a Solaris
>     sparse zone.
> 
>     When I run the server, I get the following on the memory test.
>     FYI, I don’t have 4.2 petabytes of memory :-)
> 
>     Has anyone seen similar problems? Running the client in the
>     global zone works fine.
> 
>     Tue Feb 23 10:52:43 CST 2010 - Memory CRITICAL
> 
>           Memory              Used       Total  Percentage
>     red   Physical     4294966186M      26624M 4294967292%
>     green Swap                148M      26623M          0%
> 
>     Thanks…James
> 
> 


