[Xymon] Improving memory monitoring

Mike Burger mburger at bubbanfriends.org
Tue Apr 14 16:11:32 CEST 2015


Forgot to reply all.

On 2015-04-14 10:00 am, Mike Burger wrote:
> On 2015-04-14 7:56 am, Steve Hill wrote:
>> I'm working on improving my Xymon configuration to reduce the number
>> of false alerts that we get.  In particular, memory monitoring is a
>> bit of a problem so I'm hoping someone will be able to offer some
>> advice.
>> 
>> At the moment, Xymon is set up with something like:
>> 
>> MEMPHYS 100 101
>> MEMSWAP 20 40
>> MEMACT 95 97
>> 
>> I pretty much don't care about MEMPHYS.  The problem with MEMSWAP and
>> MEMACT is that they work independently of each other - i.e. the above
>> will give me an alert if > 97% of the RAM is used OR > 40% of swap is
>> used.
>> 
>> However, this results in warnings for systems that have a lot of idle
>> data in memory.  The Linux kernel will page out idle data (increasing
>> swap usage and reducing RAM usage) and use that space for
>> buffers/caches, and this is a very sensible strategy.  Unfortunately,
>> then Xymon comes along and notices that there's lots of swap in use
>> and throws an alert, even though there's plenty of RAM free.
>> 
>> Basically, I don't care that a machine is 4GB into swap if it has 5GB
>> of free RAM - that isn't a problem, it just means there's quite a lot
>> of idle data that the kernel has decided can be paged out.  I do care
>> if it's 4GB into swap and only has 0.5GB of free RAM since this would
>> indicate that it's actually short of memory.
>> 
>> What I really need is to warn if > x% of the RAM is used AND > y% of
>> swap is used - is there a way to do that?
> 
> I'll say that I've never run into this...I've never had a system swap
> memory out to disk unless active memory was utilized at a high
> percentage...in either AIX or Linux.
> 
> In AIX, there is some sort of algorithm in place where, if a process'
> memory has been swapped out and then swapped back in, the memory
> manager holds onto the paging space until either something else needs
> paging space or the previously swapped out process ends, but I don't
> think I've ever seen a situation in Linux where idle memory pages were
> swapped to disk and physical/active memory had some large percentage
> free.
> 
> Now, on the other side of this, to take a stab at the question, I'd
> wager that, at present, you'd need to script such a test/alert, but I
> would agree that it would be useful to be able to set an "alarm if
> this or this" or an "alarm if this and this" type of scenario. At
> present, the only tests I can think of that allow this "out of the
> box" are the process monitors, where you can set minimum and maximum
> thresholds.
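
For what it's worth, here is a rough sketch of what such a scripted
"this AND this" test could look like. It is only an illustration, not
anything that ships with Xymon: the function names are made up, the
default thresholds are just the MEMACT (95/97) and MEMSWAP (20/40)
values from Steve's example, and "RAM used" is assumed to mean MemTotal
minus MemAvailable from /proc/meminfo (falling back to MemFree + Buffers
+ Cached on kernels that don't report MemAvailable).

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def usage_percentages(fields):
    """Return (ram_used_pct, swap_used_pct) from parsed meminfo fields."""
    total = fields["MemTotal"]
    available = fields.get("MemAvailable")
    if available is None:
        # Assumption: approximate "available" on kernels without MemAvailable.
        available = (fields["MemFree"] + fields.get("Buffers", 0)
                     + fields.get("Cached", 0))
    ram_pct = 100.0 * (total - available) / total

    swap_total = fields.get("SwapTotal", 0)
    if swap_total == 0:
        swap_pct = 0.0  # no swap configured, so swap can never trigger
    else:
        swap_pct = 100.0 * (swap_total - fields.get("SwapFree", 0)) / swap_total
    return ram_pct, swap_pct


def combined_color(ram_pct, swap_pct, warn=(95, 20), crit=(97, 40)):
    """Alert only when BOTH the RAM and the swap threshold are exceeded.

    warn and crit are (ram_pct, swap_pct) pairs; the defaults are
    illustrative values borrowed from the thread.
    """
    if ram_pct > crit[0] and swap_pct > crit[1]:
        return "red"
    if ram_pct > warn[0] and swap_pct > warn[1]:
        return "yellow"
    return "green"
```

On a Linux client you would feed it the contents of /proc/meminfo, e.g.
combined_color(*usage_percentages(parse_meminfo(open("/proc/meminfo").read()))),
and report the resulting color from a client-side extension script. With
this logic a box that is 4GB into swap but still has 5GB of RAM
available stays green, which is the behaviour Steve was asking for.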

-- 
Mike Burger
http://www.bubbanfriends.org

"It's always suicide-mission this, save-the-planet that. No one ever 
just stops by to say 'hi' anymore." --Colonel Jack O'Neill, SG1



