[Xymon] Improving memory monitoring
Mike Burger
mburger at bubbanfriends.org
Tue Apr 14 16:11:32 CEST 2015
Forgot to reply all.
On 2015-04-14 10:00 am, Mike Burger wrote:
> On 2015-04-14 7:56 am, Steve Hill wrote:
>> I'm working on improving my Xymon configuration to reduce the number
>> of false alerts that we get. In particular, memory monitoring is a
>> bit of a problem so I'm hoping someone will be able to offer some
>> advice.
>>
>> At the moment, Xymon is set up with something like:
>>
>> MEMPHYS 100 101
>> MEMSWAP 20 40
>> MEMACT 95 97
>>
>> I pretty much don't care about MEMPHYS. The problem with MEMSWAP and
>> MEMACT is that they work independently of each other - i.e. the above
>> will give me an alert if > 97% of the RAM is used OR > 40% of swap is
>> used.
>>
>> However, this results in warnings for systems that have a lot of idle
>> data in memory. The Linux kernel will page out idle data (increasing
>> swap usage and reducing RAM usage) and use that space for
>> buffers/caches, and this is a very sensible strategy. Unfortunately,
>> then Xymon comes along and notices that there's lots of swap in use
>> and throws an alert, even though there's plenty of RAM free.
>>
>> Basically, I don't care that a machine is 4GB into swap if it has 5GB
>> of free RAM - that isn't a problem, it just means there's quite a lot
>> of idle data that the kernel has decided can be paged out. I do care
>> if it's 4GB into swap and only has 0.5GB of free RAM since this would
>> indicate that it's actually short of memory.
>>
>> What I really need is to warn if > x% of the RAM is used AND > y% of
>> swap is used - is there a way to do that?
>
> I'll say that I've never run into this...I've never had a system swap
> memory out to disk unless active memory was utilized at a high
> percentage...in either AIX or Linux.
>
> In AIX, there is some sort of algorithm in place where, if a process'
> memory has been swapped out and then swapped back in, the memory
> manager holds onto the paging space until either something else needs
> paging space or the previously swapped out process ends, but I don't
> think I've ever seen a situation in Linux where idle memory pages were
> swapped to disk and physical/active memory had some large percentage
> free.
>
> Now, on the other side of this, to take a stab at the question, I'd
> wager that, at present, you'd need to script such a test/alert...but I
> would agree that it would be useful to be able to set an "alarm if
> this or this" or an "alarm if this and this" type scenario. At
> present, the only tests I can think of that allow this "out of the
> box" are the process monitors, where you can set minimum and maximum
> thresholds.
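
For what it's worth, here is a rough sketch of what such a scripted
"this AND this" test could look like. It is only an illustration, not
anything Xymon ships: the function names are made up, the default
thresholds are just the MEMACT (95/97) and MEMSWAP (20/40) values from
Steve's example, and "actual" RAM use is assumed to be MemTotal minus
MemFree, Buffers and Cached as read from /proc/meminfo on Linux.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> kB."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if not sep or not parts:
            continue  # skip blank or malformed lines
        fields[name.strip()] = int(parts[0])
    return fields


def combined_memory_color(info, ram_warn=95.0, ram_crit=97.0,
                          swap_warn=20.0, swap_crit=40.0):
    """Return 'red', 'yellow' or 'green'.

    Hypothetical helper: it only alerts when BOTH the RAM threshold and
    the swap threshold of a given level are exceeded.
    """
    total = info["MemTotal"]
    # Assumption: buffers and page cache are reclaimable, so they are
    # not counted as "actual" use.
    used = total - info["MemFree"] - info.get("Buffers", 0) - info.get("Cached", 0)
    ram_pct = 100.0 * used / total

    swap_total = info.get("SwapTotal", 0)
    if swap_total:
        swap_pct = 100.0 * (swap_total - info.get("SwapFree", 0)) / swap_total
    else:
        swap_pct = 0.0  # no swap configured

    if ram_pct > ram_crit and swap_pct > swap_crit:
        return "red"
    if ram_pct > ram_warn and swap_pct > swap_warn:
        return "yellow"
    return "green"


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        print(combined_memory_color(parse_meminfo(f.read())))
```

A client-side extension script would still have to wrap this and send
the resulting color to the Xymon server as its own status column; that
part is left out here.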
--
Mike Burger
http://www.bubbanfriends.org
"It's always suicide-mission this, save-the-planet that. No one ever
just stops by to say 'hi' anymore." --Colonel Jack O'Neill, SG1