[Xymon] Memory data handling from Windows clients
john.r.rothlisberger at accenture.com
Thu Jan 28 21:04:26 CET 2016
We recently experienced a series of memory alerts on our Windows servers that didn't make sense. Although Xymon indicated that virtual memory was completely used, logging into the server itself showed no issues.
At first I thought something was wrong with the data the XymonPS client was sending. Working with Zak Beck, we found a small piece of code meant to resolve a "bbwin bug".
Flipping a 0 to a 1 reversed the order in which the numbers were reported, but this didn't resolve the problem.
It became obvious that the data sent by the XymonPS client was correct, and that the server was handling it differently once it arrived.
Below is a detailed look at what we were seeing and how I corrected it. The maintainers may or may not want to include these corrections in a future release.
Here is what the XymonPS data looked like when we were getting alerts.
As you can see there is nothing that would indicate a problem:
[memory]
memory Total Used
physical: 4095 1266
virtual: 4506 4111 <--- These are the numbers we are interested in.
page: 8600 5297
This is what the Xymon server used to create our alert:
Memory Used Total Percentage
Physical 1266M 4095M 30%
Actual 4111M >>> 4095M <<< 100% <--- Here is the alert
Swap 5297M 8600M 61%
As you can see, the "Total" shown for "Actual" memory is the same as the total physical memory. That is not right: the total for Actual memory should have been 4506M, the virtual total reported by the client.
Just for the sake of completeness, here is what the memory data looks like from a Linux client (there is no issue here):
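To make the discrepancy concrete, here is a minimal sketch of the percentage arithmetic using the sample numbers above. The helper names pct_used and show_actual_pct are hypothetical; the expression mirrors the integer calculation visible in the xymond_client.c patch further down.

```c
#include <stdio.h>

/* Hypothetical helper: integer percentage, (100 * used) / total,
 * returning 0 when the total is zero or negative. */
long pct_used(long used, long total)
{
    return (total > 0) ? ((100 * used) / total) : 0;
}

/* Print the "Actual" percentage for the Windows sample using
 * both candidate divisors. */
void show_actual_pct(void)
{
    long memphystotal = 4095;  /* physical total from the client data */
    long memacttotal  = 4506;  /* virtual total from the client data  */
    long memactused   = 4111;  /* virtual used from the client data   */

    printf("against physical total: %ld%%\n", pct_used(memactused, memphystotal));
    printf("against virtual total:  %ld%%\n", pct_used(memactused, memacttotal));
}
```

Dividing by the physical total truncates to 100%, which matches the alert; dividing by the virtual total gives 91%.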
Linux data:
[free]
total used free shared buffers cached
Mem: 16556808 4895408 11661400 0 259376 2508448
-/+ buffers/cache: 2127584 14429224
Swap: 1340412 0 1340412
Displayed:
Memory Used Total Percentage
Physical 4780M 16168M 29%
Actual 2077M 16168M 12%
Swap 0M 1308M 0%
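For reference, the displayed Linux rows can be reproduced from the raw [free] values with a small sketch. It assumes the `free` figures are in KiB and that the server truncates to whole MiB and whole percent; that assumption matches this sample but is not taken from the Xymon source. All function names here are hypothetical.

```c
#include <stdio.h>

/* Assumption: input is KiB, output is whole MiB (truncated). */
long kib_to_mib(long kib)
{
    return kib / 1024;
}

/* Integer percentage, truncated, with a guard for an empty total. */
long percent_of(long used_mib, long total_mib)
{
    return (total_mib > 0) ? ((100 * used_mib) / total_mib) : 0;
}

/* Recompute the three displayed rows from the sample `free` output. */
void print_linux_rows(void)
{
    long phystotal = kib_to_mib(16556808L);
    long physused  = kib_to_mib(4895408L);
    long actused   = kib_to_mib(2127584L);  /* "-/+ buffers/cache" used column */
    long swaptotal = kib_to_mib(1340412L);
    long swapused  = kib_to_mib(0L);

    /* "Actual" is measured against the physical total on Linux. */
    printf("Physical %ldM %ldM %ld%%\n", physused, phystotal, percent_of(physused, phystotal));
    printf("Actual   %ldM %ldM %ld%%\n", actused, phystotal, percent_of(actused, phystotal));
    printf("Swap     %ldM %ldM %ld%%\n", swapused, swaptotal, percent_of(swapused, swaptotal));
}
```

On Linux, "Actual" is physical memory in use excluding buffers and cache, so dividing by the physical total is reasonable there.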
So I had to find out why the total physical memory was used to calculate the percentage used for "Actual" memory when the client is a Windows server (XymonPS or BBWin).
What I found was that although unix_memory_report() has a parameter for memactused, there was nothing for memacttotal. So I added a new parameter for memacttotal, which then required me to pass the same value from bbwin.c (see below).
This also required a modification to linux.c (see below), which, instead of supplying a separate memacttotal, simply reuses the value of memphystotal. That is essentially what the code does today.
Although I have provided patches only for bbwin.c and linux.c, ALL of the xymond/client/*.c files need to be modified.
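As an illustration of where memacttotal comes from, here is a standalone sketch that pulls the total/used pairs out of a [memory] section like the one shown earlier. It is not the bbwin.c code: the struct, the function name, and the use of plain strstr()/sscanf() are assumptions made for the example, and only the "label: total used" layout is taken from the client data above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the values in a [memory] section. */
struct winmem {
    long phystotal, physused;
    long acttotal,  actused;
    long swaptotal, swapused;
};

/* Parse "label: total used" rows. Returns the number of complete rows
 * found (0..3). Values that are not found stay at -1, the same
 * "no data" marker the server code tests memactused against. */
int parse_winmem(const char *section, struct winmem *m)
{
    const char *p;
    int rows = 0;

    m->phystotal = m->physused = -1;
    m->acttotal  = m->actused  = -1;
    m->swaptotal = m->swapused = -1;

    p = strstr(section, "physical:");
    if (p && sscanf(p, "physical: %ld %ld", &m->phystotal, &m->physused) == 2) rows++;

    p = strstr(section, "virtual:");
    if (p && sscanf(p, "virtual: %ld %ld", &m->acttotal, &m->actused) == 2) rows++;

    p = strstr(section, "page:");
    if (p && sscanf(p, "page: %ld %ld", &m->swaptotal, &m->swapused) == 2) rows++;

    return rows;
}
```

With the sample data, acttotal comes back as 4506 and actused as 4111, so the virtual total is available on the server and only needs to be passed through to the report.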
xymond_client.c patch:
--- xymond_client.c.ORIG 2016-01-27 09:53:58.321793949 -0600
+++ xymond_client.c 2016-01-28 12:57:51.676006980 -0600
@@ -909,7 +909,7 @@
void unix_memory_report(char *hostname, char *clientclass, enum ostype_t os,
void *hinfo, char *fromline, char *timestr,
- long memphystotal, long memphysused, long memactused,
+ long memphystotal, long memphysused, long memacttotal, long memactused,
long memswaptotal, long memswapused)
{
long memphyspct = 0, memswappct = 0, memactpct = 0;
@@ -938,7 +938,7 @@
if (memswappct > swapred) swapcolor = COL_RED;
}
- if (memactused != -1) memactpct = (memphystotal > 0) ? ((100 * memactused) / memphystotal) : 0;
+ if (memactused != -1) memactpct = (memacttotal > 0) ? ((100 * memactused) / memacttotal) : 0;
if (memactpct <= 100) {
if (memactpct > actyellow) actcolor = COL_YELLOW;
if (memactpct > actred) actcolor = COL_RED;
@@ -965,30 +965,30 @@
memorysummary);
addtostatus(msgline);
- sprintf(msgline, " %-12s%12s%12s%12s\n", "Memory", "Used", "Total", "Percentage");
+ sprintf(msgline, " %-16s%12s%12s%12s\n", "Memory", "Used", "Total", "Percentage");
addtostatus(msgline);
- sprintf(msgline, "&%s %-12s%11ldM%11ldM%11ld%%\n",
- colorname(physcolor), "Physical", memphysused, memphystotal, memphyspct);
+ sprintf(msgline, "&%s %-16s%11ldM%11ldM%11ld%%\n",
+ colorname(physcolor), "Real/Physical", memphysused, memphystotal, memphyspct);
addtostatus(msgline);
if (memactused != -1) {
if (memactpct <= 100)
- sprintf(msgline, "&%s %-12s%11ldM%11ldM%11ld%%\n",
- colorname(actcolor), "Actual", memactused, memphystotal, memactpct);
+ sprintf(msgline, "&%s %-16s%11ldM%11ldM%11ld%%\n",
+ colorname(actcolor), "Actual/Virtual", memactused, memacttotal, memactpct);
else
- sprintf(msgline, "&%s %-12s%11ldM%11ldM%11ld%% - invalid data\n",
- colorname(COL_CLEAR), "Actual", memactused, memphystotal, 0L);
+ sprintf(msgline, "&%s %-16s%11ldM%11ldM%11ld%% - invalid data\n",
+ colorname(COL_CLEAR), "Actual/Virtual", memactused, memacttotal, 0L);
addtostatus(msgline);
}
if (memswapused != -1) {
if (memswappct <= 100)
- sprintf(msgline, "&%s %-12s%11ldM%11ldM%11ld%%\n",
- colorname(swapcolor), "Swap", memswapused, memswaptotal, memswappct);
+ sprintf(msgline, "&%s %-16s%11ldM%11ldM%11ld%%\n",
+ colorname(swapcolor), "Swap/Page", memswapused, memswaptotal, memswappct);
else
- sprintf(msgline, "&%s %-12s%11ldM%11ldM%11ld%% - invalid data\n",
+ sprintf(msgline, "&%s %-16s%11ldM%11ldM%11ld%% - invalid data\n",
colorname(COL_CLEAR), "Swap", memswapused, memswaptotal, 0L);
addtostatus(msgline);
bbwin.c patch:
--- bbwin.c.ORIG 2016-01-27 09:54:13.989793661 -0600
+++ bbwin.c 2016-01-27 09:54:17.709793593 -0600
@@ -487,7 +487,7 @@
if (p) sscanf(p, "\nvirtual: %ld %ld", &memacttotal, &memactused);
dbgprintf("DEBUG Memory %ld %ld %ld %ld %ld\n", memphystotal, memphysused, memactused, memswaptotal, memswapused); /* DEBUG TODO Remove*/
unix_memory_report(hostname, clienttype, os, hinfo, fromline, timestr,
- memphystotal, memphysused, memactused, memswaptotal, memswapused);
+ memphystotal, memphysused, memacttotal, memactused, memswaptotal, memswapused);
}
splitmsg_done();
linux.c patch:
--- linux.c.ORIG 2016-01-27 09:54:23.861793480 -0600
+++ linux.c 2016-01-27 09:54:27.545793413 -0600
@@ -135,7 +135,7 @@
}
unix_memory_report(hostname, clienttype, os, hinfo, fromline, timestr,
- memphystotal, memphysused, memactused, memswaptotal, memswapused);
+ memphystotal, memphysused, memphystotal, memactused, memswaptotal, memswapused);
}
if (mdstatstr) {
After the modifications, the data from the client remains the same, but what the server acts upon and displays now looks like this:
Windows/XymonPS:
Memory Used Total Percentage
Real/Physical 1444M 5119M 28%
Actual/Virtual 57M 2047M 2%
Swap/Page 1742M 10237M 17%
Linux:
Memory Used Total Percentage
Real/Physical 3625M 3949M 91%
Actual/Virtual 889M 3949M 22%
Swap/Page 6M 4092M 0%
You will notice that I changed the label on each line to include the name used in the graphs. I figured that if I was modifying this code anyway, I might as well fix something that has been annoying me for a long, long time. :)
I hope this all makes sense.
Thanks,
John
John Rothlisberger