[Xymon] Two procs/processes graph issues after server upgrade from 4.3.21 to 4.3.22-rc2
Japheth Cleaver
cleaver at terabithia.org
Thu Nov 12 00:53:48 CET 2015
There's a primary and a secondary issue here.

The chief problem is that TRACK and OPTIONAL were no longer being
recorded as options on a test, as a result of r7683 and r7686 (on some
platforms).

Secondarily, 'nostale' is the default on svcstatus.sh pages, which means
the status page eventually stops displaying a graph whose RRD file is
old; in this case, because it hadn't been updated recently. I'm not sure
how I feel about the latter issue, but it's been that way for a while.

I believe the attached patch fixes the main issue; I'm testing now (as
4.3.22-5 in http://terabithia.org/rpms/xymon/testing/el6/x86_64/).
This is enough to warrant a 4.3.23 release shortly, upon confirmation.
-jc
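
A flag bit above bit 31 can vanish when the masked result is stored in a 32-bit integer. The following is a minimal sketch of that effect, not a confirmed account of what r7683 and r7686 did. It assumes the result of `flags & CHK_TRACKIT` ends up in an `int`, which the patch does not show. The names `OLD_CHK_TRACKIT`, `NEW_CHK_TRACKIT`, `old_trackit` and `new_trackit` are hypothetical, and the new mask is written with `1u` rather than the patch's `1`.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the pre-patch and post-patch definitions. */
#define OLD_CHK_TRACKIT (1ULL << 34)   /* bit 34 of a 64-bit flags word */
#define NEW_CHK_TRACKIT (1u << 1)      /* bit 1 of a separate 32-bit word */

/*
 * Pre-patch shape: mask a 64-bit flags word, then store the result in an int.
 * Converting through uint32_t keeps only the low 32 bits in a well-defined
 * way; a direct conversion to int is implementation-defined but gives the
 * same result on common compilers.
 */
int old_trackit(unsigned long long flags)
{
    int trackit = (int)(uint32_t)(flags & OLD_CHK_TRACKIT);
    return trackit;   /* bit 34 does not fit, so this is 0 */
}

/* Post-patch shape: the bit lives in a 32-bit word and survives the store. */
int new_trackit(uint32_t chkflags)
{
    int trackit = (int)(chkflags & NEW_CHK_TRACKIT);
    return trackit;   /* non-zero when the bit is set */
}
```

Calling `old_trackit(OLD_CHK_TRACKIT)` yields 0 even though the bit is set, whereas `new_trackit(NEW_CHK_TRACKIT)` yields a non-zero value. That would be consistent with TRACK appearing to be ignored, if the result is in fact stored in an `int`.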
On 11/11/2015 9:54 AM, Axel Beckert wrote:
> Hi,
>
> [TL;DR: See Summary at the end.]
>
> I'm slowly running out of ideas with the following issue which has
> been noticed after I rolled out 4.3.22-rc2 on our two monitoring
> servers (still running the servers on 4.3.22-rc2 at the moment):
>
> The graph on
> https://xymon.phys.ethz.ch/xymon-cgi/svcstatus.sh?HOST=zwoelfi&SERVICE=procs
> is no longer there, because
> https://xymon.phys.ethz.ch/xymon-cgi/showgraph.sh?host=zwoelfi&service=processes&graph_width=576&graph_height=120&disp=zwoelfi&nostale&color=green&graph_start=1447069296&graph_end=1447242096&graph=hourly&action=view
> returns only a 1x1 pixel PNG. The same happens on the second
> (independent, not slave) server, too.
>
> (No version changes on the affected clients. Those I checked have
> either 4.3.0-beta2 from Debian 7 or 4.3.17 from Debian 8.)
>
> I've found the following messages upon reloading the above URL in Apache's
> error.log:
>
> 2015-11-11 12:32:38.839801 Sendto failed: Connection refused
> 2015-11-11 12:32:38.839853 Sendto failed: Connection refused
> 2015-11-11 12:32:38.839871 Sendto failed: Connection refused
>
> I've found http://lists.xymon.com/archive/2015-February/041189.html
> with these messages, stopped the xymon service, removed all left over
> rrdctl.* files from /var/lib/xymon/tmp/ and started the xymon service
> again.
>
> Result: I still only get a 1x1 pixel PNG, but the error messages
> are gone, i.e. the issues are likely unrelated as they were in the
> mailing list posting above.
>
> Then again on
> https://xymon.phys.ethz.ch/xymon-cgi/svcstatus.sh?HOST=zwoelfi&SERVICE=trends
> the "Process counts" graph is there (but does not seem to be working):
>
> https://xymon.phys.ethz.ch/xymon-cgi/showgraph.sh?host=zwoelfi&service=processes&graph_width=576&graph_height=120&first=1&count=4&disp=zwoelfi&graph_start=1447069994&graph_end=1447242794&graph=hourly&action=view
>
> The differences between this and the first URL (besides the
> timestamps) are: the first URL has nostale (without a value) and
> color=green as additional query string parameters, while the second
> URL instead has first=1 and count=4.
>
> As soon as I remove the valueless "nostale" or give it a value, e.g.
> "nostale=1", the graph is back again (but still not working).
>
> So while the (reduced to the minimum parameters) URL
> https://xymon.phys.ethz.ch/xymon-cgi/showgraph.sh?host=zwoelfi&service=processes&graph=hourly&action=view
> shows an (empty) graph,
> https://xymon.phys.ethz.ch/xymon-cgi/showgraph.sh?host=zwoelfi&service=processes&disp=zwoelfi&graph=hourly&action=view&nostale
> gives a 1x1 pixel image.
>
> With regards to the empty graph,
> https://xymon.phys.ethz.ch/xymon-cgi/showgraph.sh?host=zwoelfi&service=processes&graph=daily&action=view
> is not empty; it just shows that there has been no data since the 4th
> of November (when I updated the servers from 4.3.21 to 4.3.22-rc2).
>
> And indeed, in /var/lib/xymon/rrd/zwoelfi/, some files have not been
> updated since the 4th of November:
>
> # ls -l *proc*
> -rw-r--r-- 1 xymon xymon 19640 Nov 4 15:40 processes.apache2.rrd
> -rw-r--r-- 1 xymon xymon 19640 Nov 4 15:40 processes.automount.rrd
> -rw-r--r-- 1 xymon xymon 19640 Nov 4 15:40 processes.stress.rrd
> -rw-r--r-- 1 xymon xymon 19640 Nov 11 13:09 procs.rrd
> #
>
>
> Summary
> =======
>
> So there seem to be two issues with 4.3.22:
>
> * The graph on the procs check's page isn't displayed properly.
>
>   Either
>
>   + "nostale" should get a value in the page/template,
>   + or the parsing of the "nostale" parameter without a value in the
>     showgraph CGI
>
>   should be fixed. This sounds rather easy, but I'm not sure which
>   variant is the expected one.
>
> * For some reason the processes.*.rrd files defined by "TRACK=" in
>   analysis.cfg no longer get updated.
>
>   I currently have no good idea where this comes from. Maybe from
>   one of the NCV-related changes. At least I found no configuration
>   change (be it local or in the defaults/templates) which could have
>   triggered this issue.
>
> Kind regards, Axel Beckert
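
The first issue in the summary fits the 'nostale' behaviour described in the reply at the top: with the option set, a graph whose RRD files have not been updated recently is not drawn. Below is a minimal sketch of such a staleness test. It is not the showgraph CGI's actual code: the function name `demo_rrd_is_stale` is hypothetical, and the one-day cutoff `DEMO_STALE_SECONDS` is an assumption made purely for illustration.

```c
#include <time.h>

/* Assumed cutoff of one day; the real threshold is not stated in the thread. */
#define DEMO_STALE_SECONDS (24 * 60 * 60)

/*
 * Returns 1 when the file should be skipped: only if "nostale" was requested
 * and the last update is older than the cutoff. Without "nostale", old files
 * are still drawn (as an empty or partial graph).
 */
int demo_rrd_is_stale(time_t last_update, time_t now, int nostale)
{
    if (!nostale) return 0;
    return difftime(now, last_update) > DEMO_STALE_SECONDS;
}
```

Under these assumptions, a processes.*.rrd file last written a week before the request is skipped when `nostale` is set and drawn when it is not. That matches the observation that removing the parameter brings the (empty) graph back.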
-------------- next part --------------
--- xymond/client_config.c.chkflags32 2015-11-11 12:47:51.629681735 -0800
+++ xymond/client_config.c 2015-11-11 13:27:06.897682379 -0800
@@ -117,36 +117,36 @@
} c_paging_t;
-#define FCHK_NOEXIST (1ULL << 0)
-#define FCHK_TYPE (1ULL << 1)
-#define FCHK_MODE (1ULL << 2)
-#define FCHK_MINLINKS (1ULL << 3)
-#define FCHK_MAXLINKS (1ULL << 4)
-#define FCHK_EQLLINKS (1ULL << 5)
-#define FCHK_MINSIZE (1ULL << 6)
-#define FCHK_MAXSIZE (1ULL << 7)
-#define FCHK_EQLSIZE (1ULL << 8)
-#define FCHK_OWNERID (1ULL << 10)
-#define FCHK_OWNERSTR (1ULL << 11)
-#define FCHK_GROUPID (1ULL << 12)
-#define FCHK_GROUPSTR (1ULL << 13)
-#define FCHK_CTIMEMIN (1ULL << 16)
-#define FCHK_CTIMEMAX (1ULL << 17)
-#define FCHK_CTIMEEQL (1ULL << 18)
-#define FCHK_MTIMEMIN (1ULL << 19)
-#define FCHK_MTIMEMAX (1ULL << 20)
-#define FCHK_MTIMEEQL (1ULL << 21)
-#define FCHK_ATIMEMIN (1ULL << 22)
-#define FCHK_ATIMEMAX (1ULL << 23)
-#define FCHK_ATIMEEQL (1ULL << 24)
-#define FCHK_MD5 (1ULL << 25)
-#define FCHK_SHA1 (1ULL << 26)
-#define FCHK_SHA256 (1ULL << 27)
-#define FCHK_SHA512 (1ULL << 28)
-#define FCHK_SHA224 (1ULL << 29)
-#define FCHK_SHA384 (1ULL << 30)
-#define FCHK_RMD160 (1ULL << 31)
+#define FCHK_NOEXIST (1 << 0)
+#define FCHK_TYPE (1 << 1)
+#define FCHK_MODE (1 << 2)
+#define FCHK_MINLINKS (1 << 3)
+#define FCHK_MAXLINKS (1 << 4)
+#define FCHK_EQLLINKS (1 << 5)
+#define FCHK_MINSIZE (1 << 6)
+#define FCHK_MAXSIZE (1 << 7)
+#define FCHK_EQLSIZE (1 << 8)
+#define FCHK_OWNERID (1 << 10)
+#define FCHK_OWNERSTR (1 << 11)
+#define FCHK_GROUPID (1 << 12)
+#define FCHK_GROUPSTR (1 << 13)
+#define FCHK_CTIMEMIN (1 << 16)
+#define FCHK_CTIMEMAX (1 << 17)
+#define FCHK_CTIMEEQL (1 << 18)
+#define FCHK_MTIMEMIN (1 << 19)
+#define FCHK_MTIMEMAX (1 << 20)
+#define FCHK_MTIMEEQL (1 << 21)
+#define FCHK_ATIMEMIN (1 << 22)
+#define FCHK_ATIMEMAX (1 << 23)
+#define FCHK_ATIMEEQL (1 << 24)
+#define FCHK_MD5 (1 << 25)
+#define FCHK_SHA1 (1 << 26)
+#define FCHK_SHA256 (1 << 27)
+#define FCHK_SHA512 (1 << 28)
+#define FCHK_SHA224 (1 << 29)
+#define FCHK_SHA384 (1 << 30)
+#define FCHK_RMD160 (1 << 31)
-#define CHK_OPTIONAL (1ULL << 33)
-#define CHK_TRACKIT (1ULL << 34)
+#define CHK_OPTIONAL (1 << 0)
+#define CHK_TRACKIT (1 << 1)
typedef struct c_file_t {
@@ -253,5 +253,6 @@
ruletype_t ruletype;
int cfid;
- unsigned long long flags;
+ uint32_t flags;
+ uint32_t chkflags;
struct c_rule_t *next;
union {
@@ -979,5 +980,5 @@
}
else if (strncasecmp(tok, "track", 5) == 0) {
- currule->flags |= CHK_TRACKIT;
+ currule->chkflags |= CHK_TRACKIT;
if (*(tok+5) == '=') currule->rrdidstr = strdup(tok+6);
}
@@ -1028,5 +1029,5 @@
}
else if (strcasecmp(tok, "optional") == 0) {
- currule->flags |= CHK_OPTIONAL;
+ currule->chkflags |= CHK_OPTIONAL;
}
else if (idx == 0) {
@@ -1199,9 +1200,9 @@
}
else if (strncasecmp(tok, "track", 5) == 0) {
- currule->flags |= CHK_TRACKIT;
+ currule->chkflags |= CHK_TRACKIT;
if (*(tok+5) == '=') currule->rrdidstr = strdup(tok+6);
}
else if (strcasecmp(tok, "optional") == 0) {
- currule->flags |= CHK_OPTIONAL;
+ currule->chkflags |= CHK_OPTIONAL;
}
else {
@@ -1230,5 +1231,5 @@
}
else if (strncasecmp(tok, "track", 5) == 0) {
- currule->flags |= CHK_TRACKIT;
+ currule->chkflags |= CHK_TRACKIT;
if (*(tok+5) == '=') currule->rrdidstr = strdup(tok+6);
}
@@ -1292,5 +1293,5 @@
}
else if (strncasecmp(tok, "track", 5) == 0) {
- currule->flags |= CHK_TRACKIT;
+ currule->chkflags |= CHK_TRACKIT;
if (*(tok+5) == '=') currule->rrdidstr = strdup(tok+6);
}
@@ -1543,5 +1544,5 @@
}
else if (strncasecmp(tok, "track", 5) == 0) {
- currule->flags |= CHK_TRACKIT;
+ currule->chkflags |= CHK_TRACKIT;
if (*(tok+5) == '=') currule->rrdidstr = strdup(tok+6);
}
@@ -1906,10 +1907,10 @@
}
- if (rwalk->flags & CHK_TRACKIT) {
+ if (rwalk->chkflags & CHK_TRACKIT) {
printf(" TRACK");
if (rwalk->rrdidstr) printf("=%s", rwalk->rrdidstr);
}
- if (rwalk->flags & CHK_OPTIONAL) printf(" OPTIONAL");
+ if (rwalk->chkflags & CHK_OPTIONAL) printf(" OPTIONAL");
if (rwalk->timespec) printf(" TIME=%s", rwalk->timespec);
@@ -2568,5 +2569,5 @@
if (nofile) {
- if (!(rule->flags & CHK_OPTIONAL)) {
+ if (!(rule->chkflags & CHK_OPTIONAL)) {
if (COL_YELLOW > result) result = COL_YELLOW;
addalertgroup(rule->groups);
@@ -2751,5 +2752,5 @@
*anyrules = 1;
if (!exists) {
- if (rwalk->flags & CHK_OPTIONAL) goto nextcheck;
+ if (rwalk->chkflags & CHK_OPTIONAL) goto nextcheck;
if (!(rwalk->flags & FCHK_NOEXIST)) {
@@ -2984,5 +2985,5 @@
}
}
- if (rwalk->flags & CHK_TRACKIT) {
+ if (rwalk->chkflags & CHK_TRACKIT) {
*trackit = (trackit || (ftype == S_IFREG));
*id = rwalk->rrdidstr;
@@ -3066,5 +3067,5 @@
}
}
- if (rwalk->flags & CHK_TRACKIT) {
+ if (rwalk->chkflags & CHK_TRACKIT) {
*trackit = 1;
*id = rwalk->rrdidstr;
@@ -3238,5 +3239,5 @@
*warnage = rule->rule.mqqueue.warnage;
*critage = rule->rule.mqqueue.critage;
- if (rule->flags & CHK_TRACKIT) *trackit = (rule->rrdidstr ? rule->rrdidstr : "");
+ if (rule->chkflags & CHK_TRACKIT) *trackit = (rule->rrdidstr ? rule->rrdidstr : "");
return;
}
@@ -3471,5 +3472,5 @@
if ((*lowlim != 0) && (*count < *lowlim)) *color = (*walk)->rule->rule.proc.color;
if ((*uplim != -1) && (*count > *uplim)) *color = (*walk)->rule->rule.proc.color;
- *trackit = ((*walk)->rule->flags & CHK_TRACKIT);
+ *trackit = ((*walk)->rule->chkflags & CHK_TRACKIT);
*id = (*walk)->rule->rrdidstr;
if (group) *group = (*walk)->rule->groups;
@@ -3540,5 +3541,5 @@
if ((*lowlim != 0) && (*count < *lowlim)) *color = (*walk)->rule->rule.port.color;
if ((*uplim != -1) && (*count > *uplim)) *color = (*walk)->rule->rule.port.color;
- *trackit = ((*walk)->rule->flags & CHK_TRACKIT);
+ *trackit = ((*walk)->rule->chkflags & CHK_TRACKIT);
*id = (*walk)->rule->rrdidstr;
if (group) *group = (*walk)->rule->groups;
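
The token handling that recurs throughout the patch can be exercised in isolation. This is a cut-down sketch, not the real `c_rule_t`: the struct `demo_rule_t` and the function `demo_apply_token` are hypothetical, the RRD id is copied into a fixed buffer instead of being `strdup()`ed into `rrdidstr`, and the masks use `1u` rather than the patch's `1`. It shows the generic CHK_* bits going into their own `chkflags` word, leaving `flags` free for the per-type FCHK_* bits.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <strings.h>   /* strcasecmp, strncasecmp (POSIX) */

#define CHK_OPTIONAL (1u << 0)
#define CHK_TRACKIT  (1u << 1)

/* Hypothetical, reduced rule holding only the fields the patch touches. */
typedef struct demo_rule_t {
    uint32_t flags;      /* per-type FCHK_* bits (not set in this sketch) */
    uint32_t chkflags;   /* generic CHK_* bits */
    char rrdid[64];      /* stand-in for the strdup()ed rrdidstr */
} demo_rule_t;

/* Apply one configuration token; returns 1 if it was recognised, else 0. */
int demo_apply_token(demo_rule_t *rule, const char *tok)
{
    if (strncasecmp(tok, "track", 5) == 0) {
        rule->chkflags |= CHK_TRACKIT;
        /* "TRACK=name" carries an RRD id after the '=' */
        if (tok[5] == '=') snprintf(rule->rrdid, sizeof(rule->rrdid), "%s", tok + 6);
        return 1;
    }
    if (strcasecmp(tok, "optional") == 0) {
        rule->chkflags |= CHK_OPTIONAL;
        return 1;
    }
    return 0;
}
```

Feeding a zeroed rule the tokens `TRACK=apache2` and `optional` sets both bits in `chkflags`, stores `apache2` as the id, and leaves `flags` untouched.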