Re: [hobbit] RE: Default groupings

Hi Jason,

On Thu, Jul 20, 2006 at 10:15:40AM -0500, Kruse, Jason K. wrote:
> If you set a local test without a GROUP label it will cause all alerts
> that match to fire, regardless of whether they have a GROUP label or
> not.
> To use the GROUP functionality properly every test that requires a
> different group will require a label for every entry for a host/service,
> depending on how you've configured your hobbit-alerts.cfg  I would like
> the default to be to send a GROUP label with every set of tests and have
> the hobbit-alerts.cfg to have a default group for matching when no group
> is assigned.

If I understand you correctly, you have a setup like this: in
hobbit-clients.cfg, some statuses (disk, for example) are tagged with a
group label and others are not:

    HOST server1
    	DISK / 80 90 groupid=admins

    HOST server2
    	DISK / 85 95

And in hobbit-alerts.cfg you then have the "admins" group alerts going
to one recipient and the other alerts going to someone else:

    SERVICE=disk GROUP=admins
    	MAIL admins (at) foo.com

    	MAIL someone (at) foo.com

I think the real problem is different - when a test does not have any
GROUP association, alert rules with a specific GROUP label should not
match it. The current code doesn't behave quite as it should; there's a
patch attached to this mail. With that patch applied, let me walk
through how I think it should work.

If server1 gets a disk alert, it will be tagged with the "admins"
group label. This will currently match both of the alert rules; the
first one obviously because the group label matches, and the second one
because it doesn't have any group filter at all. This doesn't seem
right - and it's the problem you describe. I think the best solution
here is to change the last alert rule to

    	MAIL someone (at) foo.com

so it is clear that this rule is not used whenever the status is tagged
with a group label. Does that sound like a reasonable solution?

If server2 gets a disk alert, it won't have any group label. So it won't
match the first alert rule (the rule requires a group and the alert has
none), and it will match the second one, whether or not that rule
carries the EXGROUP setting. So that behaves as one would expect.
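
To make the intended behaviour concrete, here is a rough sketch in C of the
matching rules described above. It is not the code from lib/loadalerts.c:
rule_matches() and in_list() are hypothetical names used only for
illustration, and a plain strcmp() stands in for namematch() and its regex
handling. It also assumes that an EXGROUP rule is rejected as soon as any one
of the alert's groups is in the exclusion.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return 1 if name is one of the entries in the comma-separated list. */
static int in_list(const char *list, const char *name)
{
    char *copy, *tok, *save;
    int found = 0;

    assert(name != NULL);
    if (!list || !*list) return 0;      /* no group list: nothing can match */

    copy = strdup(list);                /* strtok_r modifies its input */
    if (!copy) return 0;

    tok = strtok_r(copy, ",", &save);
    while (tok && !found) {
        found = (strcmp(tok, name) == 0);
        tok = strtok_r(NULL, ",", &save);
    }

    free(copy);
    return found;
}

/*
 * Hypothetical helper: does an alert with the given group list pass the
 * GROUP / EXGROUP filters of one rule?
 *   alert_groups: comma-separated groups on the alert, NULL or "" if none
 *   group:        the rule's GROUP value, NULL if the rule has none
 *   exgroup:      the rule's EXGROUP value, NULL if the rule has none
 */
static int rule_matches(const char *alert_groups,
                        const char *group, const char *exgroup)
{
    int has_groups = (alert_groups && *alert_groups);

    /* A GROUP rule needs a matching group; an ungrouped alert always fails. */
    if (group && !in_list(alert_groups, group)) return 0;

    /* EXGROUP is only considered when the alert does have a group list. */
    if (exgroup && has_groups && in_list(alert_groups, exgroup)) return 0;

    return 1;
}
```

With this sketch, an alert tagged "admins" passes a rule with GROUP=admins and
a rule with no group filter, but not a rule that excludes "admins". An alert
with no group list fails the GROUP=admins rule and passes the other two.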


--- lib/loadalerts.c	2006/07/20 16:06:41	1.15
+++ lib/loadalerts.c	2006/08/02 09:36:31
@@ -795,30 +795,41 @@
 	/* alert->groups is a comma-separated list of groups, so it needs some special handling */
-	if (crit && alert->groups && (*(alert->groups)) && (crit->groupspec || crit->exgroupspec)) {
-		char *grouplist = strdup(alert->groups);
+	if (crit && (crit->groupspec || crit->exgroupspec)) {
+		char *grouplist;
 		char *tokptr;
+		grouplist = (alert->groups && (*(alert->groups))) ? strdup(alert->groups) : NULL;
 		if (crit->groupspec) {
 			char *onegroup;
 			int iswanted = 0;
-			onegroup = strtok_r(grouplist, ",", &tokptr);
-			while (onegroup && !iswanted) {
-				iswanted = (namematch(onegroup, crit->groupspec, crit->groupspecre));
-				onegroup = strtok_r(NULL, ",", &tokptr);
+			if (grouplist) {
+				/* There is a group label on the alert, so it must match */
+				onegroup = strtok_r(grouplist, ",", &tokptr);
+				while (onegroup && !iswanted) {
+					iswanted = (namematch(onegroup, crit->groupspec, crit->groupspecre));
+					onegroup = strtok_r(NULL, ",", &tokptr);
+				}
 			if (!iswanted) {
+				/*
+				 * Either the alert had a group list that didn't match, or
+				 * there was no group list and the rule listed one.
+				 * In both cases, it's a failed match.
+				 */
 				traceprintf("Failed '%s' (group not in include list)\n", cfline);
-				xfree(grouplist);
+				if (grouplist) xfree(grouplist);
 				return 0; 
-		if (crit->exgroupspec) {
+		if (crit->exgroupspec && grouplist) {
 			char *onegroup;
+			/* Excluded groups are only handled when the alert does have a group list */
 			strcpy(grouplist, alert->groups); /* Might have been used in the include list */
 			onegroup = strtok_r(grouplist, ",", &tokptr);
 			while (onegroup) {
@@ -831,7 +842,7 @@
-		xfree(grouplist);
+		if (grouplist) xfree(grouplist);
 	if (crit && crit->pagespec && !namematch(pgname, crit->pagespec, crit->pagespecre)) { 
--- hobbitd/client_config.c	2006/07/20 16:06:41	1.46
+++ hobbitd/client_config.c	2006/08/02 09:05:34
@@ -341,7 +341,7 @@
 	return "";
-static char *grouplist;
+static char *grouplist = NULL;
 void clearalertgroups(void)
 	if (grouplist) xfree(grouplist);