[Xymon] alerts.cfg GROUP not matching

James, Tim A. Tim.James at navient.com
Mon Aug 17 20:19:38 CEST 2020


After some more digging, here is what I have found.  Prior to the upgrade to 4.3.30, I had groups with underscores in their names and groups with names longer than 7 characters.  Since the upgrade, neither works.  I have rewritten my analysis.cfg and alerts.cfg files to shorten the names; the commands to test the alerter now work correctly again, as do my alerts.

It's a bad workaround, so can anyone explain why I might be seeing this issue?

Test cases and results:
Group name < 4 characters = FAIL
Group name > 7 characters = FAIL
Group name containing an underscore = FAIL
Group name containing a hyphen = FAIL
Group name with 4-7 characters and no underscores or hyphens = SUCCESS.
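
For anyone who wants to reproduce these test cases, here is a minimal Python sketch that builds one xymond_alert test command per case. The candidate group names are hypothetical, chosen so that each one varies a single factor. The host, service, and xymoncmd path are the ones used in the commands quoted further down in this thread. Each candidate would also need a matching GROUP= rule in alerts.cfg for the result to mean anything.

```python
import subprocess
from typing import List

# Hypothetical candidate names, one per test case listed above.
CANDIDATES = {
    "fewer than 4 characters": "sas",
    "more than 7 characters": "sassupport",
    "contains an underscore": "sas_sup",
    "contains a hyphen": "sas-sup",
    "4-7 characters, no underscore or hyphen": "sassup",
}


def build_test_command(group: str, host: str = "foo.bar.com",
                       service: str = "disk", color: str = "yellow",
                       xymoncmd: str = "../bin/xymoncmd") -> List[str]:
    """Return the argv for the alerter test command used in this thread."""
    return [xymoncmd, "xymond_alert", "--test", host, service,
            f"--color={color}", f"--group={group}"]


def run_all() -> None:
    """Run every candidate; only works from server/etc on a Xymon server."""
    for label, group in CANDIDATES.items():
        result = subprocess.run(build_test_command(group),
                                capture_output=True, text=True)
        print(f"== {label}: {group}")
        # Print both streams, since which one carries the trace is not assumed.
        print(result.stdout + result.stderr)


if __name__ == "__main__":
    # Only print the commands here; call run_all() on the Xymon server itself.
    for label, group in CANDIDATES.items():
        print(label, "->", " ".join(build_test_command(group)))
```

Calling run_all() from the server/etc directory would then show, per candidate, whether the GROUP rule matched or failed.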

Running 4.3.30 on RHEL7 64bit.  Compiled from source.

Thanks in advance.

-Tim

From: Xymon <xymon-bounces at xymon.com> On Behalf Of James, Tim A.
Sent: Friday, August 14, 2020 11:42 PM
To: xymon at xymon.com
Subject: [Xymon] alerts.cfg GROUP not matching

I had all of this working, then "something" changed, and now the majority of the groups defined in my analysis.cfg file no longer alert.  I'm hoping it wasn't the upgrade from 4.3.28 to 4.3.30, but I'm not ruling anything out.
I have sanitized the server name to foo.bar.com.

Analysis.cfg snippet:

HOST=foo*
        DISK /opt/sas 90 95 GROUP=sas_support
        DISK /opt/sas/9.4 90 95 GROUP=sas_support
        DISK /opt/sas/9.4/depot2 90 95 GROUP=sas_support
        DISK /opt/sas/saslanding 90 95 GROUP=sas_support
        DISK /opt/sas/saslanding/in 90 95 GROUP=sas_support
        DISK /opt/sas/saslanding/out 90 95 GROUP=sas_support
        DISK /opt/sas/sasmain 90 99 GROUP=sas_support
        DISK /opt/sas/sassecure 90 99 GROUP=sas_support
        DISK /opt/sas/sassecure/modelingcrm 95 98 GROUP=sas_support
        DISK /opt/sas/sassecure/servicing 95 99 GROUP=sas_support
        DISK /opt/sas/sassecure/servicing/SCRA/MOENDs 90 95 GROUP=sas_support
        DISK /opt/sas/saswork 50 70 GROUP=sas_support

Alerts.cfg snippet:

GROUP=sas_support SERVICE=disk COLOR=red # SAS Application support team
SCRIPT /usr/local/xymon-server/server/ext/Create_SN_Ticket_From_Xymon-YP2-RP1.sh sas_support FORMAT=SMS DURATION>30 REPEAT=24h stop

GROUP=sas_support SERVICE=disk COLOR=yellow # SAS Application support team
MAIL helpdesk at foo.com FORMAT=SMS DURATION<20 REPEAT=24h stop
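
As a sanity check that the group names in the two files really agree, a minimal Python sketch like the following can compare them. It assumes a plain GROUP=name token, that `#` starts a comment, and that a token may hold a comma-separated list; regular-expression group patterns are not handled. The sample text is cut down from the snippets above, plus the GROUP=unix rule quoted later in this message.

```python
import re
from typing import Set

# \b keeps EXGROUP= from being picked up as GROUP=.
GROUP_RE = re.compile(r"\bGROUP=(\S+)")


def group_names(config_text: str) -> Set[str]:
    """Collect every group name referenced by a GROUP= token."""
    names: Set[str] = set()
    for line in config_text.splitlines():
        line = line.split("#", 1)[0]  # drop trailing comments
        for token in GROUP_RE.findall(line):
            # Assumption: a token may be a comma-separated list of names.
            names.update(n for n in token.split(",") if n)
    return names


analysis = """
HOST=foo*
        DISK /opt/sas 90 95 GROUP=sas_support
        DISK /opt/sas/saswork 50 70 GROUP=sas_support
"""
alerts = """
GROUP=sas_support SERVICE=disk COLOR=red # SAS Application support team
GROUP=unix # Linux Team support (default contact)
"""

defined = group_names(analysis)
alerted = group_names(alerts)
print("set in analysis.cfg but no alert rule:", sorted(defined - alerted))
print("alert rule but never set in analysis.cfg:", sorted(alerted - defined))
```

In practice the two strings would be read from the real analysis.cfg and alerts.cfg files. A name that shows up in both sets, as sas_support does here, rules out a simple spelling mismatch between the files.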

Obligatory test from the terminal:

[/usr/local/xymon-server/server/etc]
--> ../bin/xymoncmd xymond_alert --test foo.bar.com disk --color=yellow --group=sas_support

00103435 2020-08-14 23:11:06 Matching host:service:dgroup:page 'foo.bar.com:disk:NONE:PROD/PSAS' against rule line 165
00103435 2020-08-14 23:11:06 Failed 'GROUP=sas_support SERVICE=disk COLOR=red' (group not in include list)
00103435 2020-08-14 23:11:06 Matching host:service:dgroup:page 'foo.bar.com:disk:NONE:PROD/PSAS' against rule line 168
00103435 2020-08-14 23:11:06 Failed 'GROUP=sas_support SERVICE=disk COLOR=yellow' (group not in include list)

--> ../bin/xymoncmd xymond_alert --test foo.bar.com disk --group=sas_support
00104898 2020-08-14 23:26:59 Matching host:service:dgroup:page 'foo.bar.com:disk:NONE:PROD/PSAS' against rule line 165
00104898 2020-08-14 23:26:59 Failed 'GROUP=sas_support SERVICE=disk COLOR=red' (group not in include list)
00104898 2020-08-14 23:26:59 Matching host:service:dgroup:page 'foo.bar.com:disk:NONE:PROD/PSAS' against rule line 168
00104898 2020-08-14 23:26:59 Failed 'GROUP=sas_support SERVICE=disk COLOR=yellow' (group not in include list)

However, the second ("red") test does match a rule further along in the alerts file, just not one with a GROUP definition.  The failure there is expected, as I didn't specify a duration.
00104898 2020-08-14 23:26:59 Matching host:service:dgroup:page 'foo.bar.com:disk:NONE:PROD/PSAS' against rule line 304
00104898 2020-08-14 23:26:59 *** Match with 'HOST=%^.* SERVICE=disk COLOR=red' ***
00104898 2020-08-14 23:26:59 Matching host:service:dgroup:page 'foo.bar.com:disk:NONE:PROD/PSAS' against rule line 305
00104898 2020-08-14 23:26:59 Failed 'SCRIPT /usr/local/xymon-server/server/ext/Create_SN_Ticket_From_Xymon-YP2-RP1.sh UNIX FORMAT=SMS DURATION>5 REPEAT=25h' (min. duration 0<301)
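
When the alerts file is long, the trace is easier to read once it is condensed to one result per rule line. The following is a minimal Python sketch that does this. The regular expressions are inferred only from the output lines quoted above, so treat them as assumptions about the log format rather than a documented interface.

```python
import re

# Line shapes inferred from the xymond_alert --test output quoted above.
RULE_LINE = re.compile(r"against rule line (\d+)")
FAILED = re.compile(r"Failed '(.*)' \((.*)\)\s*$")
MATCHED = re.compile(r"\*\*\* Match with '(.*)' \*\*\*")


def summarize(output):
    """Return (rule_line, status, rule_text, reason) tuples in log order."""
    results = []
    current = None  # rule line number announced by the last "Matching" line
    for line in output.splitlines():
        m = RULE_LINE.search(line)
        if m:
            current = int(m.group(1))
            continue
        m = FAILED.search(line)
        if m:
            results.append((current, "FAIL", m.group(1), m.group(2)))
            continue
        m = MATCHED.search(line)
        if m:
            results.append((current, "MATCH", m.group(1), ""))
    return results


sample_output = """\
00104898 2020-08-14 23:26:59 Matching host:service:dgroup:page 'foo.bar.com:disk:NONE:PROD/PSAS' against rule line 165
00104898 2020-08-14 23:26:59 Failed 'GROUP=sas_support SERVICE=disk COLOR=red' (group not in include list)
00104898 2020-08-14 23:26:59 Matching host:service:dgroup:page 'foo.bar.com:disk:NONE:PROD/PSAS' against rule line 304
00104898 2020-08-14 23:26:59 *** Match with 'HOST=%^.* SERVICE=disk COLOR=red' ***
"""

for rule_line, status, rule_text, reason in summarize(sample_output):
    print(rule_line, status, rule_text, reason)
```

On the sample lines this reports rule line 165 as a FAIL with the reason "group not in include list" and rule line 304 as a MATCH.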

Now get this.  Here are two more examples from the alerts.cfg file:
GROUP=satellite SERVICE=disk #test comment
MAIL coworker at foo.com FORMAT=SCRIPT stop

GROUP=unix # Linux Team support (default contact)
MAIL unix-alert at lists.foo.com FORMAT=SMS DURATION<20 stop

And the respective tests from the terminal:
../bin/xymoncmd xymond_alert --test foo.bar.com disk --color=yellow --group=satellite
00104439 2020-08-14 23:21:49 Matching host:service:dgroup:page 'foo.bar.com:disk:NONE:PROD/PSAS' against rule line 147
00104439 2020-08-14 23:21:49 Failed 'GROUP=satellite SERVICE=disk' (group not in include list)

../bin/xymoncmd xymond_alert --test foo.bar.com disk --color=yellow --group=unix
00104535 2020-08-14 23:23:08 Matching host:service:dgroup:page 'foo.bar.com:disk:NONE:PROD/PSAS' against rule line 150
00104535 2020-08-14 23:23:08 *** Match with 'GROUP=unix' ***
00104535 2020-08-14 23:23:08 Matching host:service:dgroup:page 'foo.bar.com:disk:NONE:PROD/PSAS' against rule line 151
00104535 2020-08-14 23:23:08 *** Match with 'MAIL unix-alert at lists.foo.com FORMAT=SMS DURATION<20 stop' ***
00104535 2020-08-14 23:23:08 Mail alert with command 'mail unix-alert at lists.foo.com

I'm stumped.  Anyone out there have any idea what might be incorrect?


Tim James
Senior System Administrator
Navient

