[Xymon] Setting thresholds in analysis.cfg

Jeremy Laidman jeremy at laidman.org
Tue Aug 21 17:25:51 CEST 2018


Chris

I think this is the key part of the man page:

HOST=targetstring
       Rule matching a host by the hostname. "targetstring" is either a
       comma-separated list of hostnames (from the hosts.cfg file), "*" to
       indicate "all hosts", or a Perl-compatible regular expression.

Are your host definitions comma-separated lists, or PCREs? They can't be
both: a leading "%" marks the whole targetstring as a single
Perl-compatible regular expression, not a list, so a comma-joined pair of
"%" patterns is one regex that no real hostname will ever match.
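To see why, here's a quick illustration using Python's re module as a
stand-in for PCRE matching (how exactly Xymon tokenizes the string after
"%" is an assumption on my part; the hostnames are from your config):

```python
import re

hostname = "swnfs06.rose.rdlabs.hpecorp.net"

# If everything after the leading "%" is taken as one PCRE, the
# comma-joined string requires a literal ",%swnfs06" in the hostname,
# so it can never match a real host:
broken = re.compile(r"swnfs06\.rose\.rdlabs\.hpecorp\.net,%swnfs06")
print(broken.search(hostname))          # no match (None)

# A single regex with an optional domain suffix matches both forms:
fixed = re.compile(r"^swnfs06(\.rose\.rdlabs\.hpecorp\.net)?$")
print(bool(fixed.search(hostname)))     # True
print(bool(fixed.search("swnfs06")))    # True
```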

So none of your hosts match, and the DEFAULT stanza is the one that applies.
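Something like either of these should work instead (an untested sketch;
the regex form assumes Xymon anchors nothing for you, hence the explicit
^ and $):

```
# Option 1: one PCRE per stanza, optional domain suffix
HOST=%^swnfs06(\.rose\.rdlabs\.hpecorp\.net)?$
        DISK /disk/data 50 55
        DISK    * 90 95
        MEMSWAP 80 90

# Option 2: a plain comma-separated list of hostnames, no "%" prefix
HOST=swnfs06.rose.rdlabs.hpecorp.net,swnfs06
        DISK /disk/data 50 55
        DISK    * 90 95
        MEMSWAP 80 90
```

With the list form, the names must match what's in hosts.cfg (or a
CLIENT/NET alias), so check that "swnfs06" without the domain is actually
known to Xymon before relying on it.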

J



On 9 June 2018 at 02:43, Seip, Christopher (HPN SIS team) <
chris.seip at hpe.com> wrote:

> I could use a hand getting the basics of analysis.cfg worked out, please.
> Here's mine:
>
> # egrep -v '^#' analysis.cfg
>
> HOST=%swnfs06.rose.rdlabs.hpecorp.net,%swnfs06
>         DISK /disk/data 50 55
>         DISK    * 90 95
>         MEMSWAP 80 90
>
> HOST=%swnfs07.rose.rdlabs.hpecorp.net,%swnfs07
>         DISK /disk/data 92 96
>         DISK    * 90 95
>
> HOST=%hpnsvr18.rose.rdlabs.hpecorp.net,%hpnsvr18
>         DISK /BACKUP 98 99
>         DISK    * 90 95
>
> DEFAULT
>         # Ignore some usually uninteresting tmpfs mounts.
>         DISK    /dev IGNORE
>         DISK    /dev/shm IGNORE
>         DISK    /lib/init/rw IGNORE
>         DISK    /run IGNORE
>         # These are the built-in defaults. You should only modify these
>         # lines, not add new ones (no PROC, DISK, LOG ... lines).
>         UP      1h
>         LOAD    5.0 10.0
>         DISK    * 90 95
>         INODE   * 70 90
>         MEMPHYS 100 101
>         MEMSWAP 50 80
>         MEMACT  90 97
>
>
> Three issues with this:
>
> 1. Swap consumption in the first host, swnfs06, has been steady at 74%, so
> I was trying to hush the alerts with the MEMSWAP line. This change hasn't
> had any effect; I am still getting a memory low yellow-warning for
> swap/page usage on swnfs06.
>
> 2. On the same swnfs06 host, its /disk/data partition is 56% full, so my
> "DISK /disk/data 50 55" line was an attempt to trigger a yellow alert. I
> was testing my understanding of the analysis.cfg file, but the
> filesystems test remains green.
>
> 3. And my 96% full /BACKUP drive on hpnsvr18 is issuing a red alert for
> exceeding the panic level of 95%, even though I was trying to raise the
> panic level to 99%.
>
> After wrestling with the man page and many experiments, I'm tossing this
> to the group for help. It seems very basic, but it's just not working for
> me. What am I missing?
>
> I tried switching to the "threshold hostname" format, like this:
>
> # egrep -v '^#' analysis.cfg | head -11
>
> DISK /disk/data 50 55 HOST=%swnfs06.rose.rdlabs.hpecorp.net,%swnfs06
> DISK    * 90 95 HOST=%swnfs06.rose.rdlabs.hpecorp.net,%swnfs06
> MEMSWAP 80 90 HOST=%swnfs06.rose.rdlabs.hpecorp.net,%swnfs06
>
> DISK /disk/data 92 96 HOST=%swnfs07.rose.rdlabs.hpecorp.net,%swnfs07
> DISK    * 90 95 HOST=%swnfs07.rose.rdlabs.hpecorp.net,%swnfs07
>
> DISK /BACKUP 98 99 HOST=%hpnsvr18.rose.rdlabs.hpecorp.net,%hpnsvr18
> DISK    * 90 95 HOST=%hpnsvr18.rose.rdlabs.hpecorp.net,%hpnsvr18
>
> This produced no change in behavior. I am stopping and starting the Xymon
> server software and waiting for new HTML pages to generate after every
> change to analysis.cfg.
>
> In my configuration report, I can see that every server configured for
> local memory tests has acquired the 80%/90% threshold setting, not just
> swnfs06. And my "DISK /disk/data 50 55" line is having no effect at all
> on any host: the strings "50%" or "55%" appear nowhere in my
> configuration report.
>
> # egrep 'swnfs0[67]' hosts.cfg
>     16.93.247.204       swnfs06.rose.rdlabs.hpecorp.net #
> rpc=mount,nlockmgr,nfs,ypbind ssh
>     16.93.247.205       swnfs07.rose.rdlabs.hpecorp.net # NOCOLUMNS=files
> rpc=mount,nlockmgr,nfs,ypbind ssh
>         16.93.247.204   swnfs06.rose.rdlabs.hpecorp.net
>
> # xymoncmd xymond_client --dump-config
> DISK /disk/data 50% 55% 0 -1 red HOST=%swnfs06.rose.rdlabs.hpecorp.net,%swnfs06
> (line: 351)
> DISK * 90% 95% 0 -1 red HOST=%swnfs06.rose.rdlabs.hpecorp.net,%swnfs06
> (line: 352)
> MEMSWAP 80 90 HOST=%swnfs06.rose.rdlabs.hpecorp.net,%swnfs06 (line: 353)
> DISK /disk/data 92% 96% 0 -1 red HOST=%swnfs07.rose.rdlabs.hpecorp.net,%swnfs07
> (line: 355)
> DISK * 90% 95% 0 -1 red HOST=%swnfs07.rose.rdlabs.hpecorp.net,%swnfs07
> (line: 356)
> DISK /BACKUP 98% 99% 0 -1 red HOST=%hpnsvr18.rose.rdlabs.hpecorp.net,%hpnsvr18
> (line: 360)
> DISK * 90% 95% 0 -1 red HOST=%hpnsvr18.rose.rdlabs.hpecorp.net,%hpnsvr18
> (line: 361)
> DISK /dev IGNORE (line: 371)
> DISK /dev/shm IGNORE (line: 372)
> DISK /lib/init/rw IGNORE (line: 373)
> DISK /run IGNORE (line: 374)
> UP 3600 -1 (line: 377)
> LOAD 5.00 10.00 (line: 378)
> DISK * 90% 95% 0 -1 red (line: 379)
> INODE * 70% 90% 0 -1 red (line: 380)
> MEMREAL 100 101 (line: 381)
> MEMSWAP 50 80 (line: 382)
> MEMACT 90 97 (line: 383)
>
> Thanks for any insights you can provide. Feels like I'm making a wrong
> assumption about how analysis.cfg works.
>
> Best thing I can figure to do would be to switch to local configuration of
> my Xymon clients, but I'd rather manage custom thresholds centrally.
>
> - Chris
>
> _______________________________________________
> Xymon mailing list
> Xymon at xymon.com
> http://lists.xymon.com/mailman/listinfo/xymon
>

