[Xymon] Constant red condition

john boris jborissr at gmail.com
Tue Aug 9 15:32:55 CEST 2016


Jeremy,
Thanks for the reply. Your question prompted me to look a bit deeper.
First, this is a SCO box, so things are different. I checked, and it seems
that on this installation the bbc user did not have the correct
authorization.

Thinking it over, this server was recently moved from a standalone machine
to a VM, so I think the authorizations might not have been set properly
during the restore. I went into the Account Manager and redid the
authorizations for the bbc user, which allowed the bbc user to see the
output of the ps command.

So all is well and lesson learned.


On Mon, Aug 8, 2016 at 10:16 PM, Jeremy Laidman <jlaidman at rebel-it.com.au>
wrote:

> John
>
> On 9 August 2016 at 04:53, john boris <jborissr at gmail.com> wrote:
>
>> I have a constant RED condition for the procs test. I checked all of my
>> other servers and they all have just one instance of cron, yet this one
>> server has the one instance running but it comes up red. These are all
>> the same OS and the same client version. Some are standalone servers and
>> others are VMs, but this one machine just will not go green.
>>
>> Any pointers on this?
>>
>
> Is the "cron" message showing too many or too few cron processes?
>
> Can you please show the message from the "procs" page that says "Processes
> NOT ok" including the list of monitored processes (with green/red dots)
> below that?  If you are able, show the whole "procs" page.  Also, show the
> relevant section from analysis.cfg.
>
> What OS are you using?
>
> One thing about the "procs" page is that you can sometimes have unexpected
> PROC string matches.  For instance, if you have the line:
>
>         PROC cron
>
> then this definition will match any of these lines from the "ps" listing:
>
>   PID  PPID USER      STARTED S PRI %CPU     TIME %MEM   RSZ    VSZ CMD
>   918     1 root       Feb 22 S  22  0.0 00:00:10  0.0   776   2772 /usr/sbin/cron
>  2482     1 root       Feb 22 S  22  0.0 00:00:00  0.0  2142  11268 /usr/bin/crontab -e
>  5514     1 root       Feb 22 S  22  0.0 00:00:00  0.0 16484  30346 vi /etc/crontab
>
> and you end up with more "cron" processes reported than you actually have
> running.
>
> To counter this, you can use the full path of the binary in analysis.cfg,
> like so:
>
>         PROC /usr/sbin/cron
>
> However, this can also match a process like "/usr/sbin/cronlog-parser" or
> "less /usr/sbin/cron" or even "cp /usr/sbin/cron /path/to/slow/nfsmount".
>
> To match only where the command line is exactly the string you care
> about, change from a string match to a regex match (the leading "%" marks
> the rest as a regex) like so:
>
>         PROC %^/usr/sbin/cron$
>
> If some of your process instances might have arguments, you can take
> that into account:
>
>         PROC %^/usr/sbin/snmp($|\s)
>
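
To make the difference between these matching styles concrete, here is a
minimal Python sketch. It is not Xymon code: it only imitates a substring
match and a regex match against a list of command lines, and Python's "re"
module stands in for the regex engine Xymon uses. The command lines come
from the examples above, except "/usr/sbin/cron -f", which is a hypothetical
entry added to show a process started with an argument. The "%" prefix is
left out because it is analysis.cfg syntax, not part of the pattern.

```python
import re

# Command lines taken from the examples in the quoted text.
commands = [
    "/usr/sbin/cron",
    "/usr/bin/crontab -e",
    "vi /etc/crontab",
    "/usr/sbin/cronlog-parser",
    "less /usr/sbin/cron",
    "/usr/sbin/cron -f",  # hypothetical: cron started with an argument
]


def count_substring(needle, cmds):
    """Count command lines that contain needle anywhere."""
    return sum(1 for c in cmds if needle in c)


def count_regex(pattern, cmds):
    """Count command lines in which the regular expression matches."""
    rx = re.compile(pattern)
    return sum(1 for c in cmds if rx.search(c))


# Plain string, like "PROC cron": every line contains "cron".
print(count_substring("cron", commands))                # 6
# Full path, like "PROC /usr/sbin/cron": still matches look-alikes.
print(count_substring("/usr/sbin/cron", commands))      # 4
# Anchored at both ends: only the bare command with no arguments.
print(count_regex(r"^/usr/sbin/cron$", commands))       # 1
# Anchored at the start, then end of line or whitespace: arguments allowed.
print(count_regex(r"^/usr/sbin/cron($|\s)", commands))  # 2
```

The last pattern is the same shape as the snmp example above, applied to
cron so that all four counts can be compared on one set of command lines.
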
> Another thing to note about the PROC monitoring is that it simply matches
> the strings in the "ps" output, with some assumptions about the column
> widths.  If you happen to have a column before CMD that is wider than
> expected, then it can push other text into the CMD column.  For instance:
>
>   PID  PPID USER      STARTED S PRI %CPU     TIME %MEM   RSZ    VSZ CMD
> 20445 20442 root       Mar 20 S  23  0.0 00:00:00  0.0  4112  48044 bin/abcd
> 30269 11056 roddy    11:38:27 S  23  0.0 00:00:00  0.0  3288  46732 sshd
> 30442 30269 freddy2496 14:41:17 S  23  0.0 00:00:00  0.0  2320  46864 sshd
> 30460 30442 wendy    11:38:27 S  23  0.0 00:00:00  0.0  2524  14192 -bash
>
> If I'm trying to match "sshd" then the above will match only once, because
> the second sshd line will have "sshd" matched against "4 sshd" (the "4"
> from the VSZ column) all because the username is wider than the expected
> maximum width for the USER field.
>
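
The column shift can be reproduced with a short Python sketch. It assumes
that the command text is whatever starts at the character offset where
"CMD" appears in the header line. That is a simplification for
illustration, not a statement of how the Xymon client is implemented. The
header and rows are copied from the listing above, and the anchored pattern
"^sshd" is an assumption chosen to show the effect.

```python
import re

# Header and three rows copied from the "ps" listing quoted above.
header = "  PID  PPID USER      STARTED S PRI %CPU     TIME %MEM   RSZ    VSZ CMD"
rows = [
    "30269 11056 roddy    11:38:27 S  23  0.0 00:00:00  0.0  3288  46732 sshd",
    "30442 30269 freddy2496 14:41:17 S  23  0.0 00:00:00  0.0  2320  46864 sshd",
    "30460 30442 wendy    11:38:27 S  23  0.0 00:00:00  0.0  2524  14192 -bash",
]


def cmd_text(header_line, row):
    """Return the text of row from the offset of "CMD" in the header."""
    offset = header_line.index("CMD")
    return row[offset:]


# What a fixed-offset parser would treat as the command for each row.
cmds = [cmd_text(header, row) for row in rows]
for cmd in cmds:
    print(repr(cmd))  # 'sshd', then '4 sshd', then '-bash'

# An anchored pattern sees only one sshd, although two are running.
anchored = [c for c in cmds if re.search(r"^sshd", c)]
print(len(anchored))  # 1
```

The long username pushes every later field two characters to the right, so
the last digit of VSZ lands in the position where the command should begin.
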
> Cheers
> Jeremy
>
>


-- 
John J. Boris, Sr.