[Xymon] Xymon "port" check intermittent failures for ssh TCP port 22 state=LISTEN

Jeremy Laidman jlaidman at rebel-it.com.au
Sat Jul 8 00:32:30 CEST 2017


Yes, I do the network test also. This means I could just disable 22 in the
port test, and rely on the network test. It's an adequate work-around in
this case. Thanks.

I'd still like to know why it's a problem.
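
One way to investigate further would be to capture netstat output more often than every 5 minutes and check each sample for the listener. Below is a minimal Python sketch of that check. It is not Xymon's own matching code: the function name has_listen_socket, the sample text, and the Linux-style "netstat -an" column layout are all assumptions made for illustration.

```python
import re


def has_listen_socket(netstat_text, port):
    """Return True if any TCP line shows a LISTEN socket on the local port."""
    # Local addresses end in ":22" (Linux) or ".22" (some other Unixes).
    local_re = re.compile(r"[.:]%d$" % port)
    for line in netstat_text.splitlines():
        fields = line.split()
        # Assumed columns: proto recv-q send-q local foreign state
        if len(fields) < 6 or not fields[0].lower().startswith("tcp"):
            continue
        local, state = fields[3], fields[5]
        if state == "LISTEN" and local_re.search(local):
            return True
    return False


# Hypothetical sample, not taken from any real client message.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp        0      0 10.0.0.5:22             10.0.0.9:51514          ESTABLISHED
tcp6       0      0 :::8022                 :::*                    LISTEN
"""

ONLY_ESTABLISHED = """\
tcp        0      0 10.0.0.5:22             10.0.0.9:51514          ESTABLISHED
"""

if __name__ == "__main__":
    print(has_listen_socket(SAMPLE, 22))
    print(has_listen_socket(ONLY_ESTABLISHED, 22))
```

Run against samples taken every few seconds, a False result that lines up with one of the red events would suggest the OS really did omit the socket from that netstat run, rather than anything going wrong inside Xymon.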

J

On 8 Jul. 2017 04:08, "Mike Burger" <mburger at bubbanfriends.org> wrote:

On 2017-07-07 2:51 am, Jeremy Laidman wrote:

Not much chance, really. This was my first guess at the cause. The [ports]
section appears complete (doesn't have its own limit as far as I know), the
[clock] section is present at the end, and the UTC: datestamp line is
present as the last line. So there are none of the artefacts I would expect
to see if truncation were taking place.

Also, the client messages are less than 300kB, whereas the default limit is
512kB and I've bumped that up to 2MB.


On 7 July 2017 at 15:05, Ryan Novosielski <novosirj at rutgers.edu> wrote:

> Any chance this is truncation happening? That test can have a lot of
> output.
>
> --
> ____
> || \\UTGERS,       |---------------------------*O*---------------------------
> ||_// the State     |         Ryan Novosielski - novosirj at rutgers.edu
> || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
> ||  \\    of NJ     | Office of Advanced Research Computing - MSB C630, Newark
>     `'
>
> On Jul 7, 2017, at 00:47, Jeremy Laidman <jlaidman at rebel-it.com.au> wrote:
>
> Hi
>
> I'm getting what appear to be false-positives for the port test that is
> monitoring the LISTEN socket for port 22, as opened by the sshd daemon. A
> few times a month, Xymon will show that the server is not listening on port
> 22, and 5 minutes later, the listening port is back again. The sshd process
> has never crashed or been reconfigured (eg with SIGHUP), and no other
> listening ports are showing the same behaviour.  The client messages for
> the server during these events are complete and uncorrupted.
>
> The simplest fix is to use delayred to suppress alerts for 5 minutes.
> However, I would like to work out what's causing this behaviour. I don't
> believe this is a problem with Xymon at all; rather, the netstat output in
> the client message is exactly what the OS provided to the Xymon client. My
> guess is that it's due to the way sshd works - perhaps it periodically
> rebinds to the socket - but nothing in the sshd logs seems to correlate
> with these events. If anyone can suggest what might be causing this, or how
> to investigate further, I'd be grateful.
>
> This problem happens for about a quarter of the servers in a pool, and no
> others. All servers are identical in OS, software and general
> configuration, but the servers affected by this tend to be the ones taking
> the most traffic and under the most load (although there's plenty of spare
> CPU cycles even on the most heavily-used server). I have two Xymon servers,
> each monitoring independently of the other, and this problem is reported by
> both Xymon servers, although at completely different dates and times.
>
> Cheers
> Jeremy
>
>
>
Have you considered adding the SSH network test, in conjunction?

-- 
Mike Burger
http://www.bubbanfriends.org

"It's always suicide-mission this, save-the-planet that. No one ever just
stops by to say 'hi' anymore." --Colonel Jack O'Neill, SG1