[Xymon] Xymon "port" check intermittent failures for ssh TCP port 22 state=LISTEN
Ryan Novosielski
novosirj at rutgers.edu
Fri Jul 7 07:05:14 CEST 2017
Any chance this is truncation happening? That test can have a lot of output.
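One rough way to test the truncation idea is to take a saved copy of the client message from one of the failing intervals and check which sections arrived and whether the port 22 listener appears in the netstat portion. The Python below is only a sketch under assumptions: that sections are introduced by "[name]" header lines, that the netstat output sits in a section called "ports", and that the lines follow Linux "netstat -ant" column order (local address in the fourth column). The function names are made up for illustration.

```python
import re

def split_sections(message):
    """Split a client message into {section_name: [lines]} on '[name]' header lines."""
    sections = {}
    current = None
    for line in message.splitlines():
        m = re.fullmatch(r"\[([^\]]+)\]", line.strip())
        if m:
            current = m.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return sections

def has_listener(lines, port):
    """True if a line shows a LISTEN socket whose local address ends in :port or .port."""
    # Assumption: Linux netstat column order, local address is the 4th field.
    local = re.compile(r"[.:]%d$" % port)
    for line in lines:
        fields = line.split()
        if "LISTEN" in fields and len(fields) >= 4 and local.search(fields[3]):
            return True
    return False
```

If the sections that normally follow the netstat output are missing from a failing message, or has_listener() returns False while the section stops abruptly, truncation becomes more plausible. If every section is present and the LISTEN line is simply absent, it probably is not truncation.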
--
____
|| \\UTGERS, |---------------------------*O*---------------------------
||_// the State | Ryan Novosielski - novosirj at rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
|| \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark
`'
On Jul 7, 2017, at 00:47, Jeremy Laidman <jlaidman at rebel-it.com.au> wrote:
Hi
I'm getting what appear to be false positives for the port test that is monitoring the LISTEN socket for port 22, as opened by the sshd daemon. A few times a month, Xymon will show that the server is not listening on port 22, and 5 minutes later, the listening port is back again. The sshd process has never crashed or been reconfigured (e.g. with SIGHUP), and no other listening ports are showing the same behaviour. The client messages for the server during these events are complete and uncorrupted.
The simplest fix is to use delayred to suppress alerts for 5 minutes. However, I would like to work out what's causing this behaviour. I don't believe this is a problem with Xymon at all; rather, I think the netstat output in the client message is exactly what the OS provided to the Xymon client. My guess is that it's due to the way sshd works - perhaps it periodically rebinds to the socket - but nothing in the sshd logs seems to correlate with these events. If anyone can suggest what might be causing this, or how to investigate further, I'd be grateful.
This problem happens for about a quarter of the servers in a pool, and no others. All servers are identical in OS, software and general configuration, but the servers affected by this tend to be the ones taking the most traffic and under the most load (although there's plenty of spare CPU cycles even on the most heavily-used server). I have two Xymon servers, each monitoring independently of the other, and this problem is reported by both Xymon servers, although at completely different dates and times.
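One possible way to investigate, independently of the 5-minute client cycle, is to sample the listening sockets much more often on an affected server and record the raw netstat output whenever port 22 is missing. The Python below is a minimal sketch, not a tested tool: it assumes a Linux host with net-tools "netstat -ltn" available and Python 3.7 or later, and the names listening_ports and watch are hypothetical.

```python
import subprocess
import time

def listening_ports(netstat_output):
    """Return the set of port strings that have a socket in LISTEN state."""
    ports = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumption: Linux column order, local address 4th, state 6th.
        if len(fields) >= 6 and fields[5] == "LISTEN":
            ports.add(fields[3].rsplit(":", 1)[-1])
    return ports

def watch(port="22", interval=1.0, command=("netstat", "-ltn")):
    """Poll forever; print a timestamp and the raw output when the port is absent."""
    while True:
        out = subprocess.run(command, capture_output=True, text=True).stdout
        if port not in listening_ports(out):
            print(time.strftime("%Y-%m-%d %H:%M:%S"), "no LISTEN socket for port", port)
            print(out)
        time.sleep(interval)
```

Calling watch() from a screen or tmux session on one of the busier servers would show whether the listener really disappears from the netstat output, and for how long, or whether only the occasional sample comes back incomplete.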
Cheers
Jeremy
_______________________________________________
Xymon mailing list
Xymon at xymon.com
http://lists.xymon.com/mailman/listinfo/xymon