[Xymon] Detecting read-only file system in Linux

Root, Paul T Paul.Root at CenturyLink.com
Mon Mar 9 15:07:48 CET 2015


I just had this happen to several servers, because of my ancient Sun storage array.

The major one was a file system that has files going in and out all the time, with consistent names.
So I added this to client-local.cfg:

[machine]
file:`ls /path/to/file-*|tail -1`

And to analysis.cfg, so the test goes red when the newest file has not been modified within the last 1200 seconds:
HOST=machine
                FILE        %^/path/to/file-.* RED mtime<1200

On the other hand, you could write a script that runs the mount command and parses the output for 'ro' in the options.
You'd also want to exclude the proc, sys, and dev type filesystems, as well as CD-ROM or ISO mounts.
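
Something along these lines might do as a starting point. It is only a sketch: it reads /proc/mounts (as in the quoted message below) rather than parsing mount output, and the function names, the list of excluded filesystem types, and the "rofs" column name in the comment are my own assumptions, not anything shipped with Xymon.

```shell
#!/bin/sh
# Sketch only: function names and the exclusion list are assumptions.

# ro_mounts [FILE]
# Reads a file in /proc/mounts format (default: /proc/mounts) and prints
# "mountpoint fstype" for each mount that has "ro" in its option list.
ro_mounts() {
    awk '
        # Field 3 is the filesystem type: skip pseudo filesystems and CD/ISO types.
        $3 ~ /^(proc|sysfs|devtmpfs|devpts|tmpfs|iso9660|udf)$/ { next }
        # Field 4 is the comma-separated option list: match "ro" as a whole option.
        $4 ~ /(^|,)ro(,|$)/ { print $2, $3 }
    ' "${1:-/proc/mounts}"
}

# ro_status [FILE]
# Prints "red" if any read-only mount was found, otherwise "green".
ro_status() {
    if [ -n "$(ro_mounts "$1")" ]; then
        echo red
    else
        echo green
    fi
}

# Possible use from a client extension script (column name "rofs" is hypothetical):
#   $XYMON $XYMSRV "status $MACHINE.rofs $(ro_status) $(date)
#   $(ro_mounts)"
```

Matching "ro" as a whole option means that something like errors=remount-ro on a read-write mount is not reported.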

From: Xymon [mailto:xymon-bounces at xymon.com] On Behalf Of SebA
Sent: Monday, March 09, 2015 7:44 AM
To: xymon at xymon.com
Cc: 'Axel Beckert'
Subject: [Xymon] Detecting read-only file system in Linux

Hi,

I have been trying to find out if there is a way of Xymon detecting that a file-system in Linux has gone read-only as a result of a disk error (other than reporting it just the once via monitoring /var/log/messages).  Nothing is showing up in my Xymon server, but my xymon-client is a bit old: xymon-client-4.3.7-26.1.el5.tnt

I did a bit of Googling and I came up with these two links that may be relevant:
http://sisyphus.ru/en/srpm/Sisyphus/xymon/sources/8
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=764197

It seems that an RPM maintainer may have made some modifications to their version in order to catch disks in a read-only state (in the first link), and that there is a mount-ro plugin that is part of the hobbit-plugins package in Debian / Ubuntu.  Does anyone have more information on either of these, and on whether any patches can be integrated upstream or plug-ins added to Xymonton?  CCing Axel Beckert as he seems to have committed something to the mount-ro plugin recently: https://www.openhub.net/p/hobbit-plugins/commits

Although we have some Debian systems, I was looking for a solution for another Linux distro.

If I were to write something myself to do it, I would check /proc/mounts, and the best command I could find was:
awk '$4~/(^|,)ro($|,)/' /proc/mounts
which outputs, for example:
/dev/root / ext3 ro,data=ordered 0 0
This command also produced a nice summary output that might be good to have on a Xymon status page:
cat /proc/mounts|sort|awk '{print $1 "\011" toupper(substr($4,0,2)
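
A summary in that spirit could be sketched as below. This is not the command above, just an illustration: the function name and the choice of columns (device, RO/RW, mount point) are assumptions, and it relies on the first mount option in /proc/mounts being rw or ro. It uses substr($4,1,2) because awk strings are indexed from 1, and implementations differ on what a start of 0 returns.

```shell
#!/bin/sh
# Sketch only: the function name and output columns are assumptions.

# mount_summary [FILE]
# Reads a file in /proc/mounts format (default: /proc/mounts) and prints
# "device<TAB>RO|RW<TAB>mountpoint", sorted by device name.
mount_summary() {
    sort "${1:-/proc/mounts}" | awk '{
        # Take the first two characters of the option list and upper-case them.
        print $1 "\t" toupper(substr($4, 1, 2)) "\t" $2
    }'
}
```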
The following was at the bottom of /var/log/messages. It does not suggest any obvious alarm strings to add, other than the last line without the 'dm-0'. It would be nicer to have something more generic, though, as textual messages can change between different versions of the O/S.

kernel: sd 0:0:0:0: Unhandled sense code
kernel: sd 0:0:0:0: SCSI error: return code = 0x08100002
kernel: Result: hostbyte=invalid driverbyte=DRIVER_SENSE,SUGGEST_OK
kernel: sda: Current: sense key: Hardware Error
kernel:    Add. Sense: Defect list error
kernel:
kernel: Buffer I/O error on device dm-0, logical block 1358756
kernel: lost page write due to I/O error on dm-0

Kind regards,

SebA
