[Xymon] Detecting read-only file system in Linux

Axel Beckert beckert at phys.ethz.ch
Mon Mar 9 14:14:03 CET 2015


Hi,

On Mon, Mar 09, 2015 at 12:44:03PM -0000, SebA wrote:
> I have been trying to find out if there is a way of Xymon detecting that a
> file-system in Linux has gone read-only as a result of a disk error (other
> than reporting it just the once via monitoring /var/log/messages).  Nothing
> is showing up in my Xymon server, but my xymon-client is a bit old:
> xymon-client-4.3.7-26.1.el5.tnt
>  
> I did a bit of Googling and I came up with these two links that may be
> relevant:
> http://sisyphus.ru/en/srpm/Sisyphus/xymon/sources/8
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=764197
>  
> It seems that an RPM maintainer may have made some modifications to their
> version in order to catch disks in a read-only state (in the first link) and
> that there is a mount-ro plugin that is part of the hobbit-plugins package in
> Debian / Ubuntu.  Does anyone have more information on either of
> these

I'm one of the maintainers of Debian's hobbit-plugins package, so
yes. :-)

> and whether any patches can be integrated upstream or plug-ins added
> to xymonton?

I'm not sure where exactly at https://wiki.xymonton.org/ I should add
our set of plugins.

> CCing Axel Beckert as he seems to have committed
> something to the mount-ro plugin recently:
> https://www.openhub.net/p/hobbit-plugins/commits

Hrm, OpenHub seems to be horribly out of date for most projects
lately... The full view of that Git repo is at
https://anonscm.debian.org/cgit/collab-maint/hobbit-plugins.git/

The source code of the mount-ro plugin is quite simple:
https://anonscm.debian.org/cgit/collab-maint/hobbit-plugins.git/tree/misc.d/mount-ro
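
For readers who cannot reach the repository, the general idea behind such a
check can be sketched in Python. This is only an illustration under stated
assumptions, not the actual mount-ro source: the function names, the list of
ignored file system types, and the exit codes are all hypothetical. It reads
the kernel's view of the mounts (/proc/mounts format) and reports every
entry whose option list contains "ro".

```python
# Hypothetical sketch; NOT the actual mount-ro script from hobbit-plugins.
# File system types that are virtual or legitimately read-only (assumed list).
IGNORED_FSTYPES = {"proc", "sysfs", "squashfs", "iso9660", "cgroup"}


def find_readonly_mounts(mounts_text, ignored=IGNORED_FSTYPES):
    """Return (device, mountpoint) pairs whose mount options include 'ro'."""
    readonly = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mountpoint, fstype, options = fields[:4]
        if fstype in ignored:
            continue
        # Options are comma-separated; match 'ro' exactly so that
        # 'errors=remount-ro' on a writable mount is not a false positive.
        if "ro" in options.split(","):
            readonly.append((device, mountpoint))
    return readonly


def check_mounts(path="/proc/mounts"):
    """Print findings; return 0 if nothing is read-only, 1 otherwise."""
    with open(path) as handle:
        readonly = find_readonly_mounts(handle.read())
    for device, mountpoint in readonly:
        print(f"{mountpoint} ({device}) is mounted read-only")
    return 1 if readonly else 0
```

A wrapper script would call sys.exit(check_mounts()). Reading /proc/mounts
rather than /etc/mtab is deliberate here: on older systems /etc/mtab is a
regular file and may not be updated when the kernel remounts a file system
read-only after an error.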

The mount-ro script is not a standalone plugin, though. It is meant for
the meta-plugin "misc", which calls all scripts in /etc/xymon/misc.d/ and
summarizes their exit codes into a single check. This is intended for
checks which turn yellow or red only very rarely and for which you
don't want to waste a whole column.
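
The aggregation idea described above can be sketched roughly as follows.
This is not the code of the misc plugin itself (which is written in Perl
and uses Hobbit.pm); the mapping from exit codes to colors and all function
names are assumptions made for the sake of the example.

```python
import os
import subprocess

# Assumed mapping from exit code to color; the real misc plugin may differ.
EXIT_TO_COLOR = {0: "green", 1: "yellow"}
SEVERITY = {"green": 0, "yellow": 1, "red": 2}


def color_for(exit_code):
    """Map an exit code to a color; anything unexpected counts as red."""
    return EXIT_TO_COLOR.get(exit_code, "red")


def summarize(exit_codes):
    """Reduce {script_name: exit_code} to (worst_color, report_lines)."""
    worst = "green"
    lines = []
    for name in sorted(exit_codes):
        color = color_for(exit_codes[name])
        lines.append(f"{color}: {name} (exit code {exit_codes[name]})")
        if SEVERITY[color] > SEVERITY[worst]:
            worst = color
    return worst, lines


def run_scripts(directory):
    """Run every executable file in `directory` and collect exit codes."""
    results = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.access(path, os.X_OK):
            completed = subprocess.run([path], capture_output=True)
            results[name] = completed.returncode
    return results
```

Calling summarize(run_scripts("/etc/xymon/misc.d")) would then yield one
overall color plus one line per script, so many rarely failing checks share
a single column.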

misc plugin:
https://anonscm.debian.org/cgit/collab-maint/hobbit-plugins.git/tree/client-ext/misc

Hobbit.pm, which is used by the misc plugin and many other plugins in
that package:
https://anonscm.debian.org/cgit/collab-maint/hobbit-plugins.git/tree/perl/Hobbit.pm

> The following was at the bottom of /var/log/messages, but it does not
> suggest any very obvious alarm strings to add other than the last line
> without the 'dm-0', but it would be nicer to have something more generic
> still as textual messages can change between different versions of the O/S.
>  
> kernel: sd 0:0:0:0: Unhandled sense code
> kernel: sd 0:0:0:0: SCSI error: return code = 0x08100002
> kernel: Result: hostbyte=invalid driverbyte=DRIVER_SENSE,SUGGEST_OK
> kernel: sda: Current: sense key: Hardware Error
> kernel:    Add. Sense: Defect list error
> kernel:
> kernel: Buffer I/O error on device dm-0, logical block 1358756
> kernel: lost page write due to I/O error on dm-0

That's probably something which can be caught via the LOG keyword in
analysis.cfg.
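
A rule along these lines might look like the fragment below. It is an
untested sketch: the host name is hypothetical, the size limit is an
arbitrary choice, and the patterns are simply taken from the log excerpt
quoted above, so they would need adjusting for other kernels. The client
also has to be told to send the log file, which is done on the server in
client-local.cfg. The "." in the patterns stands in for the space to avoid
any quoting questions.

```
# client-local.cfg (assumed section name and size limit)
[linux]
log:/var/log/messages:10240

# analysis.cfg (hypothetical host name)
HOST=myserver.example.com
    LOG /var/log/messages %(I/O.error|Hardware.Error) COLOR=red
```

As with any log-based rule, this only fires while the matching lines are
still within the log extract sent by the client, so it complements rather
than replaces a check of the actual mount state.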

		Kind regards, Axel Beckert
-- 
Axel Beckert <beckert at phys.ethz.ch>       support: +41 44 633 26 68
IT Services Group, HPT H 6                  voice: +41 44 633 41 89
Departement of Physics, ETH Zurich
CH-8093 Zurich, Switzerland		   http://nic.phys.ethz.ch/
