[xymon] Problem with disk monitoring

Johan Sjöberg johan.sjoberg at deltamanagement.se
Thu Sep 23 23:02:21 CEST 2010


I think this is somehow related to oversized status messages. We were having problems with this on 4.3.0 beta, and we also had a lot of oversized status messages (ports etc). Since we increased the max message size, we have not seen the problem with the disk test.

/Johan

From: Shailesh Paudyal [mailto:shailesh.paudyal at gmail.com]
Sent: 23 September 2010 22:57
To: xymon at xymon.com
Subject: Re: [xymon] Problem with disk monitoring

Thanks Henrik,
But I still see the problem. Please see the following alert, which came from Xymon a week or so ago:


red Su] Aug 22 01:26:51 EDT 2010 - Filesystems NOT ok
&red 21% / (20461936% used) has reached the PANIC level (95%)
&red 1% /app (65009052% used) has reached the PANIC level (95%)
&red 1% /home (112112360% used) has reached the PANIC level (95%)
&red 6% /var (8933348% used) has reached the PANIC level (95%)
&red 1% /tmp (18638136% used) has reached the PANIC level (95%)
&red 49% /boot (48938% used) has reached the PANIC level (95%)
&red 11% /u01 (218810168% used) has reached the PANIC level (95%)
&red 7% /u04 (228560164% used) has reached the PANIC level (95%)
&red 35% /u02 (1154279480% used) has reached the PANIC level (95%)
&red 24% /old_u02 (1507070236% used) has reached the PANIC level (95%)



Filesystem         1
24-]locks      Used Available Capacity Mounted on
/dev/sda5             27054004   5195620  20461936      21% /
/dev/sdb1             68814716    253696  65009052       1% /app
/dev/sdc2            118417044    192356 112112360       1% /home
/dev/sda3              9920624    475208   8933348       6% /var
/dev/sdc1             19840892    178616  18638136       1% /tmp
/dev/sda1               101086     46929     48938      49% /boot
/dev/mapper/VolGroup01-u01 258022788  26105832 218810168      11% /u01
/dev/mapper/VolGroup04-u04 258022788  16355836 228560164       7% /u04
/dev/mapper/VolGroup03-u03 1857784872 609135396 1154279480      35% /u02
/dev/mapper/VolGroup02-u02 2064204960 452279172 1507070236      24% /old_u02

On Thu, Sep 23, 2010 at 3:45 PM, Henrik Størner <henrik at hswn.dk> wrote:
This is a somewhat old post, but I'm responding anyway ...

In <AANLkTinFdgiz2ie3NCxhuop8picZj6izZPdH6fESQfif at mail.gmail.com> Steve Holmes <sholmes42 at mac.com> writes:

>>>> Please see below, there is a problem with disk monitoring on one of the
>>>> servers. Can someone tell me if I did something wrong?
>>>>
>>>> W]d Jul 28 10:34:31 EDT 2010 - Filesystems NOT ok
>>>>
>>>>  7% / (8816628% used) has reached the PANIC level (95%)
>>>>  38% /u01 (90371708% used) has reached the PANIC level (95%)
>>>>
>>>> Filesystem         10
>>>> 4-b]ocks      Used Available Capacity Mounted on
>>>> /dev/sda9              9920592    591896   8816628       7% /
>>>> /dev/sda10           152435112  54195172  90371708      38% /u01
>>>> /dev/sda8              9920592    154056   9254468       2% /tmp
>It appears that Xymon has slipped one field to the left in parsing the df
>output. The string at the beginning of each of the lines before the actual
>df output should be the name of the filesystem (plus an icon, but we'll
>ignore that for now). Then it is using the available number as the percent
>used, which, of course, is huge.
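
To illustrate the field slip described above, here is a minimal Python sketch. It is not Xymon's actual parser (Xymon is written in C, and the real bug had not been located at the time of this thread). The names `parse_df_line` and `is_plausible_percent` and the `shift` parameter are hypothetical, introduced only for this illustration. It assumes six-column POSIX-style df output and shows how reading the capacity one column too far left turns the Available figure into the reported percentage.

```python
def parse_df_line(line, shift=0):
    """Split one six-column df line and pick out mount point and percent used.

    shift=0 reads the columns as df prints them.
    shift=-1 reads the capacity one column too far left, imitating
    the symptom in the alert (an assumption about the failure mode).
    """
    # Expected layout: device, blocks, used, available, capacity, mount point
    fields = line.split()
    capacity_index = 4 + shift
    return {
        "mount": fields[5],
        "percent": int(fields[capacity_index].rstrip("%")),
    }


def is_plausible_percent(value):
    """A used-percentage outside 0..100 signals a misparsed line."""
    return 0 <= value <= 100


line = "/dev/sda9              9920592    591896   8816628       7% /"

good = parse_df_line(line)
slipped = parse_df_line(line, shift=-1)

print(good["percent"], is_plausible_percent(good["percent"]))
print(slipped["percent"], is_plausible_percent(slipped["percent"]))
```

With the slipped index, the value reported as "percent used" for / is the Available column from the df line, which matches the 8816628% in the alert. A range check like the one sketched here would at least flag such a line as misparsed instead of raising a PANIC alert.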

>I don't know if this is causing the problem but there is some funkiness with
>the first line of the df output. It is broken between the 10 and the 4 and
>there is a ']' instead of an 'l' in the word "blocks". Maybe this is a
>cut/paste error, but if not, it is certainly not right.

There is a bug somewhere in the Xymon 4.3.0-beta code with the "df"
status handling. I've seen it cause random RRD files to appear for
systems that don't have such filesystems, and occasionally it would
also result in this behaviour where a disk status goes wild.

I haven't been able to nail it yet, mostly because it seems to happen
very rarely and completely without any pattern. It would seem like
some sort of memory corruption problem, but I've had the client-message
handler running for days with valgrind (memory access checker) enabled,
and it came up with nothing.

Very annoying.


Regards,
Henrik



--
Shailesh K. Paudyal
shailesh.paudyal at gmail.com