Scheduled disable causes crash?

Johan Sjöberg johan.sjoberg at deltamanagement.se
Tue Nov 23 08:11:48 CET 2010


Hi.

This happened again this morning at 03:00 when we had a scheduled disable. When Xymon stops working, it generates a core file every 2 or 3 minutes in /usr/local/xymon/data/acks.

They all look like this (but with a different address in the "#0  0x0076d402 in __kernel_vsyscall ()" line):

[root at mon01 acks]# gdb /usr/local/xymon/server/bin/bbgen core.9464
GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-23.el5_5.2)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/local/xymon/server/bin/bbgen...done.
Reading symbols from /lib/libpcre.so.0...(no debugging symbols found)...done.
Loaded symbols for /lib/libpcre.so.0
Reading symbols from /lib/librt.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/librt.so.1
Reading symbols from /lib/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/libpthread.so.0...(no debugging symbols found)...done.
Loaded symbols for /lib/libpthread.so.0
Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2
Core was generated by `bbgen --recentgifs --subpagecolumns=2 --report'.
Program terminated with signal 6, Aborted.
#0  0x0076d402 in __kernel_vsyscall ()

Do you have any idea what might be causing this, or how we can proceed to try to find out more about the problem?

/Johan

From: Johan Sjöberg [mailto:johan.sjoberg at deltamanagement.se]
Sent: 20 October 2010 12:32
To: xymon at xymon.com
Subject: [xymon] RE: Scheduled disable causes crash?

Hi.

This happened again yesterday morning. We found that core dumps had been created. Here is what gdb tells us about them. Does anyone have a clue about what might be causing this problem? The Xymon server is running CentOS 5.5 32-bit.

[root at mon01 acks]# file core.28581
core.28581: ELF 32-bit LSB core file Intel 80386, version 1 (SYSV), SVR4-style, from 'bbgen'

[root at mon01 acks]# gdb /usr/local/xymon/server/bin/bbgen core.28581
Reading symbols from /usr/local/xymon/server/bin/bbgen...done.

warning: .dynamic section for "/lib/libc.so.6" is not at the expected address

warning: difference appears to be caused by prelink, adjusting expectations
Reading symbols from /lib/libpcre.so.0...(no debugging symbols found)...done.
Loaded symbols for /lib/libpcre.so.0
Reading symbols from /lib/librt.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/librt.so.1
Reading symbols from /lib/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/libpthread.so.0...(no debugging symbols found)...done.
Loaded symbols for /lib/libpthread.so.0
Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2
Core was generated by `bbgen --recentgifs --subpagecolumns=2 --report'.
Program terminated with signal 6, Aborted.
#0 0x00b2f402 in __kernel_vsyscall ()

/Johan

From: Johan Sjöberg [mailto:johan.sjoberg at deltamanagement.se]
Sent: 5 October 2010 11:10
To: xymon at xymon.com
Subject: [xymon] Scheduled disable causes crash?

Hi!

During the last month, we have had some problems with Xymon when using scheduled disables (added from the web interface).

The first problem we had was on September 17th, when hobbitd crashed during or after running the scheduled disable. We got the following error in hobbit.log:
2010-09-17 05:00:00 Fatal error in select: Bad file descriptor
2010-09-17 05:00:00 Setup complete


After that incident, I enabled verbose logging for hobbitd, and turned up the logging for some other processes as well. This morning at 06:00 we had a new scheduled disable. This time hobbitd just stopped logging after running the disable. The web interface did not work correctly: when clicking a test, an "Internal server error" message was displayed. Also, the bbgen test went purple (last update at 05:59:26).

The only errors I have been able to find in the logs are in bb-display.log:
2010-10-05 06:00:26 xstrdup: Cannot dup NULL string
2010-10-05 06:01:26 xstrdup: Cannot dup NULL string
2010-10-05 06:02:27 xstrdup: Cannot dup NULL string
2010-10-05 06:12:30 xstrdup: Cannot dup NULL string
2010-10-05 06:13:35 xstrdup: Cannot dup NULL string
2010-10-05 06:14:39 xstrdup: Cannot dup NULL string
2010-10-05 06:24:40 xstrdup: Cannot dup NULL string
2010-10-05 06:25:41 xstrdup: Cannot dup NULL string
2010-10-05 06:26:42 xstrdup: Cannot dup NULL string
2010-10-05 06:36:47 xstrdup: Cannot dup NULL string
2010-10-05 06:37:48 xstrdup: Cannot dup NULL string
2010-10-05 06:39:54 xstrdup: Cannot dup NULL string
2010-10-05 06:40:54 xstrdup: Cannot dup NULL string
2010-10-05 06:41:54 xstrdup: Cannot dup NULL string
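
For illustration only, here is one way the log line above and the "signal 6, Aborted" core files could be connected: a string-duplication wrapper that refuses NULL input would typically log a message and then call abort(), which raises SIGABRT. This is a minimal sketch under that assumption, not Xymon's actual source; guarded_strdup and signal_from_dup are hypothetical names used only for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical wrapper: NOT copied from Xymon, only modeled on the
 * "xstrdup: Cannot dup NULL string" message seen in bb-display.log. */
static char *guarded_strdup(const char *s)
{
    char *copy;

    if (s == NULL) {
        fprintf(stderr, "xstrdup: Cannot dup NULL string\n");
        abort();    /* raises SIGABRT; a core file is written if core dumps are enabled */
    }

    copy = strdup(s);
    if (copy == NULL) {
        fprintf(stderr, "xstrdup: Out of memory\n");
        abort();
    }
    return copy;
}

/* Hypothetical helper: run the wrapper in a child process and return the
 * number of the signal that killed the child, 0 if it exited normally,
 * or -1 if waiting for the child failed. */
static int signal_from_dup(const char *s)
{
    int status = 0;
    pid_t pid = fork();

    assert(pid >= 0);    /* this demo assumes fork() succeeds */
    if (pid == 0) {
        free(guarded_strdup(s));
        _exit(0);
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

With this sketch, signal_from_dup(NULL) returns SIGABRT (signal 6 on Linux), the same signal gdb reports for the bbgen cores, while a non-NULL argument returns 0. If bbgen behaves like this, the interesting question is which caller passes the NULL string, which the single-frame backtraces above do not show.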

Xymon was restarted at 06:43.
Here is the hobbit.log from that time. It was too large to paste into the mail.
http://pastebin.com/JuU0BHje

We are running Xymon 4.2.3 on CentOS 5.

Best regards,
Johan Sjöberg