[hobbit] Hobbit server crashing
Everett, Vernon
Vernon.Everett at woodside.com.au
Thu Oct 9 15:38:17 CEST 2008
Hmm, that is what I suspected, because I found this in the log file after sending my mail.
This might be something conclusive, if I had even the foggiest idea what it meant.
Hoping some of the smarter list members can assist.
This log entry is not time-stamped, but it was the last entry before I did the restart.
From hobbitlaunch.cfg:
[bbnet]
ENVFILE /usr/lib/hobbit/server/etc/hobbitserver.cfg
NEEDS hobbitd
CMD bbtest-net --report --ping --checkresponse
LOGFILE $BBSERVERLOGS/bb-network.log
INTERVAL 5m
From bb-network.log:
*** glibc detected *** bbtest-net: double free or corruption (out): 0x000000000a96dd20 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3db7a71634]
/lib64/libc.so.6(cfree+0x8c)[0x3db7a74c5c]
bbtest-net[0x42493a]
bbtest-net[0x422bdf]
bbtest-net[0x422d7e]
bbtest-net[0x40f7d7]
bbtest-net[0x4076cc]
bbtest-net[0x4088c6]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x3db7a1d8b4]
bbtest-net[0x4039e9]
======= Memory map: ========
00400000-00430000 r-xp 00000000 fd:01 389432 /usr/lib/hobbit/server/bin/bbtest-net
00630000-00631000 rw-p 00030000 fd:01 389432 /usr/lib/hobbit/server/bin/bbtest-net
00631000-00637000 rw-p 00631000 00:00 0
00830000-00832000 rw-p 00030000 fd:01 389432 /usr/lib/hobbit/server/bin/bbtest-net
0a8d1000-0aa1b000 rw-p 0a8d1000 00:00 0
3224600000-3224638000 r-xp 00000000 fd:01 68061 /usr/lib64/libldap-2.3.so.0.2.15
3224638000-3224838000 ---p 00038000 fd:01 68061 /usr/lib64/libldap-2.3.so.0.2.15
3224838000-322483a000 rw-p 00038000 fd:01 68061 /usr/lib64/libldap-2.3.so.0.2.15
3366400000-3366443000 r-xp 00000000 fd:00 75879 /lib64/libssl.so.0.9.8b
3366443000-3366643000 ---p 00043000 fd:00 75879 /lib64/libssl.so.0.9.8b
3366643000-3366649000 rw-p 00043000 fd:00 75879 /lib64/libssl.so.0.9.8b
3368000000-3368125000 r-xp 00000000 fd:00 75876 /lib64/libcrypto.so.0.9.8b
3368125000-3368325000 ---p 00125000 fd:00 75876 /lib64/libcrypto.so.0.9.8b
3368325000-3368344000 rw-p 00125000 fd:00 75876 /lib64/libcrypto.so.0.9.8b
3368344000-3368348000 rw-p 3368344000 00:00 0
385c600000-385c63b000 r-xp 00000000 fd:00 75779 /lib64/libsepol.so.1
385c63b000-385c83b000 ---p 0003b000 fd:00 75779 /lib64/libsepol.so.1
385c83b000-385c83c000 rw-p 0003b000 fd:00 75779 /lib64/libsepol.so.1
385c83c000-385c846000 rw-p 385c83c000 00:00 0
385ca00000-385ca15000 r-xp 00000000 fd:00 75786 /lib64/libselinux.so.1
385ca15000-385cc15000 ---p 00015000 fd:00 75786 /lib64/libselinux.so.1
385cc15000-385cc17000 rw-p 00015000 fd:00 75786 /lib64/libselinux.so.1
385cc17000-385cc18000 rw-p 385cc17000 00:00 0
385ce00000-385ce8f000 r-xp 00000000 fd:01 68057 /usr/lib64/libkrb5.so.3.3
385ce8f000-385d08e000 ---p 0008f000 fd:01 68057 /usr/lib64/libkrb5.so.3.3
385d08e000-385d092000 rw-p 0008e000 fd:01 68057 /usr/lib64/libkrb5.so.3.3
385d600000-385d608000 r-xp 00000000 fd:01 68055 /usr/lib64/libkrb5support.so.0.1
385d608000-385d807000 ---p 00008000 fd:01 68055 /usr/lib64/libkrb5support.so.0.1
385d807000-385d808000 rw-p 00007000 fd:01 68055 /usr/lib64/libkrb5support.so.0.1
385da00000-385da24000 r-xp 00000000 fd:01 68056 /usr/lib64/libk5crypto.so.3.1
385da24000-385dc23000 ---p 00024000 fd:01 68056 /usr/lib64/libk5crypto.so.3.1
385dc23000-385dc25000 rw-p 00023000 fd:01 68056 /usr/lib64/libk5crypto.so.3.1
385de00000-385de2c000 r-xp 00000000 fd:01 68058 /usr/lib64/libgssapi_krb5.so.2.2
385de2c000-385e02c000 ---p 0002c000 fd:01 68058 /usr/lib64/libgssapi_krb5.so.2.2
385e02c000-385e02e000 rw-p 0002c000 fd:01 68058 /usr/lib64/libgssapi_krb5.so.2.2
3af4400000-3af440d000 r-xp 00000000 fd:01 68095 /usr/lib64/liblber-2.3.so.0.2.15
3af440d000-3af460d000 ---p 0000d000 fd:01 68095 /usr/lib64/liblber-2.3.so.0.2.15
3af460d000-3af460e000 rw-p 0000d000 fd:01 68095 /usr/lib64/liblber-2.3.so.0.2.15
3db7600000-3db761a000 r-xp 00000000 fd:00 75791 /lib64/ld-2.5.so
3db781a000-3db781b000 r--p 0001a000 fd:00 75791 /lib64/ld-2.5.so
3db781b000-3db781c000 rw-p 0001b000 fd:00 75791 /lib64/ld-2.5.so
3db7a00000-3db7b4a000 r-xp 00000000 fd:00 75797 /lib64/libc-2.5.so
3db7b4a000-3db7d49000 ---p 0014a000 fd:00 75797 /lib64/libc-2.5.so
3db7d49000-3db7d4d000 r--p 00149000 fd:00 75797 /lib64/libc-2.5.so
3db7d4d000-3db7d4e000 rw-p 0014d000 fd:00 75797 /lib64/libc-2.5.so
3db7d4e000-3db7d53000 rw-p 3db7d4e000 00:00 0
3db7e00000-3db7e02000 r-xp 00000000 fd:00 75840 /lib64/libdl-2.5.so
3db7e02000-3db8002000 ---p 00002000 fd:00 75840 /lib64/libdl-2.5.so
3db8002000-3db8003000 r--p 00002000 fd:00 75840 /lib64/libdl-2.5.so
3db8003000-3db8004000 rw-p 00003000 fd:00 75840 /lib64/libdl-2.5.so
3db8200000-3db8218000 r-xp 00000000 fd:01 67958 /usr/lib64/libsasl2.so.2.0.22
3db8218000-3db8418000 ---p 00018000 fd:01 67958 /usr/lib64/libsasl2.so.2.0.22
3db8418000-3db8419000 rw-p 00018000 fd:01 67958 /usr/lib64/libsasl2.so.2.0.22
3db8600000-3db8614000 r-xp 00000000 fd:01 67257 /usr/lib64/libz.so.1.2.3
3db8614000-3db8813000 ---p 00014000 fd:01 67257 /usr/lib64/libz.so.1.2.3
3db8813000-3db8814000 rw-p 00013000 fd:01 67257 /usr/lib64/libz.so.1.2.3
3db9a00000-3db9a09000 r-xp 00000000 fd:00 76086 /lib64/libcrypt-2.5.so
3db9a09000-3db9c08000 ---p 00009000 fd:00 76086 /lib64/libcrypt-2.5.so
3db9c08000-3db9c09000 r--p 00008000 fd:00 76086 /lib64/libcrypt-2.5.so
3db9c09000-3db9c0a000 rw-p 00009000 fd:00 76086 /lib64/libcrypt-2.5.so
3db9c0a000-3db9c38000 rw-p 3db9c0a000 00:00 0
3dba600000-3dba611000 r-xp 00000000 fd:00 76082 /lib64/libresolv-2.5.so
3dba611000-3dba811000 ---p 00011000 fd:00 76082 /lib64/libresolv-2.5.so
3dba811000-3dba812000 r--p 00011000 fd:00 76082 /lib64/libresolv-2.5.so
3dba812000-3dba813000 rw-p 00012000 fd:00 76082 /lib64/libresolv-2.5.so
3dba813000-3dba815000 rw-p 3dba813000 00:00 0
3dbaa00000-3dbaa02000 r-xp 00000000 fd:00 76083 /lib64/libcom_err.so.2.1
3dbaa02000-3dbac01000 ---p 00002000 fd:00 76083 /lib64/libcom_err.so.2.1
3dbac01000-3dbac02000 rw-p 00001000 fd:00 76083 /lib64/libcom_err.so.2.1
3dbba00000-3dbba02000 r-xp 00000000 fd:00 76080 /lib64/libkeyutils-1.2.so
3dbba02000-3dbbc01000 ---p 00002000 fd:00 76080 /lib64/libkeyutils-1.2.so
3dbbc01000-3dbbc02000 rw-p 00001000 fd:00 76080 /lib64/libkeyutils-1.2.so
3dbc200000-3dbc20d000 r-xp 00000000 fd:00 75803 /lib64/libgcc_s-4.1.2-20080102.so.1
3dbc20d000-3dbc40d000 ---p 0000d000 fd:00 75803 /lib64/libgcc_s-4.1.2-20080102.so.1
3dbc40d000-3dbc40e000 rw-p 0000d000 fd:00 75803 /lib64/libgcc_s-4.1.2-20080102.so.1
2b224ebb4000-2b224ebb6000 rw-p 2b224ebb4000 00:00 0
2b224ebc1000-2b224ebc9000 rw-p 2b224ebc1000 00:00 0
2b224ebc9000-2b224ebd3000 r-xp 00000000 fd:00 75873 /lib64/libnss_files-2.5.so
2b224ebd3000-2b224edd2000 ---p 0000a000 fd:00 75873 /lib64/libnss_files-2.5.so
2b224edd2000-2b224edd3000 r--p 00009000 fd:00 75873 /lib64/libnss_files-2.5.so
2b224edd3000-2b224edd4000 rw-p 0000a000 fd:00 75873 /lib64/libnss_files-2.5.so
2b2250000000-2b2250021000 rw-p 2b2250000000 00:00 0
2b2250021000-2b2254000000 ---p 2b2250021000 00:00 0
7fff5bee0000-7fff5bef6000 rw-p 7fff5bee0000 00:00 0 [stack]
ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0 [vdso]
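For anyone wanting to turn the bbtest-net frames in the backtrace above into function names, here is a minimal Python sketch. It assumes binutils' addr2line is available and that the bbtest-net binary still carries symbols (a stripped binary will only yield "??"). The helper names parse_frames and addr2line_command are hypothetical, made up for this illustration. Because the memory map shows bbtest-net mapped at 00400000, the frame addresses should be usable as-is.

```python
import re

# Matches glibc backtrace lines such as:
#   bbtest-net[0x42493a]
#   /lib64/libc.so.6(cfree+0x8c)[0x3db7a74c5c]
FRAME_RE = re.compile(
    r"^(?P<module>[^\[\(]+)(?:\((?P<symbol>[^)]*)\))?\[(?P<addr>0x[0-9a-fA-F]+)\]$"
)


def parse_frames(log_text):
    """Return (module, symbol_or_None, address) for each backtrace frame."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append(
                (match.group("module"), match.group("symbol"), match.group("addr"))
            )
    return frames


def addr2line_command(frames, module="bbtest-net",
                      binary="/usr/lib/hobbit/server/bin/bbtest-net"):
    """Build an addr2line argument list for the frames belonging to `module`."""
    addresses = [addr for mod, _symbol, addr in frames if mod == module]
    return ["addr2line", "-f", "-e", binary] + addresses


if __name__ == "__main__":
    sample = """\
/lib64/libc.so.6(cfree+0x8c)[0x3db7a74c5c]
bbtest-net[0x42493a]
bbtest-net[0x422bdf]
"""
    # Print the command rather than running it, so it can be reviewed first.
    print(" ".join(addr2line_command(parse_frames(sample))))
```

The output is only a command line to review and run by hand on the Hobbit server; the binary path is taken from the memory map in the log.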
-----Original Message-----
From: Henrik Stoerner [mailto:henrik at hswn.dk]
Sent: Thursday, 9 October 2008 9:18 PM
To: hobbit at hswn.dk
Subject: Re: [hobbit] Hobbit server crashing
In <A3D12FAD74FC8B46991703F40C182BAB01078343 at permls102.wde.woodside.com.au> "Everett, Vernon" <Vernon.Everett at woodside.com.au> writes:
>My Hobbit server crashed and died.
>This happened before, a few months ago, and I shrugged it off -
>sometimes sh1t happens.
>Then it happened last week again. This time I was concerned.
>Now it has just happened again, about 40 minutes ago.
>I tried to restart hobbit, without much luck, then I walked away, put
>my son into bed, and then tried again.
>This time it worked.
>The logs never showed anything conclusive, but maybe I just don't know
>what I am looking for.
>The symptoms were the same all three times.
>All "passive" server based tests go purple.
>By passive server based, I mean conn, http, content, ssh, ftp, ftps, etc.
>The tests that do not rely on a client.
>Also went purple, was bbd and bbtest.
>All client based tests were unaffected. Graphing worked as normal. And
>alerts were being sent out.
Your description sounds very much as if the only thing that stopped was the network tests (bbtest-net). Since the client-side tests are still updating, the network tests go purple and alerts still go out, I think that is where the problem is. "bbtest" going purple also points in this direction.
Next time it happens, see if there's a "bbtest-net" process running (and possibly a "hobbitping" or "fping" process as well); if there is, kill it with a "kill -6" to make it dump core. Then do the usual stuff of getting a stacktrace from the core file ( http://www.hswn.dk/hobbit/help/known-issues.html#bugreport )
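As a rough illustration of the "find it, then kill -6" step, the Python sketch below looks for a hung bbtest-net and sends it SIGABRT (signal 6). It assumes a Linux /proc filesystem, enough privilege to signal the process, and a core size limit in the process's environment that allows a core file to be written. The function names find_pids and abort_for_core are hypothetical. It defaults to a dry run, so nothing is killed unless asked.

```python
import os
import signal


def find_pids(name, proc_root="/proc"):
    """Return sorted PIDs whose argv[0] basename equals `name`."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as handle:
                argv0 = handle.read().split(b"\0")[0].decode(errors="replace")
        except OSError:
            continue  # process exited or is not readable
        if os.path.basename(argv0) == name:
            pids.append(int(entry))
    return sorted(pids)


def abort_for_core(name="bbtest-net", dry_run=True, proc_root="/proc"):
    """Send SIGABRT (the same as `kill -6`) to every process named `name`."""
    pids = find_pids(name, proc_root)
    for pid in pids:
        if not dry_run:
            os.kill(pid, signal.SIGABRT)
    return pids


if __name__ == "__main__":
    # Report any ping helpers too, but only bbtest-net is targeted here.
    for helper in ("hobbitping", "fping"):
        print(helper, find_pids(helper))
    print("bbtest-net (dry run):", abort_for_core(dry_run=True))
```

Calling abort_for_core(dry_run=False) sends the signal; the stacktrace itself still has to be extracted from the core file as described on the bug-report page linked above.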
Are you running bbtest-net with the "--no-ares" option? If so, a hung or slow DNS server can make your network tests run very slowly.
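One quick way to check whether DNS is the slow part is to time a few lookups from the Hobbit server. The Python sketch below is only an assumption-laden illustration: the names time_lookup and slow_lookups, the one-second threshold and the example hostname are all made up. It uses the system resolver, so it says nothing about how bbtest-net itself resolves names.

```python
import socket
import time


def time_lookup(host, resolver=socket.gethostbyname):
    """Resolve `host` and return (host, address_or_None, seconds, error_or_None)."""
    start = time.monotonic()
    try:
        address = resolver(host)
        error = None
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        address = None
        error = str(exc)
    return host, address, time.monotonic() - start, error


def slow_lookups(hosts, threshold=1.0, resolver=socket.gethostbyname):
    """Return the lookups that failed or took longer than `threshold` seconds."""
    results = [time_lookup(host, resolver) for host in hosts]
    return [r for r in results if r[3] is not None or r[2] > threshold]


if __name__ == "__main__":
    # Replace with hostnames taken from your own bb-hosts file.
    for host, address, seconds, error in slow_lookups(["www.example.com"]):
        print("%s: %s after %.2fs" % (host, error or address, seconds))
```

If lookups regularly show up in that list, the DNS server is worth a look before digging further into bbtest-net.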
Henrik