[hobbit] hobbitd status-board not available [SOLVED!]
David Gore
David.Gore at mci.com
Thu Oct 13 21:37:31 CEST 2005
I am not sure whether I missed this before. I don't think I did, but it's possible.
Regardless, the problem has been resolved.
hobbitlaunch.log:2005-10-13 19:01:57 Could not get sem: No space left on device
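For anyone wanting to check their own logs for the same symptom, here is a minimal Python sketch. It assumes log lines look like the one above ("YYYY-MM-DD HH:MM:SS message", optionally with the "filename:" prefix that grep adds). The function name find_sem_errors is made up for illustration and is not part of Hobbit.

```python
import re

# Assumed line shape: "YYYY-MM-DD HH:MM:SS <message>", optionally prefixed
# with "<filename>:" as grep prints it. This is inferred from the single
# line quoted in this mail, not from the Hobbit sources.
SEM_ERROR = re.compile(
    r"(?P<stamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"Could not get sem: (?P<reason>.+)"
)


def find_sem_errors(lines):
    """Return (timestamp, reason) pairs for each 'Could not get sem' line."""
    hits = []
    for line in lines:
        match = SEM_ERROR.search(line)
        if match:
            hits.append((match.group("stamp"), match.group("reason").strip()))
    return hits


if __name__ == "__main__":
    sample = [
        "hobbitlaunch.log:2005-10-13 19:01:57 "
        "Could not get sem: No space left on device",
    ]
    for stamp, reason in find_sem_errors(sample):
        print(stamp, reason)
```

To scan a real file, pass an open file object instead of the sample list, since the function only iterates over lines.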
Solaris 9:
/etc/system:
set shmsys:shminfo_shmseg=10
# reboot # or init 6
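Before rebooting, it may be worth confirming that the line actually made it into /etc/system. The following is only a rough Python sketch: the helper name parse_system_settings is hypothetical, and it assumes that comment lines start with "*" or "#" and that tunables use the "set name=value" form shown above.

```python
def parse_system_settings(text):
    """Map each 'set' tunable name (e.g. 'module:variable') to its value."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: lines starting with '*' or '#' are comments.
        if not line or line[0] in "*#":
            continue
        parts = line.split(None, 1)
        if len(parts) < 2 or parts[0] != "set":
            continue  # other directives are ignored in this sketch
        name, sep, value = parts[1].partition("=")
        if sep:
            settings[name.strip()] = value.strip()
    return settings


if __name__ == "__main__":
    sample = "* local tuning\nset shmsys:shminfo_shmseg=10\n"
    print(parse_system_settings(sample).get("shmsys:shminfo_shmseg"))
```

On a real system the text would come from reading /etc/system. The sketch only reports what the file says, not what the running kernel is using.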
Everything works well, including multi-host enables/disables. No core dumps since making the change.
Thank you Henrik for all your hard work!
~David
*e-mail via SUSE Linux 9.3 and other open source tools.
David Gore wrote:
> David Gore wrote:
>>
>> Henrik Stoerner wrote:
>>> On Sat, Oct 08, 2005 at 04:08:57PM -0600, David Gore wrote:
>>>
>>>> What does this message mean? Typically we get this when disabling
>>>> multiple hosts. Is it a host resource issue, or is something not
>>>> replying quickly enough? We are on the snapshot from 03 October.
>>>> This has been happening over many weeks and different snapshots.
>>>> OS is Solaris 9.
>>>>
>>>
>>> It really points to a bug in the hobbitd daemon - it means that some
>>> task (usually bbdisplay) couldn't fetch the status information from
>>> the Hobbit server, which it uses to build the webpages.
>>>
>>> I'm somewhat alarmed if you have this problem with such a recent
>>> snapshot. I know there was a bug in 4.1.1 (and earlier) that could
>>> trigger this when disabling or renaming hosts, but that should not
>>> happen with the snapshot from 03 Oct.
>>>
>>>
>>>> I am pretty sure these happen as people disable hosts and the
>>>> disable fails. Although bb2.html shows them going to blue in the
>>>> history, they will not show up on the enable/disable screen, and
>>>> the disable usually shows as failed when executed.
>>>>
>>>
>>> Interesting. I'll go over that particular piece of code again to
>>> see if I can come up with an explanation. If you have a way of
>>> triggering this, let me know - in that case, I'd like you to try out
>>> some things to make sure it is fixed.
>>>
>>>
>>> Regards,
>>> Henrik
>>>
>>>
>>> To unsubscribe from the hobbit list, send an e-mail to
>>> hobbit-unsubscribe at hswn.dk
>>>
>>>
>> It is still happening with the latest 4.1.2 install. A multi-host
>> (~75+ hosts) disable worked, but later, on the enable, it looks
>> like hobbitd crashed:
>>
>> hobbit@hobbit:/export/home/hobbit/server> find . -name core
>> ./tmp/core
>> hobbit@hobbit:/export/home/hobbit/server> ls -al ./tmp/core
>> -rw------- 1 hobbit other 13630500 Oct 11 16:46 ./tmp/core
>> hobbit@hobbit:/export/home/hobbit/server> file ./tmp/core
>> ./tmp/core: ELF 32-bit MSB core file SPARC Version 1, from 'hobbitd'
>> hobbit@hobbit:/export/home/hobbit/server> gdb bin/hobbitd tmp/core
>> GNU gdb 6.0
>> Copyright 2003 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and you are
>> welcome to change it and/or distribute copies of it under certain conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB. Type "show warranty" for details.
>> This GDB was configured as "sparc-sun-solaris2.9"...
>> Core was generated by `hobbitd --pidfile=/export/home/hobbit/server/logs/hobbitd.pid --restart=/export'.
>> Program terminated with signal 6, Aborted.
>> Reading symbols from /usr/lib/libresolv.so.2...done.
>> Loaded symbols for /usr/lib/libresolv.so.2
>> Reading symbols from /usr/lib/libsocket.so.1...done.
>> Loaded symbols for /usr/lib/libsocket.so.1
>> Reading symbols from /usr/lib/libnsl.so.1...done.
>> Loaded symbols for /usr/lib/libnsl.so.1
>> Reading symbols from /usr/lib/libc.so.1...done.
>> Loaded symbols for /usr/lib/libc.so.1
>> Reading symbols from /usr/lib/libdl.so.1...done.
>> Loaded symbols for /usr/lib/libdl.so.1
>> Reading symbols from /usr/lib/libmp.so.2...done.
>> Loaded symbols for /usr/lib/libmp.so.2
>> Reading symbols from /usr/platform/SUNW,Ultra-60/lib/libc_psr.so.1...done.
>> Loaded symbols for /usr/platform/SUNW,Ultra-60/lib/libc_psr.so.1
>> #0 0xff19fff8 in _libc_kill () from /usr/lib/libc.so.1
>> (gdb) bt
>> #0 0xff19fff8 in _libc_kill () from /usr/lib/libc.so.1
>> #1 0xff136cd8 in abort () from /usr/lib/libc.so.1
>> #2 0x00021080 in sigsegv_handler (signum=10) at sig.c:57
>> #3 <signal handler called>
>> (gdb)
>>
>> Can you give me directions on how I can do a relatively clean install
>> and still retain all my historical information?
>>
>> ~David
>>
> It has cored several times now due to attempted multi-host
> re-enables. I cannot re-enable the hosts. The last time was 5 hosts
> with 1 test. I am just going to let hobbit auto-enable them when
> their disable time expires. Additionally, the disable/enable web page
> is not populated with any hosts for about ten minutes after the crash;
> that includes the info page.
>
> ~David
>