[hobbit] RC client release bug?

David Gore David.Gore at verizonbusiness.com
Fri Jul 14 04:15:04 CEST 2006



David Gore wrote:
>
>
> David Gore wrote:
>>
>>
>> David Gore wrote:
>>>
>>>
>>> Henrik Stoerner wrote:
>>>> On Thu, Jul 13, 2006 at 07:09:11PM +0000, David Gore wrote:
>>>>  
>>>>> We have seen this with recent snapshots and the latest release 
>>>>> candidate client.  logfetch hangs which causes the client to hang 
>>>>> and go purple for all the tests.  It can be resolved by killing 
>>>>> logfetch and deleting all the entries in ~/client/tmp.  We could 
>>>>> try to be more surgical on the deleting of files.  This has 
>>>>> happened on two very independent hosts running Solaris 8, one 
>>>>> being a SunFire 880 and another being an E4500/E5500.
>>>>>
>>>>> Suggestions?  It can run for many days before hanging.
>>>>>     
>>>>
>>>> That's obviously interesting.
>>>>
>>>> When it hangs, is it just dead ? Or is it hogging the cpu (as it would
>>>> do if it were in a tight loop somewhere in the code) ?
>>>>
>>>>   
>>> CPU hogging, yes.
>>>> The hosts you monitor where this happens ... what kind of entries 
>>>> in client-local.cfg do you have for them ? Any "dir" entries, for 
>>>> instance?
>>>> Those do run an external program (du), which is always something that
>>>> is harder to control.
>>>>
>>>>   
>>> No "dir" entries, just "file" and "log".
>>>> When it happens again, could you please try and kill it with a 
>>>> "kill -ABRT <logfetchPID>" ? That should cause it to dump core,
>>>> and it will be much easier to see where it hangs with a core
>>>> dump. Once you have the core dump, running it through gdb as described
>>>> in the Help->Known Problems->How to report bugs will give me much
>>>> more to work on.
>>>>
>>>>
>>>>   
>>> Might take a few days, but we will certainly do that and see what it 
>>> shows.  As always thank you for the hard work!
>>
>> Sooner than I expected, here is the backtrace:
>>
>> GNU gdb 6.0
>> Copyright 2003 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and you are
>> welcome to change it and/or distribute copies of it under certain conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB.  Type "show warranty" for details.
>> This GDB was configured as "sparc-sun-solaris2.8"...
>> Core was generated by `/export/home/nmsbb/client/bin/logfetch /export/home/nmsbb/client/tmp/logfetch.o'.
>> Program terminated with signal 6, Aborted.
>> Reading symbols from /usr/lib/libc.so.1...done.
>> Loaded symbols for /usr/lib/libc.so.1
>> Reading symbols from /usr/lib/libdl.so.1...done.
>> Loaded symbols for /usr/lib/libdl.so.1
>> Reading symbols from /usr/platform/SUNW,Sun-Fire-880/lib/libc_psr.so.1...done.
>> Loaded symbols for /usr/platform/SUNW,Sun-Fire-880/lib/libc_psr.so.1
>> #0  0xff3906e8 in memcpy () from /usr/platform/SUNW,Sun-Fire-880/lib/libc_psr.so.1
>> (gdb) bt
>> #0  0xff3906e8 in memcpy () from /usr/platform/SUNW,Sun-Fire-880/lib/libc_psr.so.1
>> #1  0x00012e10 in logdata (filename=0xffbef5a0 "", logdef=0x38738, truncated=0xffbef6c4)
>>     at logfetch.c:192
>> #2  0x000142f4 in main (argc=215040, argv=0x34c00) at logfetch.c:844
>>
>> I took a look at one of my co-worker's entries in client-local.cfg:
>>
>> ignore DEBUG|WARN|^at.*)$
>>
>> I put a backslash in front of the right paren:
>>
>> ignore DEBUG|WARN|^at.*\)$
>>
>> Perhaps that may have been why it was hanging?
>>
>>
>>
> OK, so that did not work. Here are some horrible stats, by the way:
>
>  1889 nmsbb      1   0    0   33M   33M cpu/0  159:58 24.89% logfetch
> 24868 nmsbb      1   0    0 7144K 6976K cpu/3   34:16 24.88% logfetch
>
> Not good.
Sorry, should have included this:

(gdb) bt
#0  0xff3906e8 in memcpy () from /usr/platform/SUNW,Sun-Fire-880/lib/libc_psr.so.1
#1  0x00012e10 in logdata (filename=0xffbef5a8 "", logdef=0x38738, truncated=0xffbef6cc)
    at logfetch.c:192
#2  0x000142f4 in main (argc=215040, argv=0x34c00) at logfetch.c:844
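
For what it is worth, the two versions of the ignore pattern quoted
above can be checked for syntax outside of the client. This is only a
rough sketch: it uses Python's re module as a stand-in for a Perl-style
regex engine, which is an assumption and not what logfetch itself runs,
and check_pattern is a made-up helper name. It only shows whether each
pattern compiles; it says nothing about the hang in memcpy.

```python
import re


def check_pattern(pattern):
    """Return None if the pattern compiles, otherwise the error text.

    Hypothetical helper; Python's re is only a stand-in engine here.
    """
    try:
        re.compile(pattern)
    except re.error as exc:
        return str(exc)
    return None


# The two "ignore" patterns from client-local.cfg, as quoted in this thread.
original = r"DEBUG|WARN|^at.*)$"   # closing paren is not escaped
escaped = r"DEBUG|WARN|^at.*\)$"   # closing paren is escaped

for label, pattern in (("original", original), ("escaped", escaped)):
    error = check_pattern(pattern)
    if error is None:
        print(label, "compiles")
    else:
        print(label, "rejected:", error)
```

With this stand-in engine the unescaped version is rejected as a syntax
error and the escaped one is accepted, so the backslash is the right fix
for the pattern regardless of whether it explains the CPU hogging.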

>
>>>> Regards,
>>>> Henrik
>>>>
>>>>
>>>> To unsubscribe from the hobbit list, send an e-mail to
>>>> hobbit-unsubscribe at hswn.dk