[hobbit] RC client release bug?

David Gore David.Gore at verizonbusiness.com
Fri Jul 14 01:12:03 CEST 2006



Henrik Stoerner wrote:
> On Thu, Jul 13, 2006 at 07:09:11PM +0000, David Gore wrote:
>   
>> We have seen this with recent snapshots and the latest release candidate 
>> client.  logfetch hangs, which causes the client to hang and all the 
>> tests to go purple.  It can be resolved by killing logfetch and deleting 
>> all the entries in ~/client/tmp.  We could try to be more surgical about 
>> which files we delete.  This has happened on two independent hosts 
>> running Solaris 8, one a SunFire 880 and the other an E4500/E5500.
>>
>> Suggestions?  It can run for many days before hanging.
>>     
>
> That's obviously interesting.
>
> When it hangs, is it just dead? Or is it hogging the CPU (as it would
> do if it were in a tight loop somewhere in the code)?
>
>

CPU hogging, yes.

> For the hosts you monitor where this happens, what kind of entries do
> you have in client-local.cfg? Any "dir" entries, for instance?
> Those run an external program (du), which is always harder to control.
>
>

No "dir" entries, just "file" and "log".

> When it happens again, could you please try to kill it with
> "kill -ABRT <logfetchPID>"? That should cause it to dump core,
> and it will be much easier to see where it hangs with a core
> dump. Once you have the core dump, running it through gdb as described
> in Help->Known Problems->How to report bugs will give me much
> more to work on.
>
>
>

It might take a few days, but we will certainly do that and see what it 
shows.  As always, thank you for the hard work!

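
For reference, the procedure Henrik describes above could be scripted 
roughly as follows.  This is only a sketch: the function names 
dump_core_of and backtrace_core, the temporary gdb command file, and the 
placeholder paths in the usage comments are assumptions made for 
illustration, not anything taken from the Hobbit documentation.  It also 
assumes the core file size limit in effect for logfetch allows a core 
to be written.

```shell
#!/bin/sh
# Hypothetical helper: send SIGABRT to one PID so that the process dumps core.
# Returns 1 if the PID does not exist or cannot be signalled.
dump_core_of() {
    pid=$1
    kill -0 "$pid" 2>/dev/null || {
        echo "no such process: $pid" >&2
        return 1
    }
    kill -ABRT "$pid" && echo "sent SIGABRT to $pid"
}

# Hypothetical helper: print a stack backtrace from a core file.
# $1 = path to the binary, $2 = path to the core file.
backtrace_core() {
    cmds=/tmp/gdbcmds.$$          # temporary gdb command file
    echo "bt" > "$cmds"
    gdb -batch -x "$cmds" "$1" "$2"
    rc=$?
    rm -f "$cmds"
    return $rc
}

# Example usage (paths are placeholders, and this assumes that pgrep
# finds exactly one logfetch process):
#   dump_core_of `pgrep -x logfetch`
#   backtrace_core /path/to/logfetch /path/to/core
```

The backtrace printed by the second function is what would go into the 
bug report.
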
> Regards,
> Henrik



