
Re: [hobbit] RC client release bug?





Henrik Stoerner wrote:
On Thu, Jul 13, 2006 at 07:09:11PM +0000, David Gore wrote:
We have seen this with recent snapshots and with the latest release candidate client: logfetch hangs, which causes the client to hang and all of its tests to go purple. Killing logfetch and deleting all the entries in ~/client/tmp resolves it, though we could try to be more surgical about which files we delete. This has happened on two entirely independent hosts running Solaris 8, one a SunFire 880 and the other an E4500/E5500.

Suggestions? It can run for many days before hanging.

That's obviously interesting.

When it hangs, is it just dead? Or is it hogging the CPU (as it would
do if it were in a tight loop somewhere in the code)?

CPU hogging, yes.

The hosts you monitor where this happens ... what kind of entries in
client-local.cfg do you have for them? Any "dir" entries, for instance?
Those do run an external program (du), which is always something that
is harder to control.


No "dir" entries, just "file" and "log".

When it happens again, could you please try to kill it with a
"kill -ABRT <logfetchPID>"? That should cause it to dump core,
and it will be much easier to see where it hangs with a core
dump. Once you have the core dump, running it through gdb as described
in Help->Known Problems->How to report bugs will give me much
more to work on.



Might take a few days, but we will certainly do that and see what it shows. As always, thank you for the hard work!

Regards,
Henrik

