[hobbit] RC client release bug?

Henrik Stoerner henrik at hswn.dk
Thu Jul 13 22:15:32 CEST 2006


On Thu, Jul 13, 2006 at 07:09:11PM +0000, David Gore wrote:
> 
> We have seen this with recent snapshots and the latest release candidate 
> client.  logfetch hangs which causes the client to hang and go purple 
> for all the tests.  It can be resolved by killing logfetch and deleting 
> all the entries in ~/client/tmp.  We could try to be more surgical on 
> the deleting of files.  This has happened on two very independent hosts 
> running Solaris 8, one being a SunFire 880 and another being an E4500/E5500.
> 
> Suggestions?  It can run for many days before hanging.

That's obviously interesting.

When it hangs, is it just sitting idle? Or is it hogging the CPU (as it
would if it were stuck in a tight loop somewhere in the code)?
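
One rough way to tell the two cases apart is to sample the accumulated CPU
time of the hung process twice and see whether it grew. The sketch below is
only an illustration: the helper names cpu_seconds and hang_kind are made up
here, and it assumes a ps that accepts "-o time=" and an awk with split()
and index() (on Solaris that may mean nawk rather than awk).

```shell
# Hypothetical helper: convert a ps TIME value ([dd-]hh:mm:ss or mm:ss)
# into a plain number of seconds.
cpu_seconds() {
    echo "$1" | awk '{
        days = 0; t = $1
        # An optional "dd-" prefix carries whole days.
        if (index(t, "-") > 0) { split(t, d, "-"); days = d[1]; t = d[2] }
        n = split(t, p, ":")
        secs = 0
        for (i = 1; i <= n; i++) secs = secs * 60 + p[i]
        print days * 86400 + secs
    }'
}

# Hypothetical helper: sample the CPU time of a PID twice, "interval"
# seconds apart (default 10), and report whether it increased.
hang_kind() {
    pid=$1; interval=${2:-10}
    before=$(cpu_seconds "$(ps -o time= -p "$pid")")
    sleep "$interval"
    after=$(cpu_seconds "$(ps -o time= -p "$pid")")
    if [ "$after" -gt "$before" ]; then
        echo "spinning"   # CPU time grew: likely a tight loop
    else
        echo "idle"       # no CPU used: likely blocked on something
    fi
}
```

Calling "hang_kind <logfetchPID> 10" prints "spinning" or "idle". Because ps
reports CPU time in whole seconds, a longer interval gives a more reliable
answer.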

For the hosts where this happens, what kind of entries do you have in
client-local.cfg? Any "dir" entries, for instance? Those run an external
program (du), and external programs are always harder to control.
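
A quick way to answer the "dir" question is to grep the configuration file.
This is a minimal sketch: list_dir_entries is a made-up helper name, and it
assumes that "dir" entries start at the beginning of a line in the form
"dir:<directory>". Adjust the pattern if your file is laid out differently.

```shell
# Hypothetical helper: print every "dir" entry in a client-local.cfg
# style file, prefixed with its line number.
# Assumes entries look like "dir:/some/directory" at the start of a line.
list_dir_entries() {
    grep -n '^dir:' "$1"
}
```

Run it as "list_dir_entries /path/to/client-local.cfg", substituting the real
location of the file on the Hobbit server. No output means no "dir" entries
matched the assumed pattern.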

When it happens again, could you please try killing it with
"kill -ABRT <logfetchPID>"? That should cause it to dump core, and a core
dump makes it much easier to see where it hangs. Once you have the core
dump, running it through gdb as described in Help->Known Problems->How to
report bugs will give me much more to work with.
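
As a rough sketch of those two steps (the Help page remains the reference
for the exact procedure): the helper names below are made up, the binary and
core file paths are supplied by the caller, and the gdb call assumes a gdb
recent enough to support "-batch" and "-ex". With an older gdb, start it
interactively on the binary and core file and type "bt". A core file is only
written if the core size limit was not zero when the client was started.

```shell
# Hypothetical helper: send SIGABRT to a PID so the process terminates
# and, where core dumps are enabled, leaves a core file behind.
abort_for_core() {
    kill -ABRT "$1"
}

# Hypothetical helper: print a stack backtrace from a core file.
# $1 = path to the logfetch binary, $2 = path to the core file.
backtrace_from_core() {
    binary=$1; core=$2
    gdb -batch -ex bt "$binary" "$core"
}
```

For example: "abort_for_core <logfetchPID>", then
"backtrace_from_core <path to logfetch> <path to core>", and mail the
resulting output.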


Regards,
Henrik



