[hobbit] DEVMON stops working every now and then
Buchan Milne
bgmilne at staff.telkomsa.net
Thu Nov 12 18:30:51 CET 2009
On Wednesday, 11 November 2009 22:37:56 j.sansford at ntlworld.com wrote:
> We have the same problem - I've even got devmon configured under SMF in
> Solaris; however, it doesn't pick up the fact that it's crashed, as the process is
> still there.
It doesn't crash. As far as I can tell, eventually all the child processes
lose communication with the master process, but they are all still running,
just waiting for someone to tell them to do something.
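Because the processes stay alive, a supervisor that only checks for process existence (as SMF does here) cannot see this state. A check on how fresh the last report is can. The following is only a minimal sketch of that idea, not anything devmon ships: the function names, the use of a file's modification time as a heartbeat, and the 1800-second threshold are all assumptions for illustration.

```python
import os
import time


def is_stale(last_report_ts, now, max_age_s=1800):
    """Return True if the last report is older than max_age_s seconds.

    max_age_s=1800 is an assumed threshold, not a devmon setting.
    """
    return (now - last_report_ts) > max_age_s


def file_is_stale(path, max_age_s=1800, now=None):
    """Judge liveness by a file's mtime instead of by process existence.

    'path' is a hypothetical heartbeat file or directory that the
    monitored program is assumed to touch on every successful cycle.
    A missing path is treated as stale.
    """
    if now is None:
        now = time.time()
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        return True
    return is_stale(mtime, now, max_age_s)
```

A wrapper script could call file_is_stale() from cron and restart the service when it returns True, which covers the case where every process is still present but none of them is doing any work.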
> A quick and dirty workaround we have is to send an alert on the "dm"
> monitor going purple - this allows the on-call engineer to be alerted to
> the fact we are no longer effectively monitoring the network devices and so
> to restart the process!
>
> There must be a better way though...
Devmon has had "goes purple" problems since 0.2.2 beta. I fixed the more
frequent one before the 0.3.0 release.
Anyway, I've done some work on this, but the only production instance of
devmon I look at regularly last went purple 9 days ago, so it is slow for me
to reproduce.
If you are reproducing it more frequently, please have a look at the
devmon-devel mailing list (or the archives[1] once they have updated). I just
sent a mail there with an attached patch (against svn; it may apply to
0.3.1-beta1, but I haven't tried) that may fix the problem, allow us to narrow
it down further, or at least eliminate one aspect as the cause.
[1] http://sourceforge.net/mailarchive/forum.php?forum_name=devmon-devel
Regards,
Buchan