[Xymon] xymond_history: program crashed

Paul Grondahl paul at arrowtel.net
Fri Mar 27 00:21:19 CET 2015


Thanks for the quick response!

ps reveals that xymond_history is indeed running.

There was no crash dump. History log shows:

==> /var/log/xymon/history.log <==
2015-03-25 13:58:29 Peer at 0.0.0.0:0 failed: Broken pipe
2015-03-25 13:59:49 Peer not up, flushing message queue
2015-03-25 14:04:56 Tried to down BOARDBUSY: Invalid argument
2015-03-25 14:05:11 Peer not up, flushing message queue
2015-03-25 14:16:55 Tried to down BOARDBUSY: Invalid argument
2015-03-25 14:17:10 Peer not up, flushing message queue
2015-03-25 14:33:02 Tried to down BOARDBUSY: Invalid argument
2015-03-25 14:33:24 Peer not up, flushing message queue
2015-03-25 14:57:19 Will not update /home/xymon/data/hist/arrow35,ix,arrowtel,net.xymond_history - color unchanged (purple)
2015-03-25 18:04:47 Peer not up, flushing message queue

I think everything’s in order.
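
Since the first trimhistory run (see the original post below) was done as root, it may be worth checking for files that are still owned by the wrong user. What follows is only a sketch: the helper name list_foreign_owned is made up for illustration, and /home/xymon/data is assumed to be the data directory because the history log path above points there.

```shell
#!/bin/sh
# Hypothetical helper (not part of Xymon): print every path under DIR
# that is NOT owned by OWNER. OWNER may be a user name or a numeric uid.
# Usage: list_foreign_owned DIR OWNER
list_foreign_owned() {
    find "$1" ! -user "$2" -print
}
```

Example: list_foreign_owned /home/xymon/data xymon. Empty output would suggest nothing was left behind; anything listed could then be handed back with chown xymon:xymon as root.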

Cheers,

-paul


> On Mar 26, 2015, at 3:10 PM, J.C. Cleaver <cleaver at terabithia.org> wrote:
> 
> 
> 
> On Thu, March 26, 2015 1:05 pm, Paul Grondahl wrote:
>> Hi,
>> 
>> First post. Thanks for supporting this fantastic app!
>> 
>> In order to reclaim some disk space I ran:
>> 
>> "su xymon -c '/home/xymon/server/bin/trimhistory --drop --cutoff=`date +%s
>> --date="1 Oct 2013"` --droplogs'"
>> 
>> Afterwards xymond_history went purple, with the message "program crashed -
>> fatal error"
>> 
>> I then ran 'xymon localhost "drop <hostname> xymond_history"' and now
>> xymond_history has disappeared.
>> 
>> I should add that the first time, I ran trimhistory as root, which messed up
>> permissions on the allevents file. chown xymon:xymon appears to have fixed
>> it.
>> 
>> How do I get xymond_history working again?
>> 
>> Also, for some long-running hosts, the hostdata directory remains at over
>> 2GB. Is there a way to prune the hostdata directory?
>> 
>> Can I safely delete hostdata with "rm <hostfile>" for hosts that are no
>> longer monitored?
> 
> 
> Welcome! We're glad you like it! :)
> 
> 
> In this case, it seems like xymond_history crashed for some reason while
> the trimhistory script was running; possibly a bug with how we handle
> cases where files disappear underneath us. If a backtrace or core dump
> file was left by the process when it happened, or anything unusual in the
> history.log, it would be very helpful for us to be able to track things down.
> 
> 
> In terms of "getting xymond_history working again", it should have been
> re-launched right away by xymonlaunch -- you should see it now in your
> 'ps' listing. The current model of sending crashes like this as a dot
> requiring a manual drop more or less ensures a conscious action will be
> taken to acknowledge the issue. xymond_history in its normal operations
> doesn't send any status in, which is why the status you saw eventually
> turned purple in color.
> 
> 
> You may safely delete unneeded files out of the /hostdata/ directory
> without operational impact. (This is probably something that trimhistory
> should take care of, actually.) The only impact would be to someone
> actually trying to read the file from the web pages at that moment, since
> xymond_hostdata doesn't write out to that timestamp after initially saving
> it.
> 
> 
> 
> Regards,
> 
> -jc
> 
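
Since the reply above says unneeded hostdata files can be deleted without operational impact, pruning by age could be sketched roughly as follows. This is an illustration, not a Xymon tool: the helper name prune_hostdata is hypothetical, and the example path /home/xymon/data/hostdata is an assumption based on the hist path in the log. By default it only lists candidates; nothing is removed unless "delete" is passed.

```shell
#!/bin/sh
# Hypothetical helper (not part of Xymon): list, or with "delete" remove,
# regular files under DIR last modified more than roughly DAYS days ago.
# Usage: prune_hostdata DIR DAYS [delete]
prune_hostdata() {
    dir=$1
    days=$2
    action=${3:-list}   # default is a dry run that only prints matches
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    if [ "$action" = "delete" ]; then
        find "$dir" -type f -mtime +"$days" -exec rm -f {} +
    else
        find "$dir" -type f -mtime +"$days" -print
    fi
}
```

Example: prune_hostdata /home/xymon/data/hostdata 365 prints the candidates, and adding delete as a third argument removes them. Running it as the xymon user rather than root avoids the kind of ownership trouble described in the original post.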
