[Xymon] xymon hostdata module going rogue

John Thurston john.thurston at alaska.gov
Tue Dec 1 22:41:26 CET 2015


On 12/1/2015 11:51 AM, J.C. Cleaver wrote:
> On Tue, December 1, 2015 9:32 am, John Thurston wrote:
> *snip*
>
>> In this occurrence, it does not appear to be related to a "drop"
>> message. My last recorded "drop" was at 20151103-0846 and the alert
>> process didn't start logging "which is not defined" until 20151120-0007
>
> Hmm. Okay, that does change things slightly. Fortunately, that means it's
> probably not specifically caused by drops per se. Were there any other errors
> that occurred with other components around this time?

I have several instances of "Oversize status msg from " in the 
xymond.log, but those are appearing six hours before the bad behavior 
appeared in xymon_alert. I have difficulty believing they are related.

> Perhaps the system
> being low enough on memory that some re-allocations might have failed?

I think this is unlikely. The system has 256GB of RAM, and there are no 
memory caps placed on the non-global zone in which xymon is running. I 
don't have information on its size on Nov 20, but today it is using about 
400MB of RAM. All of the zones on the system are consuming less than 
10GB of the 256GB and it wouldn't have been significantly different a 
few weeks ago.

I've been doing some 'drops' today to try to break it, but haven't 
succeeded. I'll continue to beat on it and see if I can find a 
repeatable failure scenario.

fwiw, this is under 4.3.22
-- 
    Do things because you should, not just because you can.

John Thurston    907-465-8591
John.Thurston at alaska.gov
Enterprise Technology Services
Department of Administration
State of Alaska


