[Xymon] xymon hostdata module going rogue

Andy Smith abs at shadymint.com
Sun Aug 30 11:56:26 CEST 2015


J.C. Cleaver wrote:
> On Fri, August 28, 2015 3:16 pm, John Thurston wrote:
>> On 8/28/2015 12:45 PM, John Thurston wrote:
>>> On 6/10/2015 9:01 AM, Scot Kreienkamp wrote:
>>>> Hi everyone,
>>>>
>>>> I have a xymon server running 4.3.21 that seems to be accumulating
>>>> processes like these:
>>>>
>>>> hobbit   28430  0.0  0.0      0     0 ?        Z    12:50   0:00
>>>> [xymond_hostdata] <defunct>
>>>>
>>>> hobbit   28435  0.0  0.0      0     0 ?        Z    12:50   0:00
>>>> [xymond_hostdata] <defunct>
>>>>
>>>> hobbit   28440  0.0  0.0      0     0 ?        Z    12:50   0:00
>>>> [xymond_hostdata] <defunct>
>>>>
>>>> hobbit   28444  0.0  0.0      0     0 ?        Z    12:50   0:00
>>>> [xymond_hostdata] <defunct>
>>>>
>>>> hobbit   28449  0.0  0.0      0     0 ?        Z    12:50   0:00
>>>> [xymond_hostdata] <defunct>
>>>>
>>>> hobbit   28452  0.0  0.0      0     0 ?        Z    12:50   0:00
>>>> [xymond_hostdata] <defunct>
>>>>
>>>> It seemed related to drop messages . . .
>>> Hey, I think I'm seeing the same thing on Solaris with 4.3.21
>>>
>>> I've ended up here after a customer let me know that email alerts were
>>> not working as expected. After a few hours of digging around, I decided
>>> that the alert daemon was failing to retrieve hostnames and failing
>>> miserably.
>>>
>>> Have other people seen this behavior?
>> I have duplicated this behavior on another xymon server on Solaris. It
>> certainly looks like this behavior breaks the alert daemon. Fortunately,
>> I "drop" hosts in batches so can restart Xymon at that time, but this is
>> still pretty icky.
>>
>> J.C., do you know if your patch made it into the code-base?
>>
>> Has anyone else tested this patch? If so, on what operating systems?
>>
>> --
> 
> 
> I thought this had sounded familiar.
> 
> The patch from
> http://lists.xymon.com/pipermail/xymon/2015-June/041833.html was checked
> in in https://sourceforge.net/p/xymon/code/7669/ , however it's not in the
> most recent Terabithia RPM.
> 
> If you could test the direct patch (for hostdata, at
> http://lists.xymon.com/pipermail/xymon/attachments/20150610/8b425efb/attachment.obj
> ) on your OS, that would be very helpful. Signal handling is always a bit
> tricky to ensure is correct across the board.
> 
> 
> Regards,
> 
> -jc

Problem reproduced here on Solaris 10, and fixed by the suggested patch.
-- 
Andy


