[Xymon] xymon hostdata module going rogue
Scot Kreienkamp
Scot.Kreienkamp at la-z-boy.com
Wed Jun 10 19:44:52 CEST 2015
Scot Kreienkamp | Senior Systems Engineer | La-Z-Boy Corporate
One La-Z-Boy Drive | Monroe, Michigan 48162 | Office: 734-384-6403 | | Mobile: 7349151444 | Email: Scot.Kreienkamp at la-z-boy.com
> -----Original Message-----
> From: J.C. Cleaver [mailto:cleaver at terabithia.org]
> Sent: Wednesday, June 10, 2015 1:21 PM
> To: Scot Kreienkamp
> Cc: xymon at xymon.com
> Subject: Re: xymon hostdata module going rogue
>
>
> On Wed, June 10, 2015 10:01 am, Scot Kreienkamp wrote:
> > Hi everyone,
> >
> > I have a xymon server running 4.3.21 that seems to be accumulating
> > processes like these:
> >
> > hobbit 28430 0.0 0.0 0 0 ? Z 12:50 0:00 [xymond_hostdata] <defunct>
> > hobbit 28435 0.0 0.0 0 0 ? Z 12:50 0:00 [xymond_hostdata] <defunct>
> > hobbit 28440 0.0 0.0 0 0 ? Z 12:50 0:00 [xymond_hostdata] <defunct>
> > hobbit 28444 0.0 0.0 0 0 ? Z 12:50 0:00 [xymond_hostdata] <defunct>
> > hobbit 28449 0.0 0.0 0 0 ? Z 12:50 0:00 [xymond_hostdata] <defunct>
> > hobbit 28452 0.0 0.0 0 0 ? Z 12:50 0:00 [xymond_hostdata] <defunct>
> >
> > It seemed related to drop messages, so I did a test.
> >
> >
> > [hobbit at retv6100 temp]$ xymon 127.0.0.1 "drop amds7101_na_lzb_hq" ; ps auxw |grep xymond_hostdata |wc -l
> > 161
> > [hobbit at retv6100 temp]$ xymon 127.0.0.1 "drop amds7101_na_lzb_hq" ; ps auxw |grep xymond_hostdata |wc -l
> > 162
> > [hobbit at retv6100 temp]$ xymon 127.0.0.1 "drop amds7101_na_lzb_hq" ; ps auxw |grep xymond_hostdata |wc -l
> > 163
> > [hobbit at retv6100 temp]$ xymon 127.0.0.1 "drop amds7101_na_lzb_hq" ; ps auxw |grep xymond_hostdata |wc -l
> > 164
> > [hobbit at retv6100 temp]$ xymon 127.0.0.1 "drop amds7101_na_lzb_hq" ; ps auxw |grep xymond_hostdata |wc -l
> > 165
> > [hobbit at retv6100 temp]$ xymon 127.0.0.1 "drop amds7101_na_lzb_hq" ; ps auxw |grep xymond_hostdata |wc -l
> > 166
> > [hobbit at retv6100 temp]$ xymon 127.0.0.1 "drop amds7101_na_lzb_hq" ; ps auxw |grep xymond_hostdata |wc -l
> > 167
> >
> > So every time I send a drop message I get a defunct process hanging out.
> > Bug in Xymon?
> >
> > This is on RHEL5, xymon 4.3.21.
> >
> > Thanks!
>
>
> Scot,
>
>
> Some background: When doing a full drop on a host, xymond_hostdata (and
> xymond_history, IIRC) forks to perform the recursive directory removal of
> history files and whatnot in the background, then exits out. That's why it
> corresponds to those events.
>
>
> Looks like xymond_hostdata.c is missing a SIGCHLD registration, which is
> causing the defunct processes to stack up. Strangely, I haven't observed
> this behavior on RHEL6 at all though, even though we're dropping hosts all
> the time. Odd.
>
>
> The following patch should fix the issue for you, I believe.
>
>
> Regards,
>
> -jc
> This message is intended only for the individual or entity to which it is
> addressed. It may contain privileged, confidential information which is
> exempt from disclosure under applicable laws. If you are not the intended
> recipient, you are strictly prohibited from disseminating or distributing this
> information (other than to the intended recipient) or copying this
> information. If you have received this communication in error, please notify
> us immediately by e-mail or by telephone at the above number. Thank you.
Hi JC,
Thanks, but no such luck. I deleted the entire 4.3.21 source tree and extracted it again to make sure I had pristine source, put the patch in the xymond directory, and applied it with patch -p1. It applied cleanly, so I ran configure, make, and make install. I am still getting the defunct processes, though, and I am not seeing anything in the logs.
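
For readers following along: the patch itself does not appear in this copy of the thread, so the following is only a generic sketch of the SIGCHLD handling JC describes, not the actual change to xymond_hostdata.c. It assumes a parent that forks short-lived children for cleanup work. The names reap_children, install_sigchld_reaper, spawn_background_job and reaped_children are made up for illustration and are not Xymon functions.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Illustrative counter of children collected so far (hypothetical name). */
static volatile sig_atomic_t reaped_children = 0;

/* SIGCHLD handler: collect every child that has already exited.
 * WNOHANG keeps the loop from blocking when no more children are ready. */
static void reap_children(int signo)
{
    int saved_errno = errno;   /* waitpid() may overwrite errno */
    (void)signo;

    while (waitpid(-1, NULL, WNOHANG) > 0) {
        reaped_children++;
    }
    errno = saved_errno;
}

/* Hypothetical helper: register the handler. Returns 0 on success, -1 on error. */
static int install_sigchld_reaper(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = reap_children;
    sigemptyset(&sa.sa_mask);
    /* Restart interrupted system calls; ignore stop/continue notifications. */
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Hypothetical helper: fork a child that would do the background cleanup
 * (for example the recursive directory removal) and then exit. */
static pid_t spawn_background_job(void)
{
    pid_t pid = fork();

    if (pid == 0) {
        /* Child: the cleanup work would go here. */
        _exit(0);
    }
    return pid;   /* Parent: the child's pid, or -1 if fork() failed */
}
```

Without a handler like this (or an explicit wait somewhere in the parent), each exited child stays in the process table as a defunct entry until the parent itself exits, which would match the one-zombie-per-drop pattern shown above. Whether this is the whole story on RHEL5 is not settled by this thread, since the patched build still shows the defunct processes.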