
RE: [xymon] Scheduled disable causes crash?



Hi.

I have not seen this problem since applying the patch. I can't be sure that it's fixed since it didn't happen every time, but it is looking good.

/Johan

> -----Original Message-----
> From: Henrik Størner [mailto:henrik (at) hswn.dk]
> Sent: den 6 december 2010 12:41
> To: xymon (at) xymon.com
> Subject: Re: [xymon] Scheduled disable causes crash?
> 
> In <B08F3F3D67451844A7A8A029FCC71E4C1589BEFCA3 (at) WIN01.ad.deltamanagement.se>
> Johan Sjöberg <johan.sjoberg (at) deltamanagement.se> writes:
> 
> >During the last month, we have had some problems with Xymon when using
> >scheduled disables (added from the web interface).
> 
> >The first problem we had was on September 17th, when hobbitd crashed
> >while/after running the scheduled disable. We got the following error in
> >hobbit.log:
> >2010-09-17 05:00:00 Fatal error in select: Bad file descriptor
> >2010-09-17 05:00:00 Setup complete
> 
> There is a bug lurking in the scheduled-task code, but I haven't been
> able to quite nail down where it is. I've seen the same problem that
> you have a couple of times, where a scheduled "disable" results in
> xymond (hobbitd) crashing immediately afterwards.
> 
> One potential bug I did catch is fixed with the following patch:
> 
> Index: xymond/xymond.c
> ===================================================================
> --- xymond/xymond.c	(revision 6604)
> +++ xymond/xymond.c	(working copy)
> @@ -3971,7 +3971,7 @@
>  	if (msg->doingwhat == RESPONDING) {
>  		shutdown(msg->sock, SHUT_RD);
>  	}
> -	else {
> +	else if (msg->sock >= 0) {
>  		shutdown(msg->sock, SHUT_RDWR);
>  		close(msg->sock);
>  		msg->sock = -1;
> @@ -5040,6 +5040,8 @@
>  					swalk = swalk->next;
>  
>  					memset(&task, 0, sizeof(task));
> +					task.sock = -1;
> +					task.doingwhat = NOTALK;
>  					inet_aton(runtask->sender, (struct in_addr *) &task.addr.sin_addr.s_addr);
>  					task.buf = task.bufp = runtask->command;
>  					task.buflen = strlen(runtask->command); task.bufsz = task.buflen+1;
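
The second hunk matters because memset() leaves task.sock at 0, and 0 is a
valid descriptor number rather than an "unused" marker. The sketch below
illustrates that effect under stated assumptions: fake_conn_t,
sock_after_memset() and cleanup_conn() are hypothetical stand-ins, not
xymond's real types or functions, and only the sock field is modelled.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical stand-in for xymond's connection struct (assumption:
 * only the socket field matters for this illustration). */
typedef struct {
    int sock;
} fake_conn_t;

/* Returns the value of sock right after zero-filling the struct, which
 * is what the unpatched scheduled-task code did. The result is 0, a
 * valid descriptor number. */
int sock_after_memset(void)
{
    fake_conn_t task;
    memset(&task, 0, sizeof(task));
    return task.sock;
}

/* Mirrors the guard added by the first hunk: the descriptor is only
 * shut down and closed when sock >= 0.
 * Returns 1 if a descriptor was closed, 0 if nothing was touched. */
int cleanup_conn(fake_conn_t *c)
{
    assert(c != NULL);
    if (c->sock < 0)
        return 0;
    shutdown(c->sock, SHUT_RDWR);
    close(c->sock);
    c->sock = -1;
    return 1;
}
```

With sock set to -1 after the memset(), cleanup_conn() returns 0 for a
scheduled task and leaves every real descriptor alone. Without it, the
cleanup would act on descriptor 0, whatever that happens to be in the
running daemon.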
> 
> 
> So it would be interesting to see if this helps in your setup. This patch
> is against the current beta-3 code, but it applies to version 4.2.3 as
> well if you run patch and explicitly tell it which file to patch:
> 
>    patch hobbit-4.2.3/hobbitd/hobbitd.c < task.patch
> 
> 
> I am not sure if this fixes the problem, though. If this is what
> causes the crash, then it ought to happen before the log message
> that the task ran is written, unless the bug doesn't crash the system
> right away but only triggers some memory corruption that results in
> a later crash ...
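
One way a delayed failure could arise without memory corruption: closing a
descriptor does not fail at the time of the close(), but a later select()
that still lists it returns -1 with errno set to EBADF, whose usual message
text is "Bad file descriptor". Whether xymond actually takes this path is
not established in this thread. The sketch below only demonstrates the
general select() behaviour; select_errno_after_close() is a hypothetical
helper name.

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Closes one end of a socket pair, then calls select() with the closed
 * descriptor still in the read set.
 * Returns the errno reported by select(), 0 if select() unexpectedly
 * succeeded, or -1 if the socket pair could not be created. */
int select_errno_after_close(void)
{
    int sv[2];
    fd_set readfds;
    struct timeval tv = { 0, 0 };   /* poll, do not block */

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    int maxfd = (sv[0] > sv[1]) ? sv[0] : sv[1];

    /* Simulate a descriptor being closed behind the main loop's back. */
    close(sv[0]);

    FD_ZERO(&readfds);
    FD_SET(sv[0], &readfds);        /* stale entry: already closed */
    FD_SET(sv[1], &readfds);

    int rc = select(maxfd + 1, &readfds, NULL, NULL, &tv);
    int err = (rc == -1) ? errno : 0;

    close(sv[1]);
    return err;
}
```

The close() itself succeeds silently; the error only surfaces on the next
pass through the select() loop, which would be consistent with a crash that
shows up after the task has already been logged as run.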
> 
> 
> Regards,
> Henrik
> 
> 