RE: [xymon] Scheduled disable causes crash?
- To: "xymon (at) xymon.com" <xymon (at) xymon.com>
- Subject: RE: [xymon] Scheduled disable causes crash?
- From: Johan Sjöberg <johan.sjoberg (at) deltamanagement.se>
- Date: Tue, 18 Jan 2011 08:21:54 +0100
- Accept-language: sv-SE
- References: <B08F3F3D67451844A7A8A029FCC71E4C1589BEFCA3 (at) WIN01.ad.deltamanagement.se> <idii4h$ekd$1 (at) voodoo.hswn.dk>
- Thread-index: AcuVOqi00Dw9fCsJSsaREbk6UZEGoAhpaEvA
- Thread-topic: [xymon] Scheduled disable causes crash?
Hi.
I have not seen this problem since applying the patch. I can't be sure that it's fixed since it didn't happen every time, but it is looking good.
/Johan
> -----Original Message-----
> From: Henrik Størner [mailto:henrik (at) hswn.dk]
> Sent: den 6 december 2010 12:41
> To: xymon (at) xymon.com
> Subject: Re: [xymon] Scheduled disable causes crash?
>
> In <B08F3F3D67451844A7A8A029FCC71E4C1589BEFCA3 (at) WIN01.ad.deltamanagement.se>
> Johan Sjöberg <johan.sjoberg (at) deltamanagement.se> writes:
>
> >During the last month, we have had some problems with Xymon when using
> >scheduled disables (added from the web interface).
>
> >The first problem we had was on September 17th, when hobbitd crashed
> >while/after running the scheduled disable. We got the following error in
> >hobbit.log:
> >2010-09-17 05:00:00 Fatal error in select: Bad file descriptor
> >2010-09-17 05:00:00 Setup complete
>
> There is a bug lurking in the scheduled-task code, but I haven't been
> able to quite nail down where it is. I've seen the same problem that
> you have a couple of times, where a scheduled "disable" results in
> xymond (hobbitd) crashing immediately afterwards.
>
> One potential bug I did catch is fixed with the following patch:
>
> Index: xymond/xymond.c
> ===================================================================
> --- xymond/xymond.c (revision 6604)
> +++ xymond/xymond.c (working copy)
> @@ -3971,7 +3971,7 @@
>      if (msg->doingwhat == RESPONDING) {
>          shutdown(msg->sock, SHUT_RD);
>      }
> -    else {
> +    else if (msg->sock >= 0) {
>          shutdown(msg->sock, SHUT_RDWR);
>          close(msg->sock);
>          msg->sock = -1;
> @@ -5040,6 +5040,8 @@
>      swalk = swalk->next;
> 
>      memset(&task, 0, sizeof(task));
> +    task.sock = -1;
> +    task.doingwhat = NOTALK;
>      inet_aton(runtask->sender, (struct in_addr *) &task.addr.sin_addr.s_addr);
>      task.buf = task.bufp = runtask->command;
>      task.buflen = strlen(runtask->command); task.bufsz = task.buflen+1;
>
> So it would be interesting to see if this helps in your setup. This patch
> is against the current beta-3 code, but it applies to version 4.2.3 as
> well if you run patch and explicitly tell it which file to patch:
>
> patch hobbit-4.2.3/hobbitd/hobbitd.c < task.patch
>
>
> I am not sure if this fixes the problem, though. Because if this is
> what causes the crash, then it ought to happen before the log message
> that the task ran is written. Unless the bug doesn't crash the system
> right away, but only triggers some memory corruption that results in
> a later crash ...
>
>
> Regards,
> Henrik
>
>
>
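
The patch quoted above sets task.sock to -1 right after the memset() and only runs the shutdown()/close() path when sock >= 0. One way to read that (an assumption, not something stated in the thread) is that a zeroed record is left with sock == 0, which is a valid descriptor number, so cleanup code could end up closing descriptor 0. The sketch below illustrates only that reading. conn_t and the three helper functions are hypothetical names, not Xymon code; just the field name "sock" comes from the patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the connection record used by the daemon.
 * Only the field name `sock` is taken from the patch. */
typedef struct {
    int sock;
} conn_t;

/* Zero the record with memset(), as the scheduled-task code does.
 * This leaves sock == 0, which is a valid descriptor number. */
static void conn_zero(conn_t *c)
{
    assert(c != NULL);
    memset(c, 0, sizeof(*c));
}

/* Zero the record, then mark it as having no socket,
 * mirroring the "task.sock = -1;" line added by the patch. */
static void conn_init_no_socket(conn_t *c)
{
    conn_zero(c);
    c->sock = -1;
}

/* Mirrors the patched guard "else if (msg->sock >= 0)":
 * returns 1 when cleanup would shut down and close the descriptor. */
static int cleanup_would_close(const conn_t *c)
{
    assert(c != NULL);
    return c->sock >= 0;
}
```

With a record that was only zeroed, cleanup_would_close() returns 1, so the guard by itself would not protect descriptor 0. After conn_init_no_socket() it returns 0. Under this reading, the two hunks of the patch only help in combination.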