Re: [hobbit] hobbitd_channel still crashing everyday

To: hobbit (at) hswn.dk
Subject: Re: [hobbit] hobbitd_channel still crashing everyday
From: Henrik Stoerner <henrik (at) hswn.dk>
Date: Thu, 25 Oct 2007 22:42:28 +0200
References: <1d23acab0710231118u74cf98eer1df154dd303ff690 (at) mail.gmail.com> <20071023200234.GD16672 (at) hswn.dk> <BAY138-W11B4D91D708229B6FDDFF59F940 (at) phx.gbl> <961092e10710240715n33d04591geeb37fa388645090 (at) mail.gmail.com> <001d01c8164f$49af93d0$04011818 (at) rr.com> <20071025081623.GA26092 (at) hswn.dk> <001801c8171d$a0de1ae0$04011818 (at) rr.com>
User-agent: Mutt/1.5.15+20070412 (2007-04-11)

On Thu, Oct 25, 2007 at 11:42:19AM -0400, Sean R. Clark wrote:
> Ahh you are correct, my binary + source did not match
> 
> Here is the stack trace from the (correct) binary (it's still crashing)
>   ---- called from signal handler with signal 11 (SIGSEGV) ------
>   [9] main(argc = 4, argv = 0x8046b28), line 678 in "hobbitd_channel.c"

Thanks. The line number isn't quite right, but I think this patch should
fix it. However, the crash should only happen if the worker process
(hobbitd_alert, hobbitd_rrd, hobbitd_history) cannot keep up with the
flow of incoming messages, so there might be a different problem with
your setup that triggers this. That would also explain why you see it
regularly and others do not.

Anyway, let me know if this patch stops it from crashing.

Regards,
Henrik

--- hobbitd/hobbitd_channel.c	2007/09/11 21:20:54	1.60
+++ hobbitd/hobbitd_channel.c	2007/10/25 20:37:46
@@ -648,6 +648,8 @@
 		for (handle = rbtBegin(peers); (handle != rbtEnd(peers)); handle = rbtNext(peers, handle)) {
 			int canwrite = 1, hasfailed = 0;
 			hobbit_peer_t *pwalk;
+			time_t msgtimeout = now - MSGTIMEOUT;
+			int flushcount = 0;
 
 			pwalk = (hobbit_peer_t *) gettreeitem(peers, handle);
 			if (pwalk->msghead == NULL) continue; /* Ignore peers with nothing queued */
@@ -668,18 +670,14 @@
 			}
 
 			/* See if we have stale messages queued */
-			if ((pwalk->msghead->tstamp + MSGTIMEOUT) < now) {
-				/* Stale message at head of queue, flush all that are stale */
-				time_t msgtimeout = now - MSGTIMEOUT;
-				int count = 0;
-
-				while (pwalk->msghead->tstamp < msgtimeout) {
-					flushmessage(pwalk);
-					count++;
-				}
+			while (pwalk->msghead && (pwalk->msghead->tstamp < msgtimeout)) {
+				flushmessage(pwalk);
+				flushcount++;
+			}
 
+			if (flushcount) {
 				errprintf("Flushed %d stale messages for %s:%d\n",
-					  count,
+					  flushcount,
 				  	  inet_ntoa(pwalk->peeraddr.sin_addr), 
 					  ntohs(pwalk->peeraddr.sin_port));
 			}
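
For readers following the patch, here is a minimal standalone sketch of the
guarded flush loop it introduces. It is not the hobbitd_channel.c code: the
names msg_t, peer_t, enqueue, flush_head and flush_stale and the MSGTIMEOUT
value are hypothetical, and it assumes that removing the last queued message
leaves the head pointer NULL, which the diff does not show. Under that
assumption, a loop that tests only the timestamp would dereference NULL once
every queued message is stale; checking the head pointer first avoids this.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

#define MSGTIMEOUT 30            /* hypothetical timeout, in seconds */

typedef struct msg_t {           /* hypothetical queue node */
    time_t tstamp;               /* time the message was queued */
    struct msg_t *next;
} msg_t;

typedef struct peer_t {          /* hypothetical per-peer queue */
    msg_t *msghead;
    msg_t *msgtail;
} peer_t;

/* Append a message; returns 0 on success, -1 if allocation fails. */
int enqueue(peer_t *p, time_t tstamp)
{
    msg_t *m = malloc(sizeof *m);
    if (m == NULL) return -1;
    m->tstamp = tstamp;
    m->next = NULL;
    if (p->msgtail) p->msgtail->next = m; else p->msghead = m;
    p->msgtail = m;
    return 0;
}

/* Drop the head message; msghead becomes NULL when the queue empties. */
void flush_head(peer_t *p)
{
    msg_t *m = p->msghead;
    if (m == NULL) return;
    p->msghead = m->next;
    if (p->msghead == NULL) p->msgtail = NULL;
    free(m);
}

/* Flush every message older than now - MSGTIMEOUT; returns the count. */
int flush_stale(peer_t *p, time_t now)
{
    time_t msgtimeout = now - MSGTIMEOUT;
    int flushcount = 0;

    assert(p != NULL);
    /* Test the head pointer before reading its timestamp, so the loop
       stops cleanly when the last stale message has been removed. */
    while (p->msghead && (p->msghead->tstamp < msgtimeout)) {
        flush_head(p);
        flushcount++;
    }
    return flushcount;
}
```

Calling flush_stale() on a queue where every message is stale empties the
queue and returns the number flushed; calling it again on the empty queue
returns 0.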
