[hobbit] hobbitd_larrd is crashing

Larry Barber lebarber at gmail.com
Sat Jun 10 00:14:10 CEST 2006


After applying the second patch, it is still crashing. Stack trace:

(gdb) backtrace
#0  0x00d8960a in do_lookup_versioned () from /lib/ld-linux.so.2
#1  0x00d88776 in _dl_lookup_versioned_symbol_internal () from /lib/ld-linux.so.2
#2  0x00d8c473 in fixup () from /lib/ld-linux.so.2
#3  0x00d8c330 in _dl_runtime_resolve () from /lib/ld-linux.so.2
#4  0x08054c89 in sigsegv_handler (signum=11) at sig.c:51
#5  <signal handler called>
#6  0x0039078b in strlen () from /lib/tls/libc.so.6
#7  0x0035e621 in vfprintf () from /lib/tls/libc.so.6
#8  0x0037fd24 in vsnprintf () from /lib/tls/libc.so.6
#9  0x08050fb3 in errprintf (fmt=0x8057cd8 "RRD error creating %s: %s\n") at errormsg.c:51
#10 0x0804a93a in create_and_update_rrd (hostname=0x7 <Address 0x7 out of bounds>,
    fn=0x805f6e0 "tcp.http.https:,,pws.sc.egov.usda.gov,siteminderagent,dmsforms,login_banner.fcc?TYPE=33554433&REALMOID=06-d3b2e2ae-78ac-495d-a153-09f36b6aa237&GUID=&SMAUTHREASON=0&METHOD=GET&SMAGENTNAME=$SM$2z10ILc8e"...,
    creparams=0x805e5c0, template=0x9098e68 "sec") at do_rrd.c:145
#11 0x0804f2af in do_net_rrd (hostname=0xb7560037 "FS_PVHOST", testname=0xb7560041 "http",
    msg=0xb756006f "status FS_PVHOST.http green Fri Jun  9 17:11:00 2006: OK ; OK ; OK\n\n&green http://poc.fs.usda.gov/wps/portal - OK\n\nHTTP/1.1 200 OK\r\nDate: Fri, 09 Jun 2006 22:11:57 GMT\r\nServer: IBM_HTTP_Server/2.0.47."..., tstamp=1149891084) at rrd/do_net.c:50
#12 0x08050266 in update_rrd (hostname=0xb7560037 "FS_PVHOST", testname=0xb7560041 "http",
    msg=0xb756006f "status FS_PVHOST.http green Fri Jun  9 17:11:00 2006: OK ; OK ; OK\n\n&green http://poc.fs.usda.gov/wps/portal - OK\n\nHTTP/1.1 200 OK\r\nDate: Fri, 09 Jun 2006 22:11:57 GMT\r\nServer: IBM_HTTP_Server/2.0.47."..., tstamp=1149891084, sender=0x706a4266 <Address 0x706a4266 out of bounds>, ldef=0x706a4266) at do_rrd.c:293
#13 0x08049cf0 in main (argc=1886012006, argv=0xbfffbab4) at hobbitd_rrd.c:199


BTW, those ultra-long URLs have been in there for quite a while, several
months anyway.

Thanks,
Larry Barber


On 6/9/06, Larry Barber <lebarber at gmail.com> wrote:
>
> No joy, it is still crashing, stack trace:
>
> (gdb)
> #0  0x0046260a in do_lookup_versioned () from /lib/ld-linux.so.2
> #1  0x00461776 in _dl_lookup_versioned_symbol_internal () from /lib/ld-linux.so.2
> #2  0x00465473 in fixup () from /lib/ld-linux.so.2
> #3  0x00465330 in _dl_runtime_resolve () from /lib/ld-linux.so.2
> #4  0x08054c79 in sigsegv_handler (signum=11) at sig.c:51
> #5  <signal handler called>
> #6  0x004623da in do_lookup () from /lib/ld-linux.so.2
> #7  0x00461103 in _dl_lookup_symbol_internal () from /lib/ld-linux.so.2
> #8  0x0046540f in fixup () from /lib/ld-linux.so.2
> #9  0x00465330 in _dl_runtime_resolve () from /lib/ld-linux.so.2
> #10 0x0804a92b in create_and_update_rrd (hostname=0x7 <Address 0x7 out of bounds>,
>     fn=0x805f6e0 "tcp.http.https:,,pws.tc.sc.egov.usda.gov,siteminderagent,dmsforms,login_banner.fcc?TYPE=33554433&REALMOID=06-d38f4375-a8bd-4190-b6f9-3c77f0901647&GUID=&SMAUTHREASON=0&METHOD=GET&SMAGENTNAME=$SM$hIspF3"...,
>     creparams=0x805e5c0, template=0x93f7b20 "sec") at do_rrd.c:145
> #11 0x0804f2a0 in do_net_rrd (hostname=0xb755f036 "stellent_pre-prod_v-ip", testname=0xb755f04d "http",
>     msg=0xb755f07b "status stellent_pre-prod_v-ip.http green Fri Jun  9 16:53:40 2006: OK ; OK\n\n&green https://pws.tc.sc.egov.usda.gov/siteminderagent/dmsforms/login_banner.fcc?TYPE=33554433&REALMOID=06-d38f4375-a8bd-419"..., tstamp=1149890052) at rrd/do_net.c:48
> #12 0x08050256 in update_rrd (hostname=0xb755f036 "stellent_pre-prod_v-ip", testname=0xb755f04d "http",
>     msg=0xb755f07b "status stellent_pre-prod_v-ip.http green Fri Jun  9 16:53:40 2006: OK ; OK\n\n&green https://pws.tc.sc.egov.usda.gov/siteminderagent/dmsforms/login_banner.fcc?TYPE=33554433&REALMOID=06-d38f4375-a8bd-419"..., tstamp=1149890052, sender=0x1ca3f <Address 0x1ca3f out of bounds>, ldef=0x1ca3f) at do_rrd.c:293
> #13 0x08049cf0 in main (argc=117311, argv=0xbfffab14) at hobbitd_rrd.c:199
>
>
> I was looking at your patch, and it doesn't look to me like the new lines
> are doing the same thing as the old ones:
>
> -	strcat(filedir, "/"); strcat(filedir, fn);
> +	snprintf(filedir, sizeof(filedir)-1, "%s/%s/%s", rrddir, hostname, fn);
> +	filedir[sizeof(filedir)-1] = '\0';
>  	creparams[1] = filedir;	/* Icky */
>
>
> It looks like the original line creates something like "filedir/fn" while the new lines create something like "filedir/hostname/fn". Is this right?
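
One way to check the question above is to build both strings side by side.
The following is a minimal sketch, not Hobbit's actual code: join_old() and
join_new() are hypothetical helpers, the paths in the note below are made up,
and strncat() stands in for the original strcat() only so that the sketch
cannot overrun its buffer. The diff excerpt does not show what filedir held
before the strcat() calls, so whether the two forms agree depends on that.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Old style: append "/" and fn to whatever filedir already holds. */
void join_old(char *filedir, size_t size, const char *fn)
{
    assert(filedir != NULL && fn != NULL && strlen(filedir) < size);
    /* strncat's count is the maximum number of characters to append,
       not counting the terminating NUL it always writes. */
    strncat(filedir, "/", size - strlen(filedir) - 1);
    strncat(filedir, fn, size - strlen(filedir) - 1);
}

/* New style: rebuild the whole path from its three parts. */
void join_new(char *filedir, size_t size,
              const char *rrddir, const char *hostname, const char *fn)
{
    assert(filedir != NULL && size > 0);
    assert(rrddir != NULL && hostname != NULL && fn != NULL);
    /* snprintf writes at most size bytes including the NUL. */
    snprintf(filedir, size, "%s/%s/%s", rrddir, hostname, fn);
}
```

If filedir already holds "/var/rrd/myhost" before the append, join_old() with
fn "tcp.http.rrd" and join_new() with "/var/rrd", "myhost", "tcp.http.rrd"
both produce "/var/rrd/myhost/tcp.http.rrd". Under that assumption the patch
changes only the bounds handling, not the resulting path.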
>
> Thanks,
> Larry Barber
>
>
>
> On 6/9/06, Henrik Stoerner <henrik at hswn.dk> wrote:
>
> > On Fri, Jun 09, 2006 at 04:21:56PM -0500, Larry Barber wrote:
> > I loaded p1, and hobbitd_rrd is still dumping, the stack trace looks like:
> >
> > #5  <signal handler called>
> > #6  0x00dfe3da in do_lookup () from /lib/ld-linux.so.2
> > #7  0x00dfd103 in _dl_lookup_symbol_internal () from /lib/ld-linux.so.2
> > #8  0x00e0140f in fixup () from /lib/ld-linux.so.2
> > #9  0x00e01330 in _dl_runtime_resolve () from /lib/ld-linux.so.2
> > #10 0x0804a91f in create_and_update_rrd (hostname=0xb755d037 "stellent_pre-prod_v-ip",
> >    fn=0x805f6e0 "tcp.http.https:,,pws.tc.sc.egov.usda.gov,siteminderagent,dmsforms,login_banner.fcc?TYPE=33554433&REALMOID=06-d38f4375-a8bd-4190-b6f9-3c77f0901647&GUID=&SMAUTHREASON=0&METHOD=GET&SMAGENTNAME=$SM$hIspF3"...,
> >    creparams=0x805e5c0, template=0x9cf6b20 "sec") at do_rrd.c:143
>
> OK, the call trace looks sane so I think we can rule out simple memory
> corruption here.
>
> The crash happens when trying to print an error message from the RRDtool
> library, while creating a new RRD file for tracking an http test
> response time. Hobbit has just called the rrd_create() function, which
> returned an error, and it crashes while trying to print out that error
> message.
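
A crash in strlen() underneath vfprintf() is what one would expect when a
"%s" argument does not point at a valid string. The sketch below shows one
defensive pattern, under assumptions: safe_str() and format_error() are
hypothetical names, not part of Hobbit or RRDtool, and format_error() only
stands in for an errprintf-style logger. The guard catches a NULL pointer
only; a dangling or corrupted pointer would still crash.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Return s, or a visible placeholder when s is NULL. */
const char *safe_str(const char *s)
{
    return s ? s : "(null)";
}

/* Hypothetical stand-in for an errprintf-style logger: formats the
   message into buf and returns what vsnprintf returns. */
int format_error(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(buf != NULL && size > 0 && fmt != NULL);
    va_start(ap, fmt);
    n = vsnprintf(buf, size, fmt, ap);   /* never writes past size bytes */
    va_end(ap);
    return n;
}
```

A call would then look like format_error(msg, sizeof(msg), "RRD error
creating %s: %s\n", safe_str(path), safe_str(errtext)), where path and
errtext are whatever the caller has at hand.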
>
> The filename looks somewhat suspicious. It is generated from the URL
> that is tested, and it is a very long filename beginning with
> "tcp.http.https:,,pws.tc.sc.egov.usda.gov,siteminderagent,dmsforms,login_banner.fcc?TYPE=".
> It's an http test for the host "stellent_pre-prod_v-ip".
>
> My guess is that this filename is just too long. It *could* overflow the
> buffer set aside for the RRD filename - in that case, the attached patch
> against 4.1.2p1 should help.
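
There are two separate length limits a filename like this can run into: the
size of the buffer the path is built in, and the per-component limit of the
filesystem (NAME_MAX, 255 bytes on common Linux filesystems). A minimal
sketch of checking both follows. build_path() and filename_fits() are
hypothetical helpers, and the fallback value for NAME_MAX is an assumption
for systems where <limits.h> does not define it.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

#ifndef NAME_MAX
#define NAME_MAX 255   /* assumed fallback; typical on Linux filesystems */
#endif

/* Return 1 if fn fits in a single path component, 0 otherwise. */
int filename_fits(const char *fn)
{
    assert(fn != NULL);
    return strlen(fn) <= (size_t)NAME_MAX;
}

/* Build "dir/fn" into out. Return 0 on success, -1 if the result
   would not fit (out is then truncated but still NUL-terminated). */
int build_path(char *out, size_t size, const char *dir, const char *fn)
{
    int n;

    assert(out != NULL && size > 0 && dir != NULL && fn != NULL);
    /* snprintf returns the length the full string would have had,
       so n >= size means the output was cut short. */
    n = snprintf(out, size, "%s/%s", dir, fn);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Checking the return value makes a too-long name show up as an explicit
error instead of a silently truncated path.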
>
>
> > It just started doing this today, I can't think of anything that I have done
> > that could cause it.
>
> I think you just added this http test for "stellent_pre-prod_v-ip".
>
>
> Regards,
> Henrik
>