Re: [hobbit] hobbitlaunch segfault / timewarp happend again
- To: "Dugan, Darin D [EIT]" <hobbit (at) hswn.dk>
- Subject: Re: [hobbit] hobbitlaunch segfault / timewarp happend again
- From: Alexander Keller <hobbit (at) alexkeller.de>
- Date: Sun, 13 Apr 2008 20:51:55 +0200
- References: <17055.AVAMW1tEFgM=.1204884387.squirrel (at) webmailer.hosteurope.de> <371284689.20080321162149 (at) alexkeller.de> <ADFE09B14D3F3A408FF655E276E486FC0C8D65C3 (at) buster.exnet.iastate.edu>
Hi,
looks great. I applied your patch on a test system. So far it works
perfectly for me.
It would be great if Henrik could apply your patch.
Thanks!
Alexander
> I recently brought up a new client that has trouble keeping accurate
> time...so I began encountering this time warp issue. As pointed out by
> Henrik in January, it is definitely an infinite loop where errprintf()
> calls getcurrenttime() to get its timestamp.
> The attached patch modifies the functions in errormsg.c to use time()
> instead of getcurrenttime(). That avoids any recursion-infinite loop
> problems, and logs or prints errors with the system's actual time
> instead of a Hobbit-adjusted-for-sanity time. In the absence of accurate
> time, I think it would be best to log in the system's time so that you
> can correlate Hobbit logs with other system logs.
> Working for me so far, but use at your own risk. Comments?
> Cheers.
> Darin Dugan
> dddugan (at) iastate.edu
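The idea in the patch described above can be sketched as follows. This is a minimal illustration, not the patch itself: `format_error()` is a hypothetical stand-in for the functions in errormsg.c, and the timestamp format is an assumption. The only point it makes is that the logger reads `time()` directly, so it cannot call back into `getcurrenttime()`.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical stand-in for an errprintf()-style logger; not Hobbit's code.
 * Formats "<timestamp> <message>" into buf. The timestamp comes straight
 * from time(), so this function never re-enters a clock wrapper. */
int format_error(char *buf, size_t buflen, const char *fmt, ...)
{
    char stamp[32];
    time_t now = time(NULL);              /* raw system clock, unadjusted */
    struct tm *tm = localtime(&now);
    va_list args;
    size_t used;
    int n;

    assert(buf != NULL && buflen > 0);

    /* Fall back to a fixed marker if the time cannot be formatted. */
    if (tm == NULL ||
        strftime(stamp, sizeof(stamp), "%Y-%m-%d %H:%M:%S", tm) == 0) {
        strcpy(stamp, "unknown-time");
    }

    n = snprintf(buf, buflen, "%s ", stamp);
    if (n < 0) return -1;
    used = ((size_t)n < buflen) ? (size_t)n : buflen - 1;

    /* Append the caller's message after the timestamp. */
    va_start(args, fmt);
    n = vsnprintf(buf + used, buflen - used, fmt, args);
    va_end(args);
    return (n < 0) ? -1 : 0;
}
```

Because the timestamp is the system's own time, a line produced this way lines up with syslog and other system logs even while the adjusted clock is compensating for a warp.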
> -----Original Message-----
> From: Alexander Keller [mailto:hobbit (at) alexkeller.de]
> Sent: Friday, March 21, 2008 10:22 AM
> To: hobbit (at) hswn.dk
> Subject: Re: [hobbit] hobbitlaunch segfault / timewarp happend again
> Hi,
> unfortunately nobody answered my posting, so I did a quick-and-dirty
> hack to prevent timewarp segfaults in hobbitlaunch.
> Just comment out the errprintf-statement in lib/timefunc.c:
> if (timewarphappened) {
> /*
> * Tell the world about it.
> * Must do this AFTER changing timewarp and lastresult,
> * or we will start an endless loop triggering a stack
> * overflow because errprintf() calls getcurrenttime().
> */
> /*
> * **** prevent segfault: do not log time warp. ****
> * errprintf("Time warp detected: Adjusting returned clock by %d seconds\n", timewarp);
> */
> }
> This is not a real solution, but it works for me. Maybe there is
> somebody out there who can fix this issue properly.
> Regards
> Alexander
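An alternative to commenting out the warning is a reentrancy guard, which keeps the log message but stops the logger from triggering a second report while the first is still in progress. The sketch below is not the code in lib/timefunc.c: `monotonic_from()`, `log_warning()`, and `warp_reports` are hypothetical names, and the raw clock reading is passed in as a parameter (real code would call `time(NULL)`) so the example can be driven without changing the system clock.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* All names are hypothetical; this is an illustration, not Hobbit's code. */
int warp_reports = 0;                     /* number of warnings logged */

time_t monotonic_from(time_t raw);

/* Logger that asks the adjusted clock for its timestamp, the way
 * errprintf() asks getcurrenttime(). This call re-enters monotonic_from(). */
static void log_warning(time_t raw, long seconds)
{
    time_t stamp;

    assert(seconds > 0);
    stamp = monotonic_from(raw);
    fprintf(stderr, "%ld Time warp detected: adjusting returned clock by %ld seconds\n",
            (long)stamp, seconds);
    warp_reports++;
}

/* Maps a raw clock reading to a time that never runs backwards. */
time_t monotonic_from(time_t raw)
{
    static time_t lastresult = 0;
    static time_t timewarp = 0;
    static int reporting = 0;             /* reentrancy guard */
    time_t result = raw + timewarp;

    if (result < lastresult) {
        time_t jump = lastresult - result;

        /* Update the state first so a nested call sees a consistent clock. */
        timewarp += jump;
        result = lastresult;

        /* Log only if we are not already inside a warp report. */
        if (!reporting) {
            reporting = 1;
            log_warning(raw, (long)jump);
            reporting = 0;
        }
    }
    lastresult = result;
    return result;
}
```

Feeding it 1000 and then 880 (a clock stepped back by two minutes, as in the ntpdate reproduction) returns 1000 both times and logs a single warning. The guard matters when the clock steps again while the warning is being written: the nested call still corrects the time, but it does not start another report.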
>> Hi Henrik,
>> in January I reported a segfault with hobbitlaunch/timefunc.c. You quickly
>> provided a patch...
>> Now I'm having a new error - see core dump:
>> /opt/hobbit/client# gdb bin/hobbitlaunch core
>> GNU gdb 6.4-debian
>> Copyright 2005 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and you are
>> welcome to change it and/or distribute copies of it under certain conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB. Type "show warranty" for details.
>> This GDB was configured as "i486-linux-gnu"...Using host libthread_db
>> library "/lib/tls/i686/cmov/libthread_db.so.1".
>> Core was generated by `./bin/hobbitlaunch --config=./etc/clientlaunch.cfg
>> --log=./logs/clientlaunch.lo'.
>> Program terminated with signal 11, Segmentation fault.
>> warning: Can't read pathname for load map: Input/output error.
>> Reading symbols from /usr/lib/libz.so.1...done.
>> Loaded symbols for /usr/lib/libz.so.1
>> Reading symbols from /lib/tls/i686/cmov/libc.so.6...done.
>> Loaded symbols for /lib/tls/i686/cmov/libc.so.6
>> Reading symbols from /lib/ld-linux.so.2...done.
>> Loaded symbols for /lib/ld-linux.so.2
>> #0 errprintf (fmt=0x6b <Address 0x6b out of bounds>) at errormsg.c:42
>> 42 time_t now = getcurrenttime(NULL);
>> (gdb) bt
>> #0 errprintf (fmt=0x6b <Address 0x6b out of bounds>) at errormsg.c:42
>> #1 0x0804f125 in getcurrenttime (retparm=0x0) at timefunc.c:73
>> #2 0x0804b9e0 in errprintf (fmt=0x6b <Address 0x6b out of bounds>) at
>> errormsg.c:42
>> #3 0x0804f125 in getcurrenttime (retparm=0x0) at timefunc.c:73
>> #4 0x0804b9e0 in errprintf (fmt=0x6b <Address 0x6b out of bounds>) at
>> errormsg.c:42
>> #5 0x0804f125 in getcurrenttime (retparm=0x0) at timefunc.c:73
>> #6 0x0804b9e0 in errprintf (fmt=0x6b <Address 0x6b out of bounds>) at
>> errormsg.c:42
>> #7 0x0804f125 in getcurrenttime (retparm=0x0) at timefunc.c:73
>> #8 0x0804b9e0 in errprintf (fmt=0x6b <Address 0x6b out of bounds>) at
>> errormsg.c:42
>> #9 0x0804f125 in getcurrenttime (retparm=0x0) at timefunc.c:73
>> #10 0x0804b9e0 in errprintf (fmt=0x6b <Address 0x6b out of bounds>) at
>> errormsg.c:42
>> [...]
>> I can reproduce the error with "ntpdate" using a misconfigured ntp server
>> (2 min in the past):
>> 1. start hobbit client "runclient.sh start"
>> 2. sync time with "ntpdate <misconfigured-time-server>"
>> 3. get a core dump
>> Regards,
>> Alexander