Re: [hobbit] hobbitlaunch segfault / timewarp happened again



Hi,

Looks great. I applied your patch on a test system, and so far it works
perfectly for me.

It would be great if Henrik could apply your patch.

Thanks!
 Alexander

> I recently brought up a new client that has trouble keeping accurate
> time, so I began encountering this time warp issue. As Henrik pointed
> out in January, it is definitely infinite recursion: errprintf() calls
> getcurrenttime() to get its timestamp, and getcurrenttime() calls
> errprintf() to report the time warp.

> The attached patch modifies the functions in errormsg.c to use time()
> instead of getcurrenttime(). That avoids the infinite recursion and
> logs or prints errors with the system's actual time instead of a
> Hobbit-adjusted-for-sanity time. When the clock is inaccurate, I think
> it is best to log in the system's time so that you can correlate
> Hobbit logs with other system logs.
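
A minimal sketch of the idea described above, not the attached patch itself. The name errprintf_sketch() and the FILE * parameter are hypothetical; the real errprintf() in errormsg.c does more than this. The only point is that the timestamp comes straight from time(), so the logging path can never re-enter getcurrenttime():

```c
#include <stdarg.h>
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical stand-in for errprintf(), for illustration only.
 * The timestamp is taken from time(), so nothing in this function
 * can call back into getcurrenttime().
 */
static void errprintf_sketch(FILE *out, const char *fmt, ...)
{
    char stamp[32];
    time_t now = time(NULL);          /* raw system time, no sanity adjustment */
    struct tm *tm = localtime(&now);
    va_list args;

    /* Fall back to a placeholder if the time cannot be formatted. */
    if (tm == NULL ||
        strftime(stamp, sizeof(stamp), "%Y-%m-%d %H:%M:%S", tm) == 0)
        snprintf(stamp, sizeof(stamp), "(time unknown)");

    fprintf(out, "%s ", stamp);

    va_start(args, fmt);
    vfprintf(out, fmt, args);
    va_end(args);
    fflush(out);
}
```

A call such as errprintf_sketch(stderr, "Time warp detected: Adjusting returned clock by %d seconds\n", timewarp) would then log with the unadjusted system time, which is the trade-off Darin describes.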

> Working for me so far, but use at your own risk. Comments?

> Cheers.
> Darin Dugan
> dddugan (at) iastate.edu

> -----Original Message-----
> From: Alexander Keller [mailto:hobbit (at) alexkeller.de] 
> Sent: Friday, March 21, 2008 10:22 AM
> To: hobbit (at) hswn.dk
> Subject: Re: [hobbit] hobbitlaunch segfault / timewarp happend again

> Hi,

> unfortunately nobody answered my posting, so I wrote a quick-and-dirty
> hack to prevent timewarp segfaults in hobbitlaunch.

> Just comment out the errprintf() statement in lib/timefunc.c:

>   if (timewarphappened) {
>           /*
>            * Tell the world about it.
>            * Must do this AFTER changing timewarp and lastresult,
>            * or we will start an endless loop triggering a stack
>            * overflow because errprintf() calls getcurrenttime().
>            */
>           /*
>            * **** prevent segfault: do not log time warp. ****
>            * errprintf("Time warp detected: Adjusting returned clock by %d seconds\n", timewarp);
>            */
>   }

> This is not a real solution, but it works for me. Maybe somebody out
> there can fix this issue properly.
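
One way to keep the warning without the recursion is a reentrancy guard. The following is a hypothetical model, not the code in lib/timefunc.c: sane_time(), report(), fake_clock and the other names are made up for illustration, and the clock is a plain variable so that a backward step can be simulated. It follows the ordering rule from the quoted comment (update the state first, log afterwards) and adds a flag as an extra precaution, so that a logger which asks for the time while a warp is being reported cannot trigger a second report:

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical model for illustration; none of this is Hobbit source. */
static time_t fake_clock = 0;     /* stand-in for the system clock */
static time_t last_result = 0;    /* last value handed out */
static long   warp_offset = 0;    /* seconds added to hide backward steps */
static int    in_warp_report = 0; /* reentrancy guard */
static int    warp_reports = 0;   /* number of warnings logged */

static time_t sane_time(void);

/* Stand-in for errprintf(): it asks sane_time() for its timestamp,
 * which is what makes the recursion possible in the first place. */
static void report(const char *msg, long secs)
{
    time_t now = sane_time();
    warp_reports++;
    fprintf(stderr, "%ld %s %ld seconds\n", (long)now, msg, secs);
}

static time_t sane_time(void)
{
    time_t result = fake_clock + warp_offset;

    if (result < last_result) {
        /* Clock went backwards: repair the state first ... */
        long step = (long)(last_result - result);
        warp_offset += step;
        result = fake_clock + warp_offset;
        last_result = result;

        /* ... then log, but never while a report is already running. */
        if (!in_warp_report) {
            in_warp_report = 1;
            report("Time warp detected: adjusting returned clock by", step);
            in_warp_report = 0;
        }
    } else {
        last_result = result;
    }
    return result;
}
```

In this model, stepping fake_clock back from 105 to 50 makes sane_time() keep returning 105 and logs a single warning. Whether the same guard fits the real getcurrenttime() would need checking against the actual source.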

> Regards
>  Alexander


>> Hi Henrik,

>> In January I reported a segfault with hobbitlaunch/timefunc.c. You quickly
>> provided a patch...

>> Now I'm having a new error - see core dump:

>> /opt/hobbit/client# gdb bin/hobbitlaunch core
>> GNU gdb 6.4-debian
>> Copyright 2005 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and you are
>> welcome to change it and/or distribute copies of it under certain conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB.  Type "show warranty" for details.
>> This GDB was configured as "i486-linux-gnu"...Using host libthread_db
>> library "/lib/tls/i686/cmov/libthread_db.so.1".

>> Core was generated by `./bin/hobbitlaunch --config=./etc/clientlaunch.cfg --log=./logs/clientlaunch.lo'.
>> Program terminated with signal 11, Segmentation fault.

>> warning: Can't read pathname for load map: Input/output error.
>> Reading symbols from /usr/lib/libz.so.1...done.
>> Loaded symbols for /usr/lib/libz.so.1
>> Reading symbols from /lib/tls/i686/cmov/libc.so.6...done.
>> Loaded symbols for /lib/tls/i686/cmov/libc.so.6
>> Reading symbols from /lib/ld-linux.so.2...done.
>> Loaded symbols for /lib/ld-linux.so.2
>> #0  errprintf (fmt=0x6b <Address 0x6b out of bounds>) at errormsg.c:42
>> 42              time_t now = getcurrenttime(NULL);
>> (gdb) bt
>> #0  errprintf (fmt=0x6b <Address 0x6b out of bounds>) at errormsg.c:42
>> #1  0x0804f125 in getcurrenttime (retparm=0x0) at timefunc.c:73
>> #2  0x0804b9e0 in errprintf (fmt=0x6b <Address 0x6b out of bounds>) at errormsg.c:42
>> #3  0x0804f125 in getcurrenttime (retparm=0x0) at timefunc.c:73
>> #4  0x0804b9e0 in errprintf (fmt=0x6b <Address 0x6b out of bounds>) at errormsg.c:42
>> #5  0x0804f125 in getcurrenttime (retparm=0x0) at timefunc.c:73
>> #6  0x0804b9e0 in errprintf (fmt=0x6b <Address 0x6b out of bounds>) at errormsg.c:42
>> #7  0x0804f125 in getcurrenttime (retparm=0x0) at timefunc.c:73
>> #8  0x0804b9e0 in errprintf (fmt=0x6b <Address 0x6b out of bounds>) at errormsg.c:42
>> #9  0x0804f125 in getcurrenttime (retparm=0x0) at timefunc.c:73
>> #10 0x0804b9e0 in errprintf (fmt=0x6b <Address 0x6b out of bounds>) at errormsg.c:42
>> [...]

>> I can reproduce the error with "ntpdate" using a misconfigured NTP server
>> (2 min in the past):

>> 1. start hobbit client "runclient.sh start"
>> 2. sync time with "ntpdate <misconfigured-time-server>"
>> 3. get a core dump  


>> Regards,
>>  Alexander