[hobbit] Graphs stop update 24 hours after client reboot; start again 24 hours later - quick fix
Brand, Thomas R.
TRBrand at cvs.com
Thu Oct 1 17:39:54 CEST 2009
> -----Original Message-----
> From: Patrik Nilsson [mailto:patrik at jalbum.net]
> Sent: Wednesday, September 30, 2009 9:51 AM
> To: hobbit at hswn.dk
> Subject: Re: [hobbit] Graphs stop update 24 hours after client reboot;
> start again 24 hours later.
>
> Returning to this old thread as I ran into this issue today.
>
> Wed, 28 Jan 2009 12:23:17 +0000 (UTC), Henrik wrote:
> >"Brand, Thomas R." <TRBrand (at) cvs.com> writes:
> >>I need some help/suggestions to figure out why my "cpu load" and "users
> >>& processes" graphs stop updating about 24 hours after the systems
> >>reboot. The updates stop for anywhere from 12 to 24 hours, then simply
> >>start back up again.
> >>Only the "CPU load" and the "Users and Processes" graphs are having the
> >>problem; disk, memory, cpu utilization, network traffic don't miss a
> >>beat.
>
> >The only explanation I can come up with is that the format of
> >some of the "cpu" status message is different for the first 24 hours
> >after a reboot.
> >Could you send me an example of the cpu status shortly after a reboot,
> >and one when the graphs are working?
> >What OS are these boxes ?
>
> Running openSUSE 11.1 (x86_64).
>
> Client output that does not update the rrd:
>
> [top]
> top - 14:39:44 up 1 day, 4:40, 3 users, load average: 2.42, 2.88, 2.89
>
> Client output that does update the rrd:
>
> [top]
> top - 14:42:51 up 40 days, 2:41, 3 users, load average: 4.19, 3.61, 3.10
>
> The only difference I can see is "day" instead of "days".
>
> Regards,
>
> Patrik
>
Based on Patrik's observation, I tried a few more things and found that
'top' does not appear to be the problem; however, on SuSE Linux 10.x
'uptime' also uses 'day' vs. 'days' and it is this value that causes the
la.rrd graphs to lose the info.
As a quick fix, I have modified hobbitclient-linux.sh on my SuSE Linux 10.x
systems as follows:

    echo "[uptime]"
    # Rewrite a singular "day" to "days" before the output is reported
    uptime | perl -pe 's/^(.*) day (.*)/$1 days $2/'

The graphs updated on the next polling interval and started displaying
the missing information.
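
For anyone without perl on the client, here is a minimal sketch of the same
idea using sed. The function name normalize_uptime is made up for this
example and is not part of the Hobbit client. Unlike the perl one-liner, it
also accepts a comma directly after "day", as in the 'top' lines quoted
above; that is an assumption about how uptime formats its output on a given
system, so check it against your own.

```shell
#!/bin/sh
# Hypothetical helper, not part of the Hobbit/Xymon client scripts:
# rewrite a singular "day" in uptime-style output to "days".
normalize_uptime() {
    # Match " day" only when a comma or a space follows it, so that an
    # existing "days" is left untouched. Only the first match per line
    # is rewritten.
    sed -e 's/ day\([, ]\)/ days\1/'
}

# Sample lines modelled on the client output quoted in this thread.
echo " 14:39:44 up 1 day, 4:40, 3 users, load average: 2.42, 2.88, 2.89" | normalize_uptime
echo " 14:42:51 up 40 days, 2:41, 3 users, load average: 4.19, 3.61, 3.10" | normalize_uptime
```

In the client script this would be used as "uptime | normalize_uptime" in
place of the perl pipe. The first sample line comes out with "1 days," and
the second passes through unchanged.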
Thanks for pointing out the way Patrik :)
Now if someone can figure out where and what needs to be updated in the
source code -- that's a bit beyond my skills...
Cheers
Tom Brand