Re: [hobbit] Weird disk alert with bad data
- To: hobbit (at) hswn.dk
- Subject: Re: [hobbit] Weird disk alert with bad data
- From: Stef Coene <stef.coene (at) docum.org>
- Date: Tue, 28 Oct 2008 21:10:17 +0100
- References: <081028.125930.EDT.URMM (at) vm.marist.edu>
- User-agent: KMail/1.9.10
On Tuesday 28 October 2008, Martha McConaghy wrote:
> We recently got the AIX client working with our Hobbit server. I then
> had to apply a patch to rrd/do_vmstat.c to fix a problem with rrd crashing
> due to an uninitialized variable coming from the AIX client. Despite that,
> I'm still seeing a weird problem. One of the other non-AIX clients will
> have its disk check go to red alert. When I take a look at it, the
> disks are fine. However, the data being processed by rrd is off by a few
> characters which seems to be what is causing the red alert to be generated.
> It will last for an hour or so, then will go green again and the problem
> will move to a different non-AIX client. When I remove the three AIX
> clients from bb-hosts, the problem disappears. So, it seems to be pretty
> clearly related to the AIX client, though it is affecting other alerts.
>
> Any thoughts on what to do? Have we stumbled onto another bug?
What patch did you apply for the rrd?
I have lots of AIX clients talking to lots of hobbit servers and I never had a
problem with the rrds. The only patch I applied regarding vmstat adds
cpu_pc and cpu_ec and strips . and , from the numbers.
My vmstat patch:
--- ./hobbit-4.2.0/hobbitd/rrd/do_vmstat.c 2006-08-09 22:10:06.000000000 +0200
+++ ./hobbit-4.2.0-OK/hobbitd/rrd/do_vmstat.c 2007-03-13 11:40:39.000000000 +0100
@@ -76,6 +76,8 @@
{ 14, "cpu_sys" },
{ 15, "cpu_idl" },
{ 16, "cpu_wait" },
+ { 17, "cpu_pc" },
+ { 18, "cpu_ec" },
{ -1, NULL }
};
@@ -322,6 +324,17 @@
p = strchr(datapart, '\n'); if (p) *p = '\0';
p = strtok(datapart, " "); datacount = 0;
while (p && (datacount < MAX_VMSTAT_VALUES)) {
+
+ /* Removing . and , from the numbers (memmove: the ranges overlap) */
+ char *p1;
+ while ( (p1 = strchr(p,'.')) != NULL ) {
+ memmove (p1, p1+1, strlen(p1+1)+1) ;
+ }
+ char *p2;
+ while ( (p2 = strchr(p,',')) != NULL ) {
+ memmove (p2, p2+1, strlen(p2+1)+1) ;
+ }
+
values[datacount++] = atoi(p);
p = strtok(NULL, " ");
}
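
The stripping step in the second hunk can also be sketched as a standalone helper, which makes it easier to test outside hobbitd. This is only a sketch: strip_separators and parse_vmstat_token are hypothetical names, not functions in the Hobbit source. It uses memmove because the source and destination ranges overlap when a string is shifted left in place, and strcpy has undefined behaviour for overlapping ranges.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of Hobbit): delete every '.' and ','
 * from the NUL-terminated string s, in place. */
static void strip_separators(char *s)
{
    char *p = s;

    while ((p = strpbrk(p, ".,")) != NULL) {
        /* Source and destination overlap, so memmove is needed here;
         * strlen(p + 1) + 1 also copies the terminating NUL. */
        memmove(p, p + 1, strlen(p + 1) + 1);
    }
}

/* Hypothetical helper: strip the separators from one vmstat token and
 * convert it with atoi, as the patched loop does. */
static int parse_vmstat_token(char *token)
{
    strip_separators(token);
    return atoi(token);
}
```

Note that removing the decimal point changes the scale of the stored value: a token printed as 0.15 becomes 015 and is stored as 15.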
Stef