<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

<HTML><HEAD>

<META http-equiv=Content-Type content="text/html; charset=windows-1252">

<META content="MSHTML 6.00.2900.5626" name=GENERATOR>

<STYLE></STYLE>

</HEAD>

<BODY bgColor=#ffffff>

<DIV><FONT size=2>Hi guys,</FONT></DIV>

<DIV><FONT size=2></FONT> </DIV>

<DIV><FONT size=2>(Bug report and fix submitted here, as SourceForge doesn't 
look particularly active.)</FONT></DIV>

<DIV><FONT size=2></FONT> </DIV>

<DIV><FONT size=2>I've just started setting up a Hobbit system to monitor a load 
of Windows boxes (using BBWin), and am implementing our custom tests using the 
external script mechanism. Once I finally got my head around the Hobbit/BBWin 
interface (it's really simple to implement, just very confusing to find the 
right document to look at), the test columns and graphs displayed fine, but 
with dodgy data in the graphs.</FONT></DIV>

<DIV><FONT size=2></FONT> </DIV>

<DIV><FONT size=2>The problem is that RRD gets its updates only intermittently, 
maybe once every 15-20 minutes. I eventually realised that the cause is the 
update caching implemented in hobbitd_rrd. I'm running the snapshot version as 
of 2008/Aug/02; the bug does not apply to version 4.2.0 from 
SourceForge.</FONT></DIV>

<DIV><FONT size=2></FONT> </DIV>

<DIV><FONT size=2>The majority of the internal tests cache RRD updates using 
static data held in do_rrd.c (v1.61 2008/04/02). External scripts, though, are 
handled by forking a child of the hobbitd_rrd process in do_external.c (v1.22 
2008/03/22) - I assume to stop a misbehaving user script from snarling up the 
whole system. Once the child has collected the data from the script, it passes 
it on to RRD in the normal way. However, the forked process works on its own 
copy of the static data, so the update goes into a different cache from the one 
in the main process. Once done, the child process goes away without forcing its 
cache to empty. Following this logic, the child's cache never fills up enough to 
flush itself, and so the data don't make it to RRD (which rather raises the 
question of how I got anything in the graph at all - but that's a side 
issue).</FONT></DIV>
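<DIV><FONT size=2>The effect described above can be reproduced in isolation. The 
following is a minimal sketch, not Hobbit code: the cache, its size and all the 
function names are made up for illustration. It shows that an entry queued by a 
forked child lands only in the child's copy of a static cache, and is lost when 
the child exits without flushing.</FONT></DIV>

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define CACHE_SLOTS 8            /* hypothetical cache capacity */

static int cache[CACHE_SLOTS];   /* stand-in for the static update cache */
static int cache_used = 0;       /* entries currently waiting to be written */
static int child_seen = -1;      /* child's entry count, as reported back */

/* Queue one value in this process's copy of the cache. */
static void cache_add(int value)
{
    assert(cache_used < CACHE_SLOTS);
    cache[cache_used++] = value;
}

/*
 * Fork a child that queues one update and exits without flushing.
 * Returns the number of entries in the PARENT's cache afterwards,
 * or -1 on error.
 */
static int parent_entries_after_child_update(void)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0) return -1;
    if (pid == 0) {
        cache_add(42);           /* only the child's copy changes */
        _exit(cache_used);       /* report the child's view via exit status */
    }

    /* Retry if the wait is interrupted by a signal. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) return -1;
    }
    if (WIFEXITED(status)) child_seen = WEXITSTATUS(status);

    return cache_used;           /* the parent never sees the child's entry */
}
```

<DIV><FONT size=2>Calling parent_entries_after_child_update() returns 0 while 
child_seen ends up as 1: the child saw its queued entry, the parent never 
did.</FONT></DIV>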

<DIV><FONT size=2></FONT> </DIV>

<DIV><FONT size=2>The obvious solutions appear to be:</FONT></DIV>

<DIV><FONT size=2>1) don't fork to a child - but that would allow misbehaving 

scripts to hang the system</FONT></DIV>

<DIV><FONT size=2>2) fork, but pass the data back to the parent process once 

it's done - possible, but not a trivial fix</FONT></DIV>

<DIV><FONT size=2>3) fork as currently, but flush the cache before closing the 

child process - not particularly elegant, but simple to implement.</FONT></DIV>

<DIV><FONT size=2></FONT> </DIV>

<DIV><FONT size=2>I've implemented a fix of type 3. It's important to flush only 
what is handled by the external script handler, as the parent process will still 
have its copy of the cache as it was at the time of the fork, and will flush 
that itself in the normal course of events. There is a function in do_rrd.c that 
allows a partial flush of the cache - rrdcacheflushhost(). This flushes 
everything that matches the supplied "hostname", which can be the full path to 
the RRD archive, or a leading substring thereof. If only external scripts supply 
the test data, then no keys matching that test name will ever be held in the 
parent process cache, so this path can be used as a key to flush the cache prior 
to exiting the child. The name of the repository (RRD file) is held within the 
do_rrd module context as the static string "rrdfn", which is accessible to the 
worker functions. This is used in the following fix to generate the match 
string - it's a bit ugly but it works.</FONT></DIV>
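<DIV><FONT size=2>To illustrate the "leading substring" matching described 
above, here is a small stand-alone sketch. It is not the code from do_rrd.c: 
the entry table and the cache_put(), flush_prefix() and make_key() helpers are 
hypothetical, and "flushing" just clears a pending counter. It only shows how a 
key of the form "/hostname/rrdfn" selects one archive, while a shorter prefix 
selects everything for that host.</FONT></DIV>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_ENTRIES 16           /* hypothetical table size */

struct entry {
    char key[128];               /* path-like key, e.g. "/host/file.rrd" */
    int  pending;                /* number of cached, unwritten updates */
};

static struct entry entries[MAX_ENTRIES];
static int entry_count = 0;

/* Add a cache entry; returns 0 on success, -1 if it does not fit. */
static int cache_put(const char *key, int pending)
{
    assert(key != NULL);
    if (entry_count >= MAX_ENTRIES) return -1;
    if (strlen(key) >= sizeof entries[0].key) return -1;
    strcpy(entries[entry_count].key, key);
    entries[entry_count].pending = pending;
    entry_count++;
    return 0;
}

/*
 * "Flush" every entry with pending updates whose key starts with prefix.
 * Returns the number of entries flushed.
 */
static int flush_prefix(const char *prefix)
{
    size_t n = strlen(prefix);
    int flushed = 0;

    for (int i = 0; i < entry_count; i++) {
        if (entries[i].pending > 0 &&
            strncmp(entries[i].key, prefix, n) == 0) {
            entries[i].pending = 0;   /* a real cache would write to RRD here */
            flushed++;
        }
    }
    return flushed;
}

/* Build "/hostname/rrdfn" into buf; returns 0 on success, -1 if truncated. */
static int make_key(char *buf, size_t buflen,
                    const char *hostname, const char *rrdfn)
{
    int n = snprintf(buf, buflen, "/%s/%s", hostname, rrdfn);
    return (n >= 0 && (size_t)n < buflen) ? 0 : -1;
}
```

<DIV><FONT size=2>With entries for "/winbox1/custom.rrd" and "/winbox1/cpu.rrd" 
(invented names), flushing the full key for custom.rrd touches only that entry, 
whereas flushing "/winbox1/" would touch both.</FONT></DIV>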

<DIV><FONT size=2></FONT> </DIV>

<DIV><FONT size=2>So, in <FONT size=2>do_external.c,v 1.22 2008/03/22 

07:48:55</FONT></FONT></DIV>

<DIV><FONT size=2>in function do_external_rrd()</FONT></DIV>

<DIV><FONT size=2>    declare a char * variable called 

extkey </FONT></DIV>

<DIV><FONT size=2>    then after line 106 (within the R_DATA 

case) : create_and_update_rrd(hostname, testname, classname, pagepaths, params, 

NULL);</FONT></DIV>

<DIV><FONT size=2>insert</FONT></DIV>

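<DIV><FONT size=2>&nbsp;&nbsp;&nbsp; /* room for "/" + hostname + "/" + rrdfn 
+ terminating NUL */<BR>&nbsp;&nbsp;&nbsp; extkey = (char 
*)malloc(strlen(hostname) + strlen(rrdfn) + 3);<BR>&nbsp;&nbsp;&nbsp; if 
(extkey) {<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; sprintf(extkey, 
"/%s/%s", hostname, rrdfn);<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /* 
one conversion per argument */<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
dbgprintf("Forcing flush of '%s'\n", 
extkey);<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
rrdcacheflushhost(extkey);<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
xfree(extkey);<BR>&nbsp;&nbsp;&nbsp; }</FONT></DIV>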

<DIV><FONT size=2><BR>This is now working reliably for me.</FONT></DIV>

<DIV><FONT size=2></FONT> </DIV>

<DIV><FONT size=2>If the external script is used to feed additional data into 
one of the internal test repositories, this fix will fail: the internally 
generated data will be written by both the parent and the child. A workaround 
would be to make a similar rrdcacheflushhost() call prior to the fork, clearing 
any such entries from the parent so that the child writes out only the data it 
generated itself. The catch is that rrdfn hasn't been worked out by that time. 
Another alternative would be to put in a switch to temporarily disable the 
caching mechanism inside create...rrd. The simplest, though, is: just don't do 
it!</FONT></DIV>

<DIV><FONT size=2></FONT> </DIV>

<DIV><FONT size=2>As an additional observation, the rrd-status.log shows that at 

or around the termination of the child process the message pipe receives an 

EINTR completion, then loops around and restarts the message wait. I've no idea 

whether this is to be expected - although it looks a bit odd. I've not done 

much *NIX IPC development though, so I'll leave that one to the 

experts.</FONT></DIV>
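<DIV><FONT size=2>For reference, an EINTR at that point is consistent with 
standard POSIX behaviour: a blocking read() can return -1 with errno set to 
EINTR when a signal (such as the SIGCHLD sent when a child exits) is caught by a 
handler installed without SA_RESTART, and the usual response is simply to retry. 
Whether that is exactly what hobbitd_rrd does has not been checked here, so the 
following is only a sketch of the general idiom; read_retry() is a made-up 
name.</FONT></DIV>

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Read from fd, restarting the call whenever it is interrupted by a
 * signal before any data was transferred. Returns what read() returns
 * on the final attempt: bytes read, 0 at end of file, or -1 on a
 * genuine error.
 */
static ssize_t read_retry(int fd, void *buf, size_t len)
{
    ssize_t n;

    assert(buf != NULL);
    do {
        n = read(fd, buf, len);
    } while (n < 0 && errno == EINTR);   /* interrupted: loop and wait again */

    return n;
}
```

<DIV><FONT size=2>Seen from a log, a loop like this looks like "EINTR, then 
restart the wait", which matches the pattern in rrd-status.log.</FONT></DIV>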

<DIV><FONT size=2></FONT> </DIV>

<DIV><FONT size=2>Graham Nayler</FONT></DIV>

<DIV><FONT size=2><A 

href="http://www.hallmarq.net">www.hallmarq.net</A></FONT></DIV>

</BODY></HTML>