[hobbit] wildcards or regex with SPLITNCV

Mon Oct 20 21:38:09 CEST 2008

Graham Nayler wrote:
> Shawn
>
> I'm not using SPLITNCV as I wanted a bit more flexibility in the 
> format of the status report (I wanted to have comment line, single 
> colons not delimiting values etc.), but am using an external script. 
> You may find my earlier reply here
> http://www.hswn.dk/hobbiton/2008/10/msg00159.html useful though.
>
> With the external script mechanism you don't need to restart Hobbit if 
> your test generates additional indexes, only if you add new tests. I'm 
> not entirely sure whether SPLITNCV works the same although it looks OK 
> - but you sound perfectly at home with the source, so have a look at 
> that (do_ncv.c). If you're interested, I attach my parsing script to 
> the end of this - enter the test name list and change the regex for 
> your needs. The commented lines were from when I was using a single 
> RRD file for all indices, but that doesn't give the flexibility of 
> displaying multiple graphs, or adding additional indices.
>
I finally got around to looking at this.  I think I'm even more 
confused.  Not sure where you got the idea I'm comfortable with the 
source ... I've looked at the 4.3 sources trying to get rid of warnings 
and get it working, but the only thing that did was remind me just how 
many years it's been since I did any C programming.  My eyes glazed over 
a bit with your python too, I haven't invested any time in that language 
yet.

I'll start at the beginning and tell everyone what it is I'm trying to 
do and the difficulties I'm facing.  This is an attempt to monitor a 
multi-server EasyAsk search index at a remote data center, to which I 
have no back-end connectivity.  The systems have no way to reach the 
Internet.  I wouldn't have designed it that way; it came from a company 
we acquired.  I can log on to them by making an ssh connection to a gateway 
server that has a NIC in the remote LAN, and there are public-facing 
webservers that also can reach it.  We are going to move everything out 
of the data center within the next six months, so I have no plans to 
redesign the network until it is moved to headquarters.

The data is generated by a CGI script running on the public-facing 
webserver pair, which an external server-side shell script on Hobbit is 
retrieving with wget.  The CGI script queries the search broker and each 
individual index server.  It notes the total number of records held by 
each index server, adds them all up, and compares that value to the 
number of records reported by the broker.  It also records how long in 
milliseconds each query takes.  It's basically a machine-readable 
rewrite of a script that produces a pretty status page.  We can't watch 
the values on that page 24/7, so I want to graph them to watch for problems.
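
To make the data side concrete, here is a minimal sketch of the kind of
"name : value" output the CGI script could produce.  It is written in
Python purely for illustration; the function name, the server labels
and the dataset names (total, broker, diff, count_*, time_*) are all
made up for this example and are not what the real script emits.

```python
def ncv_lines(broker_count, server_counts, timings_ms):
    """Build 'name : value' lines for one report.

    server_counts and timings_ms map a hypothetical server label
    to a record count or a query time in milliseconds.
    """
    # Add up the records held by the individual index servers.
    total = sum(server_counts.values())
    lines = [
        "total : %d" % total,
        "broker : %d" % broker_count,
        # Absolute difference, so the graphed value is never negative.
        "diff : %d" % abs(total - broker_count),
    ]
    # One line per index server, in a stable (sorted) order.
    for name in sorted(server_counts):
        lines.append("count_%s : %d" % (name, server_counts[name]))
    for name in sorted(timings_ms):
        lines.append("time_%s : %d" % (name, timings_ms[name]))
    return lines


if __name__ == "__main__":
    print("\n".join(ncv_lines(300, {"idx1": 100, "idx2": 200},
                              {"broker": 7, "idx1": 12})))
```

Sorting the names keeps the line order the same from one run to the
next, which makes the output easier to compare by eye.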

I didn't go with the external script idea because the RRD docs say it 
doesn't scale well.  I was hoping that NCV or SPLITNCV would handle it 
easily.  I am leery of implementing things that don't scale well - I've 
been bitten in the past because the boss liked what he saw on something 
I'd hacked together without thought to performance and wanted to deploy 
it everywhere.

What I'd like to see is a series of graphs, the first of which should 
have the total count and the broker count, then a graph with just the 
difference.  Then I'd like to have graphs that work like the disk 
graphs, where it aggregates the individual index server counts.  Following 
that, another series of graphs that aggregate the response times.

I think it might be easier to implement and easier to read if I have four 
separate columns on the host entry, something like i_totals, i_diff, 
i_counts, and i_time.
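
A rough sketch of how the script on Hobbit could split one retrieved
report into those four columns, again in Python for illustration only.
It assumes the report is plain "name : value" lines with hypothetical
names (total, broker, diff, count_*, time_*), and it assumes the usual
"status host.test color" message form with dots in the hostname
replaced by commas.  Actually sending the messages (for example with
the bb client) is left out.

```python
def split_into_columns(report_text):
    """Sort 'name : value' lines into the four proposed columns."""
    columns = {"i_totals": [], "i_diff": [], "i_counts": [], "i_time": []}
    for line in report_text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # skip anything that is not a name/value pair
        name = name.strip()
        entry = "%s : %s" % (name, value.strip())
        if name in ("total", "broker"):
            columns["i_totals"].append(entry)
        elif name == "diff":
            columns["i_diff"].append(entry)
        elif name.startswith("count_"):
            columns["i_counts"].append(entry)
        elif name.startswith("time_"):
            columns["i_time"].append(entry)
    return columns


def status_message(host, column, lines, color="green"):
    """Format one status message; the hostname's dots become commas."""
    machine = host.replace(".", ",")
    return "status %s.%s %s\n%s\n" % (machine, column, color,
                                      "\n".join(lines))


if __name__ == "__main__":
    report = "total : 300\nbroker : 300\ndiff : 0\ncount_idx1 : 100\n"
    # search.example.com is a placeholder hostname.
    for col, lines in split_into_columns(report).items():
        print(status_message("search.example.com", col, lines))
```

Keeping each column's values in its own status message means each one
can be graphed and coloured independently of the others.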

Forgetting about scalability, are there good examples for how to 
accomplish this, or a kind soul willing to guide me through the 
process?  I can tweak the CGI script and the script on Hobbit that calls 
it in any way required.