Graham Nayler wrote:
> Shawn
>
> I'm not using SPLITNCV as I wanted a bit more flexibility in the format of the status report (I wanted to have comment line, single colons not delimiting values etc.), but am using an external script. You may find my earlier reply here useful though: http://www.hswn.dk/hobbiton/2008/10/msg00159.html
>
> With the external script mechanism you don't need to restart Hobbit if your test generates additional indexes, only if you add new tests. I'm not entirely sure whether SPLITNCV works the same although it looks OK - but you sound perfectly at home with the source, so have a look at that (do_ncv.c). If you're interested, I attach my parsing script to the end of this - enter the test name list and change the regex for your needs. The commented lines were from when I was using a single RRD file for all indices, but that doesn't give the flexibility of displaying multiple graphs, or adding additional indices.

I finally got around to looking at this. I think I'm even more confused. Not sure where you got the idea I'm comfortable with the source ... I've looked at the 4.3 sources trying to get rid of warnings and get it working, but the only thing that did was remind me just how many years it's been since I did any C programming. My eyes glazed over a bit with your python too, I haven't invested any time in that language yet.
I'll start at the beginning and tell everyone what it is I'm trying to do and the difficulties I'm facing. This is an attempt to monitor a multi-server EasyAsk search index at a remote data center, to which I have no back-end connectivity. The systems have no way to reach the Internet. I wouldn't have designed it that way; it came to us through an acquisition. I can log onto them by making an ssh connection to a gateway server that has a NIC in the remote LAN, and there are public-facing webservers that can also reach it. We are going to move everything out of the data center within the next six months, so I have no plans to redesign the network until it is moved to headquarters.
The data is generated by a CGI script running on the public-facing webserver pair, and an external server-side shell script on the Hobbit server retrieves it with wget. The CGI script queries the search broker and each individual index server. It notes the total number of records held by each index server, adds them all up, and compares that value to the number of records reported by the broker. It also records how long in milliseconds each query takes. It's basically a machine-readable rewrite of a script that produces a pretty status page. We can't watch the values on that page 24/7, so I want to graph them to watch for problems.
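To make the arithmetic above concrete, here is a minimal sketch in Python of the summing and comparing step. It is not the actual CGI script: the function name `summarize`, the field names (`total`, `broker`, `diff`, `count_*`, `ms_*`) and the "name : value" output layout are all assumptions chosen for illustration, on the understanding that NCV-style parsing wants one name/value pair per line.

```python
def summarize(broker_count, index_counts, query_ms):
    """Sum the per-index record counts and compare them with the broker.

    broker_count: number of records the broker reports (int)
    index_counts: hypothetical mapping of index server name -> record count
    query_ms:     hypothetical mapping of queried server name -> query time in ms

    Returns (total, diff, lines), where lines are "name : value" strings.
    """
    total = sum(index_counts.values())
    # Positive diff: index servers hold more records than the broker reports.
    diff = total - broker_count

    lines = [
        f"total : {total}",
        f"broker : {broker_count}",
        f"diff : {diff}",
    ]
    # One line per index server, sorted so the output order is stable.
    for name in sorted(index_counts):
        lines.append(f"count_{name} : {index_counts[name]}")
    for name in sorted(query_ms):
        lines.append(f"ms_{name} : {query_ms[name]}")
    return total, diff, lines


if __name__ == "__main__":
    # Made-up numbers, only to show the shape of the output.
    _, _, report = summarize(300, {"idx1": 100, "idx2": 200}, {"broker": 12, "idx1": 5})
    print("\n".join(report))
```

The same logic could live in the CGI script or in the shell script on the Hobbit side; the point is only that the totals and the difference are derived values, so they can be computed wherever it is most convenient.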
I didn't go with the external script idea because the RRD docs say it doesn't scale well. I was hoping that NCV or SPLITNCV would handle it easily. I am leery of implementing things that don't scale well - I've been bitten in the past because the boss liked what he saw on something I'd hacked together without thought to performance and wanted to deploy it everywhere.
What I'd like to see is a series of graphs, the first of which should have the total count and the broker count, then a graph with just the difference. Then I'd like to have graphs that work like the disk graphs, where it aggregates the individual broker counts. Following that, another series of graphs that aggregate the response times.
I think it might be easier to implement and easier to read if I have four separate columns on the host entry, something like i_totals, i_diff, i_counts, and i_time.
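As a rough illustration of the four-column idea, the sketch below splits one set of measurements into four separate status messages, one per column. Everything here is an assumption for the sake of example: the function name `build_status_messages`, the value names, the fixed color argument, and the message layout, which follows the usual "status HOST.COLUMN COLOR text" form with dots in the hostname written as commas. It only builds the strings; sending them (for example through the bb client command) is left out.

```python
def build_status_messages(host, timestamp, broker_count, index_counts, query_ms,
                          color="green"):
    """Build one status message per column: i_totals, i_diff, i_counts, i_time.

    All names are illustrative. The color is passed in unchanged; deciding
    when a column should go yellow or red is left to the caller.
    """
    total = sum(index_counts.values())
    columns = {
        "i_totals": {"total": total, "broker": broker_count},
        "i_diff": {"diff": total - broker_count},
        "i_counts": dict(index_counts),   # one value per index server
        "i_time": dict(query_ms),         # one value per queried server, in ms
    }

    # Status messages conventionally write the hostname with commas for dots.
    wire_host = host.replace(".", ",")

    messages = {}
    for column, values in columns.items():
        body = "\n".join(f"{name} : {value}" for name, value in sorted(values.items()))
        messages[column] = f"status {wire_host}.{column} {color} {timestamp}\n{body}\n"
    return messages


if __name__ == "__main__":
    msgs = build_status_messages(
        "search.example.com", "Mon Nov 3 12:00:00 2008",
        300, {"idx1": 100, "idx2": 200}, {"broker": 12, "idx1": 5, "idx2": 7},
    )
    for text in msgs.values():
        print(text)
```

With one column per group of values, each column can get its own graph definition, and the columns with a variable number of servers (i_counts, i_time) stay separate from the fixed ones. Each column would still need to be made known to the graphing side of Hobbit (the TEST2RRD setting, as far as I understand it); check the hobbitserver.cfg documentation for the exact syntax.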
Forgetting about scalability, are there good examples for how to accomplish this, or a kind soul willing to guide me through the process? I can tweak the CGI script and the script on Hobbit that calls it in any way required.