[hobbit] ext scripts & the Hobbit methodology for implementation.

s_aiello at comcast.net
Fri Jan 25 00:34:47 CET 2008


On Thursday 24 January 2008, Henrik Stoerner wrote:
> On Thu, Jan 24, 2008 at 10:24:24AM -0500, s_aiello at comcast.net wrote:
> > What I mean by BB legacy ext scripts are client-based ext scripts that
> > send reports in via the bb status command. The Hobbit method is more
> > server-centric, in that clients report raw data to the server and the
> > server processes that data, applying thresholds, formatting reports, and
> > possibly trending. Now to send raw data from the client there are two
> > choices, data or client. In the past, I believe I saw a thread where
> > Henrik mentioned that data was the preferred method for ext scripts &
> > that client was more for internal use between hobbit clients &
> > server. But the lure of the client command is too great to ignore,
> > especially with the ability to access that raw data via the clientlog
> > command.
>
> I think this is the real "killer" difference between the client- and the
> data-message types. If the client-data wasn't directly available, you
> could just as well use a data-message.
>
> > So that brings me to my first problem: when sending data via the client
> > command, the data needs to be structured in a very particular format,
> > where OSTYPE has to match a predefined Hobbit OSTYPE and a minimum set of
> > SECTIONNAMEs needs to be filled for the Hobbit server to accept the data.
> > If they are missing, the data just ends up in /dev/null.
>
> Eh - not quite, you can still get at it with the clientlog command. It
> just doesn't generate any cpu/disk/memory... statuses because
> hobbitd_client - the only program that normally listens in on the Hobbit
> "client" channel - doesn't know how to interpret the data.
>
> The only requirement for a client-message really is that the hostname
> must be listed in bb-hosts. The operating-system type (and the optional
> client "class") is used to pick out an entry from the client-local.cfg
> file which is returned to the client system, but there is no requirement
> that these must be pre-defined anywhere in Hobbit.
>

Ah you are right...  My tests earlier this morning were failing for other 
reasons.
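
To make the message layout discussed above concrete, here is a minimal Python sketch that only formats a "client" message as text. The header layout (hostname with dots written as commas, then the OS type and an optional class) and the "[section]" blocks are my assumptions about the wire format, and the host, class, and section names are made up for illustration. It does not send anything to a server.

```python
def build_client_message(hostname, ostype, sections, hostclass=None):
    """Format raw data as a Hobbit-style "client" message (sketch only).

    Assumed layout: a header line "client HOST.OSTYPE [CLASS]" followed by
    one "[section]" block per data source. Dots in the hostname are written
    as commas, as in BB-style status messages.
    """
    header = f"client {hostname.replace('.', ',')}.{ostype}"
    if hostclass:
        header += f" {hostclass}"
    lines = [header]
    for name, body in sections.items():
        lines.append(f"[{name}]")
        lines.append(body.rstrip("\n"))
    return "\n".join(lines) + "\n"


# Hypothetical device "web1" reporting one custom section.
msg = build_client_message(
    "web1.example.com",
    "linux",
    {"apache": "workers_busy 12\nworkers_idle 38"},
    hostclass="apache",
)
print(msg, end="")
```

The resulting text would then be handed to whatever normally delivers messages to the Hobbit server; per Henrik's reply, the hostname has to be listed in bb-hosts for it to be accepted.
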
> > So I was curious if in the upcoming 4.3 (or newer version), that the
> > client command could be opened up for more general usage ?
>
> I think what you really want to be able to do is to combine client-side
> data from multiple sources - e.g. you have the Hobbit client running and
> feeding the normal client data into Hobbit, and then you have a number
> of custom add-ons that generate data, and you want this data to be
> available in Hobbit as if it had been part of the "normal" client data.
> Like
>     Hobbit client sends data (uptime, vmstat, ps-listing etc.)
>     Add-on script A sends additional data (eg. JVM performance metrics)
>     Add-on script B sends additional data (eg. custom application data)
>     User views the "clientlog" info on Hobbit and sees all three
>
Well, that would be nice too, but actually where I am headed is multiple
devices reporting from a single server. Let's say serverA runs 5 apache
instances (not 5 vhosts). I would like to have 6 devices in total in Hobbit:
serverA, web1, web2, web3, web4, web5. So I am not sure that having all the
clientlog data contained under serverA would be such a good idea. That
clientlog could get rather large (I have some web servers that run 20+
apaches). So I was thinking more that each device in Hobbit would have its
own clientlog data. Clientlog data separate or combined, I am not sure; you
would be the better judge of that.

I was also going to do this for my App servers. We run multiple JVMs or Apps
on a given server. So base server stats are good, but it would be great to
have other device entries in Hobbit that are each focused on a single JVM
(just reports for that one JVM instance). Presently I have reports
that I term Monolithic. They cover every JVM or apache in one
report. Some people have issues with such large, multi-app
reports. When there is an alert, it is unclear what the cause is, and a new
issue can be lost because the report is already in an alert state from a
different JVM. By splitting the monolithic reports into one report per
instance, the page would then be more of a dashboard: JVM-A has a red on CPU,
or the DBPool for JVM-C is yellow. The user can then drill down into the
report to find out the details. Another benefit would be that I could create
Application Specific pages. On these App specific pages the application's
entire tech stack (Web, App, DB, etc.) could be displayed without any data
from a different App. The Application teams would then have a clean page and
see the health of their app at a glance.
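
A small Python sketch of the masking problem described above, under the assumption that a monolithic report simply takes the worst color of everything it covers. The instance names and colors are invented for the example.

```python
SEVERITY = {"green": 0, "yellow": 1, "red": 2}


def monolithic_color(results):
    """Color of one combined report: the worst color of any instance."""
    return max(results.values(), key=lambda color: SEVERITY[color])


def changed_devices(before, after):
    """Per-instance view: which devices changed color between two polls."""
    return {
        name: (before.get(name), color)
        for name, color in after.items()
        if before.get(name) != color
    }


# JVM-A is already red; JVM-C then goes yellow on the next poll.
before = {"JVM-A": "red", "JVM-B": "green", "JVM-C": "green"}
after = {"JVM-A": "red", "JVM-B": "green", "JVM-C": "yellow"}

# The combined report is red both times, so the new problem is hidden.
print(monolithic_color(before), monolithic_color(after))
# With one report per instance, the change on JVM-C is visible.
print(changed_devices(before, after))
```
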

And then if I needed semi-Monolithic reports, i.e. to roll up the
application's load-balanced webservers, I could just use bbcombotest to
combine the 4 web server device reports.

> It isn't difficult to implement, most of the pieces are already there.
> Some things that will need to be done in the Hobbit daemon:
> - store client-ext-FOO data separately from client-ext-BAR (to know what
>   to overwrite when an update arrives).
> - should there be a new protocol command for extensions? I think that
>   might be necessary.
> - Should there be a "client-local.cfg"-like file for extensions? It
>   could be useful.
> - for performance reasons, it would probably make sense to have some way
>   of splitting extension data off from the normal client data, and then
>   run a "client" channel worker handling each of the extension scripts.
>   That way we don't need to send all of the client data through all of
>   the add-on handlers, which will just throw it away.
>
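The first point in the list above (keeping client-ext-FOO data apart from client-ext-BAR so an update only overwrites its own part) could be sketched like this in Python. The class, method, and source names are hypothetical; this is not how hobbitd stores data, just an illustration of keying by host and source.

```python
class ClientDataStore:
    """Keep the latest raw data per (hostname, source) pair (sketch only)."""

    def __init__(self):
        self._data = {}  # (hostname, source) -> latest raw text

    def update(self, hostname, source, text):
        # A new message replaces only the data from the same source.
        self._data[(hostname, source)] = text

    def clientlog(self, hostname):
        # Combined view of everything stored for one host, ordered by source.
        parts = [
            text
            for (host, _source), text in sorted(self._data.items())
            if host == hostname
        ]
        return "\n".join(parts)


store = ClientDataStore()
store.update("serverA", "client", "uptime: 12 days")
store.update("serverA", "client-ext-FOO", "jvm heap: 1 GB")
store.update("serverA", "client-ext-BAR", "app queue: 4")
store.update("serverA", "client-ext-FOO", "jvm heap: 2 GB")  # overwrites FOO only
print(store.clientlog("serverA"))
```
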
> > Now secondly, after the data is actually accepted by the server, the
> > ability to do something with it is needed. That shouldn't be a
> > problem, either by attaching a script to a Hobbit channel or by
> > scheduling a command to run clientlog every X minutes and process the
> > data. But how does the script apply thresholds? I am working under
> > the assumption that for my custom data I would be able to add custom
> > definitions to the hobbit-clients.cfg file. But how does my script access
> > those thresholds? Since Hobbit already has all the internals to access
> > this file, I would think that a Hobbit utility would fit the need. Maybe
> > even a new BB command, i.e. bin/bb 1.2.3.4 "threshold server", where
> > there could be an optional parameter "stats=" which could be CPU, PROCS,
> > DISK, etc. Storing the custom ext thresholds in the hobbit-clients.cfg
> > file allows the cgi hobbit-confreport.sh to report on the custom
> > thresholds along with the built-in client thresholds.
>
> It's hard for me to see how difficult this would be to implement. I know
> the client configuration handling is a bit tricky in how it's
> implemented, I don't know what it takes to provide a generic interface
> to this for custom client-modules.
>
> A "BB" command to fetch the configuration? No, don't like that idea. The
> hobbitd daemon should not have to bother with configuration files it
> doesn't need for itself.
>
> Another complication is that the program using the configuration info is
> "long-lived", so it will have to reload the configuration when it
> changes. And that has some performance implications - the first attempt
> at the current client-configuration performed very poorly because the
> way it scanned all of the client configuration rules was extremely slow,
> simply because of all the pattern-matching of hostnames, testnames,
> colors, time-specifications etc. hobbitd_client now caches the list of
> relevant configuration rules for each host once it has determined what
> they are, and this gave quite a performance boost. Having 5-10 add-on
> handlers making the same mistake would kill your server.
>
> I'll have to think about this.
>
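The caching idea described above (work out once which rules apply to a host, reuse that list, and drop it when the configuration is reloaded) might look roughly like this in Python. The rule format, the pattern syntax, and the setting strings are placeholders, not the real hobbit-clients.cfg syntax or hobbitd_client's implementation.

```python
import re


class RuleCache:
    """Cache the matching rules per host; clear the cache on reload (sketch)."""

    def __init__(self, rules):
        self._rules = rules  # list of (hostname regex, setting) pairs
        self._cache = {}     # hostname -> list of matching settings
        self.scans = 0       # how many times the full rule list was scanned

    def reload(self, rules):
        # The configuration changed, so every cached result is stale.
        self._rules = rules
        self._cache.clear()

    def rules_for(self, host):
        if host not in self._cache:
            self.scans += 1  # the expensive pattern matching happens here
            self._cache[host] = [
                setting
                for pattern, setting in self._rules
                if re.fullmatch(pattern, host)
            ]
        return self._cache[host]


cache = RuleCache([(r"web\d+", "cpu_warn=5"), (r".*", "disk_warn=90")])
first = cache.rules_for("web1")
second = cache.rules_for("web1")  # served from the cache, no new scan
scans_before_reload = cache.scans
cache.reload([(r".*", "disk_warn=95")])
after_reload = cache.rules_for("web1")  # rescanned against the new rules
print(first, scans_before_reload, after_reload, cache.scans)
```

Each add-on handler that reads the configuration would need something like this, otherwise every incoming message triggers a full scan of all rules.
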

Yeah, I can only imagine the headache of trying to deliver both powerful
features and good performance. I have barely poked around the code with my
fledgling C experience. So I hope I do not come off as asking for features X,
Y, Z without appreciating the complexity, or without jumping into the code
and trying to solve/implement it myself. I was just thinking through what
features would make sense and how they might be implemented in the 'Hobbit
way'.

> > Sorry for the long winded email, I am just trying to move my ext scripts
> > to the Hobbit methodology. It just makes more sense.. and offers many
> > more options & power. And this was just my take on it. Thoughts ?
>
> It's a good idea, and if it can be done right then it would fit in
> really well with the Hobbit philosophy of providing good basic
> monitoring, but also enabling you to add whatever extras you need.
> So I like it, but it does require some thought to get right.
>

I don't think I explained why I wanted to access the clientlog data, so I will
try to explain. I had hoped to send more information to Hobbit than just
alerting stats, more like configuration & runtime information. Since I would
then be monitoring most of the application stack (web, app, db), I could wrap
a script around all that data to query it and provide up-to-the-minute
information on applications. For example, which IPs are presently being used
and for what, or which HTTP URLs are which team's responsibility to support
(determined by which server the apache instance is running on, etc.). This
would make support of my environment almost a breeze, especially since I had
planned to write the ext scripts in such a way that when a tech added web6 to
serverA, the data would then be collected automatically and the device entry
for web6 would be added to Hobbit automatically (removals would be manual).
So this extra data doesn't need to be seen in alerting reports, but it would
be rather nice to query for it and know that the hobbit-clients/scripts were
gathering and feeding Hobbit with the latest info.

Again, sorry for the very verbose email.
 ~Steve