[hobbit] Multiple hobbit (bbproxy and bb server) queries
j.sansford at ntlworld.com
Tue Jul 7 22:31:37 CEST 2009
I'm using 4.2.3. This is a production environment, so I'm wary of trying out betas at the moment; we need stability. I've configured bbproxy as a service on Solaris so it can bring itself back, but it will just crash again, so I think we might miss quite a few messages. That links into my second point: I was trying to find a way of configuring tests such as conn, cpu etc. to go purple sooner and to alert on it. I've found the lifetime option, but as far as I can see it is for custom tests. I'm not sure how to set it up for tests such as conn, ssh, cpu etc.
I'll try out your script tomorrow and have a play. As it currently stands, if a bbserver dies for whatever reason (and therefore a bbproxy), many things can look "green" for up to half an hour when they aren't in fact being updated.
----- Original Message -----
From: Ralph Mitchell
To: hobbit at hswn.dk
Sent: Tuesday, July 07, 2009 6:31 PM
Subject: Re: [hobbit] Multiple hobbit (bbproxy and bb server) queries
What version of hobbit/xymon are you running? I used to have a problem like that with 4.2. No bbproxy involved there, just several hobbit servers. If one of them was down, the server/bin/bb command would hang trying to talk to it. It should have either failed to connect or timed out, but it didn't. Anyway...
You could set up a simple heartbeat script, a bit like this:
$BB $BBDISP "status+2 bbproxyname.panicnow green `date`
If this is purple, a bbproxy died."
Set that up to launch every minute. The message has a lifetime of 2 minutes, so it'll go purple about 3 minutes after the bbproxy hangs up or dies. You might want to pick a different column name. :)
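A slightly fuller sketch of the same heartbeat idea, in case it helps. The host name, the column name "heartbeat" and the helper function build_heartbeat_msg are placeholders I made up for illustration. It assumes BB and BBDISP are already set in the environment (as they are when a script is started by the Hobbit launcher); if they are not, it only prints the message so you can check the format first.

```shell
#!/bin/sh
# Heartbeat sketch: send a status message with a short lifetime so the
# column goes purple soon after the messages stop arriving.
# HOST, COLUMN and LIFETIME are placeholder values, not from a real setup.
HOST="bbproxyname"
COLUMN="heartbeat"
LIFETIME=2   # minutes

build_heartbeat_msg() {
    # $1 = host, $2 = column, $3 = lifetime in minutes
    # First line is "status+LIFETIME HOST.COLUMN COLOR text"
    printf 'status+%s %s.%s green %s\nIf this is purple, a bbproxy died.\n' \
        "$3" "$1" "$2" "$(date)"
}

MSG=$(build_heartbeat_msg "$HOST" "$COLUMN" "$LIFETIME")

# Send only when the Hobbit environment is present; otherwise just print.
if [ -n "$BB" ] && [ -n "$BBDISP" ]; then
    "$BB" "$BBDISP" "$MSG"
else
    printf '%s\n' "$MSG"
fi
```

Run it once by hand without BB set to see the message, then schedule it every minute from somewhere that has the Hobbit environment loaded.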
On Tue, Jul 7, 2009 at 8:40 AM, <j.sansford at ntlworld.com> wrote:
You may remember my questions from last week. Thanks again for the answers. I have now implemented the setup, but I have a few questions (and possibly bugs?). I will start by describing the setup. To keep things simple, I'll refer to each hobbit server IP as "A", "B" or "C" depending on the data centre.
Data centre 1:
bbproxy and bbserver (running on same box, A). bbproxy configured to send to B,C,A. bbserver configured to talk to A,B,C in hobbitserver.cfg.
Data centre 2:
bbproxy and bbserver (running on same box, B). bbproxy configured to send to C,A,B. bbserver configured to talk to B,C,A in hobbitserver.cfg.
Data centre 3:
bbproxy and bbserver (running on same box, C). bbproxy configured to send to A,B,C. bbserver configured to talk to C,A,B in hobbitserver.cfg.
Firstly, everything looks good. However, if I stop bbserver at A (but keep bbproxy running at A), then shortly afterwards bbproxy at A will start crashing. I've tried changing the order of --bbdisplays, and it seems that the bbproxy will crash if the last bbdisplay IP has been shut down or is not available. Is this known, or is there a workaround?
To explain this better: let's say bbproxy at site B is configured as --bbdisplays=B,C,A. If I kill the xymon server at site A, then this proxy will crash shortly afterwards. Note I'm on x86 Solaris.
My other question is this: currently, if a proxy crashes and the other 2 xymon servers do not receive updates, most tests stay green. I'm sure I've seen a configuration option for this but I can't seem to find it. Can I configure these tests to go purple if they don't receive an update within 5-10 minutes? At the moment they only go purple after 30 minutes, but we really need to know within 5 minutes if we haven't received a valid update.