
Re: [hobbit] Multiple hobbit (bbproxy and bb server) queries



The conn and ssh tests are handled by bbnet on the xymon server.  The cpu,
disk, memory, etc. tests are run on each client system.  Personally I wouldn't
want to fool around with those.

The point of the little test script was that, with a much shorter lifetime,
those messages will show purple dots a lot quicker when the bbproxy dies.
You'll also only get a maximum of three to start with.  All the other
purple dots will come along 25 minutes later, if nobody fixes the bbproxy.
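
A slightly fuller sketch of that heartbeat idea, for illustration only. It
assumes the usual status line layout (status+LIFETIME host.column color text)
and that $BB and $BBDISP come from the Hobbit environment; the "heartbeat"
column name and the example host name are placeholders, not anything from a
real setup. When the environment variables are missing it just prints the
message, so it can be checked by hand before being scheduled.

```shell
#!/bin/bash
# Heartbeat sketch: one short-lived status per bbproxy.
# The function names and the column name are illustrative placeholders.

# Build a status message with a 2-minute lifetime.
# $1 = host name as listed in bb-hosts, $2 = column name
heartbeat_msg() {
    # Assumption: dots in the host name are sent as commas in the status line.
    hb_host=$(printf '%s' "$1" | tr '.' ',')
    printf 'status+2 %s.%s green %s\nIf this is purple, a bbproxy died.\n' \
        "$hb_host" "$2" "$(date)"
}

# Send the message if the Hobbit environment is present (e.g. under bbcmd);
# otherwise print it so it can be inspected.
send_heartbeat() {
    hb_msg=$(heartbeat_msg "$1" "$2")
    if [ -n "$BB" ] && [ -n "$BBDISP" ]; then
        "$BB" "$BBDISP" "$hb_msg"
    else
        printf '%s\n' "$hb_msg"
    fi
}
```

Calling send_heartbeat with the proxy's host name and a column name once a
minute gives the behaviour described above: the status expires two minutes
after the last update, so the column turns purple shortly after the bbproxy
stops forwarding.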

Ralph Mitchell


On Tue, Jul 7, 2009 at 3:31 PM, James <j.sansford (at) ntlworld.com> wrote:

>  Hi Ralph,
>
> I'm using 4.2.3 - this is for a production area so I'm wary of trying out
> betas currently; we need stability. I've configured bbproxy as a service on
> Solaris so it can bring itself back, but it will just crash again, so I think
> we might miss quite a few messages. That links into my second point: I
> was trying to find a way of configuring things such as conn, cpu etc. to have
> a faster purple time and to alert for it. I've found the lifetime option but
> this, as far as I can see, is for custom stuff - not sure how to set it up for
> tests such as conn, ssh, cpu etc.?
>
> I'll try out your script tomorrow and have a play - as it currently stands,
> if a bbserver dies for whatever reason (and therefore a bbproxy) it can look
> for up to half an hour like many things are "green" when they aren't in fact
> being updated.
>
> Cheers
> James
>
> ----- Original Message -----
> *From:* Ralph Mitchell <ralphmitchell (at) gmail.com>
> *To:* hobbit (at) hswn.dk
> *Sent:* Tuesday, July 07, 2009 6:31 PM
> *Subject:* Re: [hobbit] Multiple hobbit (bbproxy and bb server) queries
>
> What version of hobbit/xymon are you running?  I used to have a problem
> like that with 4.2.  No bbproxy involved there, just several hobbit servers.
> If one of them was down, the server/bin/bb command would hang trying to
> talk to it.  It should have either failed to connect or timed out,
> but it didn't.  Anyway...
> You could set up a simple heartbeat script, a bit like this:
>
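>      #!/bin/bash
>
>      # Run via hobbitlaunch or bbcmd so that $BB and $BBDISP are set.
>      # status+2 = lifetime of 2 minutes; a status line needs a color after the test name.
>      $BB $BBDISP "status+2 bbproxyname.panicnow green `date`
>         If this is purple, a bbproxy died."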
>
> Set that up to launch every minute.  The message has a lifetime of 2
> minutes, so it'll go purple about 3 minutes after the bbproxy hangs up or
> dies.  You might want to pick a different column name.  :)
>
> Ralph Mitchell
>
>
> On Tue, Jul 7, 2009 at 8:40 AM, <j.sansford (at) ntlworld.com> wrote:
>
>> Hi guys,
>>
>> You may remember my questions from last week. Thanks again for these. I
>> have now implemented it, but I have a few questions (and possibly bugs?).
>> I will start by describing the setup. To keep things simple I'll call
>> each hobbit server IP "A", "B" or "C" depending on the data centre.
>>
>> Data centre 1:
>> bbproxy and bbserver (running on same box, A). bbproxy configured to send
>> to B,C,A. bbserver configured to talk to A,B,C in hobbitserver.cfg.
>>
>> Data centre 2:
>> bbproxy and bbserver (running on same box, B). bbproxy configured to send
>> to C,A,B. bbserver configured to talk to B,C,A in hobbitserver.cfg.
>>
>> Data centre 3:
>> bbproxy and bbserver (running on same box, C). bbproxy configured to send
>> to A,B,C. bbserver configured to talk to C,A,B in hobbitserver.cfg.
>>
>> ---------------------------------------------------
>> Firstly, everything looks good. However, if I stop bbserver at A
>> (but keep bbproxy running at A) then shortly afterwards bbproxy at A will
>> start crashing. I've tried changing the order of --bbdisplays and it seems
>> like the bbproxy will crash if the last bbdisplay IP has been shut down or is
>> unavailable. Is this known, or is there a workaround?
>>
>> To explain this better - let's say bbproxy at site B is configured as
>> --bbdisplays=B,C,A. If I kill the xymon server at site A then this proxy
>> will crash shortly afterwards. Note I'm on x86 Solaris.
>>
>>
>> My other question is this - currently if a proxy crashes and the other 2
>> xymon servers do not receive updates, most tests continue to stay green. I'm
>> sure I've seen a configuration option but I can't seem to find it - can I
>> configure these tests to go purple if they don't receive an update within
>> the next 5-10 minutes? They've only just gone purple after 30 minutes, but
>> we really need to know within 5 minutes if we haven't received a valid
>> update.
>>
>> Many thanks,
>> James.
>>
>