Re: [hobbit] grouping methods
- To: hobbit (at) hswn.dk
- Subject: Re: [hobbit] grouping methods
- From: "Josh Luthman" <josh (at) imaginenetworksllc.com>
- Date: Mon, 16 Jun 2008 14:27:25 -0400
Yes - I have that set up with customers' routers and CPEs.
The real problem is when, for example, 3 servers in one data center in
New Mexico lose connectivity with us in Ohio. Then I get 3 SMS
messages on my phone, followed by 3 more when it comes back up.
It would be very convenient to have one message saying that this, that and
another thing went down in the last 60s.
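
To make the idea concrete, here is a rough sketch of the batching I have in
mind. This is not something Hobbit does in this form; the function name, the
(timestamp, site, host, status) event layout and the "nm" site label are all
made up for illustration. It only shows how alerts from one site inside a 60s
window could be folded into a single message.

```python
def batch_alerts(events, window=60):
    """Merge per-host alerts into one summary message per site and status.

    `events` is an iterable of (timestamp, site, host, status) tuples
    (a hypothetical layout, not a Hobbit data format). An event joins an
    existing batch if it has the same site and status and arrives within
    `window` seconds of the first event in that batch.
    """
    batches = []     # each entry: [start_ts, site, status, [hosts]]
    open_batch = {}  # (site, status) -> index of the latest batch
    for ts, site, host, status in sorted(events):
        key = (site, status)
        idx = open_batch.get(key)
        if idx is not None and ts - batches[idx][0] <= window:
            batches[idx][3].append(host)
        else:
            # Start a new batch for this site/status pair.
            open_batch[key] = len(batches)
            batches.append([ts, site, status, [host]])
    return [
        f"{site}: {len(hosts)} host(s) {status} within {window}s: {', '.join(hosts)}"
        for _, site, status, hosts in batches
    ]
```

With the New Mexico example, three "down" events a few seconds apart and three
"up" events later would come out as two messages instead of six. The catch is
that something has to hold the first alert for up to the window length before
sending, so the first SMS arrives later than it does today.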
On Mon, Jun 16, 2008 at 2:17 PM, Rich Smrcina <rsmrcina (at) wi.rr.com> wrote:
> If this is a situation of routed networks, Hobbit can know about that with
> directives in the bb-hosts file. If it knows a host behind a router is
> down, it will only notify for the router, not the hosts behind the router.
>
> Linder, Doug (SABIC Innovative Plastics, consultant) wrote:
>>
>> Sloan [mailto:joe (at) tmsusa.com] wrote:
>>
>>> We've not had a bb server go down in all the years we've been using it,
>>> but sometimes wan connectivity goes away due to circumstances beyond our
>>> control
>>
>> This is by far the biggest annoyance we have with all system monitoring
>> - when networks go down. It's a problem with every monitoring tool
>> there is and I can't think of any way to solve it: the monitoring system
>> has no way of knowing whether a system is down because it crashed or if
>> it's down because the network went down. All it knows is that it can't
>> talk to the system anymore and something is wrong, so it generates an
>> alert. When a whole network goes down, it can become hundreds of
>> simultaneous alerts. And that's annoying enough when it's just email
>> alerts. When you use Hobbit to generate cases in your trouble ticket
>> system, that can be hundreds of new, useless cases to manually close.
>>
>> We don't want to raise the amount of time a system has to be down before
>> Hobbit generates an alert, because we want to know as soon as possible.
>> But if we keep that number too low, then when the network has a brief
>> hiccup, we get hundreds of redundant cases. This is especially a
>> problem with overseas networks on the WAN.
>>
>> I think the only possible solution would be for Hobbit to have some kind
>> of flood-detection routine built in, where it could tell how rapidly it
>> was sending alerts about connection problems for machines all on the
>> same network, and was smart enough to think, "Whoa, I'm about to send 100
>> connection alarms about systems on the same network. Instead of
>> sending 100 of them, maybe I'll just send ONE alert saying 'You got a
>> big problem here.'"
>>
>> Doug Linder
>>
>> To unsubscribe from the hobbit list, send an e-mail to
>> hobbit-unsubscribe (at) hswn.dk
>>
>>
>
> --
> Rich Smrcina
> VM Assist, Inc.
> Phone: 414-491-6001
> Ans Service: 360-715-2467
> rich.smrcina at vmassist.com
> http://www.linkedin.com/in/richsmrcina
>
> Catch the WAVV! http://www.wavv.org
> WAVV 2009 - Orlando, FL - May 15-19, 2009
>
--
Josh Luthman
Office: 937-552-2340
Direct: 937-552-2343
1100 Wayne St
Suite 1337
Troy, OH 45373
Those who don't understand UNIX are condemned to reinvent it, poorly.
--- Henry Spencer