[hobbit] Dependency system in Hobbit
Marganne, Etienne
emarganne at be.tiauto.com
Mon Jan 22 13:18:33 CET 2007
Hello all,
After another deep look into Hobbit's mechanisms, it looks like the idea of
using combo tests to create a tree structure is a good one, and you can link
that structure to the alerts. This is done by writing alert rules based on
the combo tests, since they are treated as standard tests (with a history,
statistics ...).
Now the idea would be to stop the alert floods triggered when a critical
device on a path goes down. The tree structure helps gather all the
necessary information into one test. We can now trigger alerts on those
combo tests rather than on the ordinary bb-hosts tests, but the flood still
remains. It remains because you have to trigger alerts on each combo test to
know exactly where the failure occurred. And since all your combo tests are
involved, whenever a failure occurs at some point, the nodes after it become
unreachable, as in this case:
A - 1 - 2 - 3 - F
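One possible way to cut the flood on a single path like the one above is to
alert only on the first node that fails and to treat every failed node behind
it as a consequence. The sketch below is an illustration in Python only: it is
not Hobbit code, the function name split_alerts is made up, and it assumes the
path is known in order from the Hobbit server outwards.

```python
def split_alerts(path, failed):
    """Return (alerting, suppressed) for an ordered path of node names.

    Only the first failed node on the path alerts; failed nodes behind it
    are assumed to be a consequence of that failure and are suppressed.
    """
    for index, node in enumerate(path):
        if node in failed:
            behind = [n for n in path[index + 1:] if n in failed]
            return [node], behind
    return [], []


# Hypothetical example: node 2 goes down, so 3 and F stop answering too.
path = ["A", "1", "2", "3", "F"]
alerting, suppressed = split_alerts(path, failed={"2", "3", "F"})
print("alert on:", alerting, "suppressed:", suppressed)
```

This only covers a single linear path; with redundant paths the first failed
node is no longer enough to decide what is really unreachable.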
If anybody has an idea, it is welcome ... :-)
Etienne Marganne,
TI Automotive.
_____
From: Jones, Jason (Altrincham) [mailto:JasonAS_Jones at mentor.com]
Sent: vendredi 19 janvier 2007 17:10
To: hobbit at hswn.dk
Subject: RE: [hobbit] Dependency system in Hobbit
Ok, I have just read this e-mail, but I thought that if you put the route
tag on a host, it only went yellow (unreachable by proxy) if both the host
itself and a device in its route entry failed (i.e. if the gateway fails but
the host still responds to ping, then it stays green). Henrik would have to
confirm this, though, as I haven't had such a situation. Also, I'm curious
how the CPU, hard drive, etc. dependencies would work. If the CPU on host1
has a high load, what effect would that have on host2? Unless of course a
database was being hosted by host1...
Also, can I ask how large this network is? Henrik's is ~4000 or something;
ours is about 400-500.
Jason.
_____
From: Hubbard, Greg L [mailto:greg.hubbard at eds.com]
Sent: 19 January 2007 14:56
To: hobbit at hswn.dk
Subject: RE: [hobbit] Dependency system in Hobbit
Etienne -- you are going to have to find someone to write that and add it to
Hobbit. Path-based alarm suppression is one of the holy grails in the
network management industry, and the reason it has not yet been solved is
that it is a difficult problem. For small networks you can come up with
a solution, but if you are using VLANs and WANs and load balancers and all
that other stuff it gets to be rather difficult.
There are many commercial software vendors that claim to have this problem
solved -- but sometimes even their demos do not work. The little bit of
dependency specification that you can put into Hobbit does indeed work, but
not across the board.
GLH
_____
From: Marganne, Etienne [mailto:emarganne at be.tiauto.com]
Sent: Friday, January 19, 2007 5:58 AM
To: hobbit at hswn.dk
Cc: henrik at hswn.dk
Subject: [hobbit] Dependency system in Hobbit
Hello all,
I am Etienne Marganne, a newcomer at TI Automotive, a company which uses
Hobbit as its monitoring tool; I will be in charge of Hobbit. We would
like to enhance our monitoring tool in order to get more detailed
information from it.
One of the first things we would like to do is to create a dependency system
between monitored hosts across our network. For the rest of the discussion,
please keep in mind that we have a very large network and two Hobbit servers
on which we would like to keep the same "bb-hosts" file.
The idea of the dependency system is the following: once a host on a network
path fails, all the hosts beyond it will very likely be unreachable. This
will cause a lot of alerts to be triggered because of one failure. This is
not useful, because our team will be flooded with those alerts. Therefore it
would be useful to create dependencies so that the devices further along the
same path do not trigger alerts (that is, do not turn red).
There is a specific tag that can be used for this, the "route" tag. In our
case, with two Hobbit servers, we would refine that with the
"route_BBLOCATION" tag. Since there are Hobbit clients on all the network
nodes between the two endpoints, the simplest idea would be to list all the
nodes in the value of those tags. This could work, even if it would generate
a lot of work. Going a little further: if one node fails on a path and you
know that the nodes after it are unreachable, you probably do not want to
test them anymore (to prevent them from turning red).
Now think of a big network with a lot of redundancy: there will very
probably be several network paths between a server and a given final host.
If one node fails on one path, it does not mean that the final client is
unreachable. But with the "route" tag, that client would be signalled as
unreachable since a member of the "route" has failed. This is not
convenient at all.
What we would like to know is whether there is a way to get this dependency
system to work with a topology like this one:
Hobbit Server A ---- 1 ---- 2 ---- 3 ---- 4 ---- Final Client
       |                                              |
       +----------- 5 ------------ 6 -----------------+
There are two paths to reach the Final Client, one made up of 1, 2, 3, 4
and another one made up of 5, 6.
With the current "route" tag we would have a list of nodes such as
route_HobbitServerA:1,2,3,4,5,6. Then if 1 fails, the path 1-2-3-4 would
not work anymore, but the 5-6 one still would.
A good dependency system would look like this: Final Client depends on
4, 4 depends on 3, 3 depends on 2, 2 depends on 1, but Final Client also
depends on 6, which depends on 5. Moreover, this would also handle this kind
of topology:
Hobbit Server A ---- 1 ---- 2 ---- 3 ---- 4 ---- Final Client
       |                    |                         |
       +----------- 5 ------------ 6 -----------------+
where there is a link between 2 and 5, 2 and 6, 5 and 3, and 6 and 3.
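To make the multi-path idea concrete, here is a minimal sketch in Python of
the reasoning described above. It is not how Hobbit works: the names EDGES,
build_links and classify are invented for the illustration, and the edge list
is just the first diagram (two paths) written out by hand. A failed node that
is a direct neighbour of something still reachable from the server is treated
as a root cause; any other failed node is only "unreachable by proxy" and
could have its alert suppressed.

```python
from collections import deque

# Hypothetical edge list for the first diagram (two paths to the client).
EDGES = [
    ("ServerA", "1"), ("1", "2"), ("2", "3"), ("3", "4"), ("4", "FinalClient"),
    ("ServerA", "5"), ("5", "6"), ("6", "FinalClient"),
]


def build_links(edges):
    """Turn an edge list into an undirected adjacency map."""
    links = {}
    for a, b in edges:
        links.setdefault(a, set()).add(b)
        links.setdefault(b, set()).add(a)
    return links


def classify(links, source, failed):
    """Return (reachable, root_causes, suppressed) as sets of node names.

    `failed` holds the nodes that do not answer their test. The search
    walks outwards from `source` through nodes that still answer.
    """
    reachable = {source}
    root_causes = set()
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in links.get(node, ()):
            if neighbour in failed:
                root_causes.add(neighbour)   # first failure seen on this path
            elif neighbour not in reachable:
                reachable.add(neighbour)
                queue.append(neighbour)
    suppressed = set(failed) - root_causes
    return reachable, root_causes, suppressed


links = build_links(EDGES)
# Node 1 alone is down: the final client stays reachable through 5 and 6.
reachable, root_causes, suppressed = classify(links, "ServerA", {"1"})
print(sorted(root_causes), "FinalClient" in reachable)
```

The second topology would only need the extra links (2-5, 2-6, 5-3, 6-3)
added to the edge list; the search itself does not change.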
Maybe something could be set up with the "depends" tag; however, I do
not know how the information will propagate through the different
dependencies defined with that tag.
Going even further on the topic, the "route" tag relies only on ping tests,
which does not seem enough to me. I would like to add cpu, disk, ... tests
to the whole dependency system.
Thank you for your help and answers,
Regards,
Etienne Marganne
TI Automotive.
The information contained in this transmission may contain privileged and
confidential information. It is intended only for the use of the person(s)
named above. If you are not the intended recipient, you are hereby notified
that any review, dissemination, distribution or duplication of this
communication is strictly prohibited. If you are not the intended recipient,
please contact the sender by reply email and destroy all copies of the
original message.