[Xymon] Xymon Dependencies configuration.

Ralph M ralphmitchell at gmail.com
Thu Jun 4 23:50:50 CEST 2020


On Thu, Jun 4, 2020 at 3:36 PM <me at tdiehl.org> wrote:

> Hi,
>
> On Thu, 4 Jun 2020, Adam Thorn wrote:
>
> > On 03/06/2020 22:49, me at tdiehl.org wrote:
> >>  Hi,
> >>
> >>  I am trying to configure xymon dependencies so that if the core router is
> >>  down my xymon server only pages me for the core router.
> >>
> >>  In reading the man page it says to do something like the following:
> >>
> >>  1.2.3.4 cg1.example.com # noconn https://cg1.example.com depends=(http:router.example.com/conn)
> >>
> >>  The above works for a single service but the above host for example has
> >>  http and sslcert. How can I tell xymon that if router.example.com is down
> >>  all of the other services for a host should go clear?
> >>
> >>  I tried setting the service to a *, but that does not work. I also tried
> >>  listing services separated with either a comma or a pipe, but no joy.
> >
> > "man hosts.cfg" suggests that the syntax you want is
> >
> > depends=(testA:host1/test1,host2/test2),(testB:host3/test3)
> >
> > so for your example,
> >
> > depends=(http:router.example.com/conn),(sslcert:router.example.com/conn)
>
> That does not work for the sslcert test but does work for things like ssh.
> Which now makes sense given the info below.
>
> >
> > As the man page says, "depends" only applies to tests performed by xymonnet.
> > Wildcards do not appear to be supported but protocols.cfg will show you most
> > of the tests that xymonnet might perform.
>
> Ok, that explains why neither the conn nor the sslcert test will go clear.
> Neither test is listed in protocols.cfg. Given that both of these tests are
> network type tests it seems odd that they cannot be made to go clear on
> failure of another network test. I guess I do not really understand how
> Xymon works.
>
> I was really hoping to be able to get a single alert when the router went
> down. It does not happen real often but it is a pita to get several hundred
> text messages for what is really a single failure.
>
> Does anyone have a solution for these kinds of failures?
>

You could write an external script to connect to the router and "do stuff"
if the connection fails.

For example, if you're checking the router every 5 minutes, when it fails
you could send a "disable" message to Xymon for the list of things behind
the router, with a 10 minute lifetime.  That'll turn off alerts for all
those devices.  As long as the router continues to fail, keep on sending
disables with 10 min lifetime, essentially extending the original
lifetime.  Once the router recovers, the disable message will expire up to
10 mins later and those devices will alert or not depending on their next
status.

I don't have such a script, but it feels like it ought to be fairly trivial
to implement.
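
A minimal sketch of what such a script might look like, in Python. It is
untested against a real Xymon server and rests on several assumptions: the
server address, the DEPENDENTS list and all function names are made up for
illustration; the router check is a plain one-packet ping (the "-c" flag is
the Linux/BSD form); the message is sent directly to xymond on TCP port
1984; and the message follows the "disable HOSTNAME.TESTNAME DURATION text"
form from xymon(1), with dots in the hostname written as commas, "*" meaning
all tests, and the duration in minutes. Check those details against the man
page for your version before relying on it.

```python
import socket
import subprocess

# All of these values are placeholders for illustration.
XYMON_SERVER = "127.0.0.1"
XYMON_PORT = 1984
ROUTER = "router.example.com"
DEPENDENTS = ["cg1.example.com"]   # hosts behind the router
DISABLE_MINUTES = 10               # twice the assumed 5-minute check interval


def build_disable(host, minutes, reason):
    """Build a 'disable' message covering every test on one host.

    Assumes dots in the hostname are written as commas and that '*'
    as the test name selects all tests, as described in xymon(1).
    """
    return f"disable {host.replace('.', ',')}.* {minutes} {reason}"


def router_is_up(router):
    """Return True if a single ping to the router succeeds."""
    result = subprocess.run(
        ["ping", "-c", "1", router],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def send_to_xymon(message, server=XYMON_SERVER, port=XYMON_PORT):
    """Send one message to the Xymon daemon over a plain TCP connection."""
    with socket.create_connection((server, port), timeout=10) as sock:
        sock.sendall(message.encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)


def main():
    """If the router is down, (re)disable every host behind it."""
    if router_is_up(ROUTER):
        return  # nothing to do; earlier disables expire on their own
    for host in DEPENDENTS:
        send_to_xymon(build_disable(host, DISABLE_MINUTES, f"{ROUTER} is down"))
```

Calling main() from cron every 5 minutes would keep renewing the 10 minute
disable for as long as the router stays down. The router itself is left out
of DEPENDENTS on purpose, so it is the one host that still alerts.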

Ralph Mitchell