[Xymon] DNS failures causing "runtime longer than time limit"

Jeremy Laidman jlaidman at rebel-it.com.au
Wed Jun 3 05:49:21 CEST 2015


OK, I'm a bit puzzled by this, and definitely pushing the envelope of my
debugging and C coding skills.  The relevant code from xymonnet/dns.c is:

    168                 tv.tv_sec = dnstimeout; tv.tv_usec = 0;
    169                 tvp = ares_timeout(channel, &tv, &tv);

I ran this through gdb, with "--dns-timeout=3" specified, setting a
breakpoint at line 168.  I confirmed that dnstimeout is set to 3.  When I
step one line, I should see tv.tv_sec set to 3 also, but it's set to 0.

If I don't specify --dns-timeout at all, printing dnstimeout shows "30".
Again, after stepping to the next line, tv.tv_sec is still zero.

Breakpoint 1, dns_ares_queue_run (channel=0x58b1c0) at dns.c:168
168                     tv.tv_sec = dnstimeout; tv.tv_usec = 0;
(gdb) p dnstimeout
$14 = 30
(gdb) n
169                     tvp = ares_timeout(channel, &tv, &tv);
(gdb) p tv
$15 = {tv_sec = 0, tv_usec = 0}
(gdb)

So what gives here?
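
For anyone who wants to poke at this outside xymonnet, here is a minimal sketch of what I understand those two lines are meant to do. It assumes, from my reading of the c-ares documentation, that ares_timeout() hands back whichever is sooner: the caller's maximum or the time until the next pending query times out. The names capped_timeout() and wait_seconds() are made up for illustration and are not part of xymonnet or c-ares; capped_timeout() is only a stand-in for the real call.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Returns non-zero if a is strictly earlier than b. */
static int tv_less(const struct timeval *a, const struct timeval *b)
{
    if (a->tv_sec != b->tv_sec) return a->tv_sec < b->tv_sec;
    return a->tv_usec < b->tv_usec;
}

/* Hypothetical stand-in for ares_timeout(): "pending" is the time until the
 * next query would time out, or NULL if nothing is pending.  Returns maxtv
 * when it is the sooner of the two (or nothing is pending); otherwise copies
 * pending into tv and returns tv.  maxtv and tv may be the same buffer. */
static struct timeval *capped_timeout(const struct timeval *pending,
                                      struct timeval *maxtv,
                                      struct timeval *tv)
{
    assert(tv != NULL);
    if (pending == NULL || (maxtv != NULL && tv_less(maxtv, pending)))
        return maxtv;
    *tv = *pending;
    return tv;
}

/* Mirrors the shape of dns.c lines 168-169 and reports how many seconds the
 * caller would wait.  A negative pending_sec means "no query pending". */
static long wait_seconds(long dnstimeout, long pending_sec)
{
    struct timeval tv, pending, *tvp;

    tv.tv_sec = dnstimeout; tv.tv_usec = 0;             /* as on line 168 */
    pending.tv_sec = pending_sec; pending.tv_usec = 0;
    tvp = capped_timeout(pending_sec < 0 ? NULL : &pending, &tv, &tv);
    return (long)tvp->tv_sec;                           /* as on line 169 */
}
```

With this stand-in, wait_seconds(3, 26) gives 3 and wait_seconds(30, 5) gives 5. On that reading, tv.tv_sec should hold dnstimeout immediately after line 168, which is why the zero in gdb puzzles me.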

J


On 3 June 2015 at 13:08, Jeremy Laidman <jlaidman at rebel-it.com.au> wrote:

> Hi
>
> I'm running Xymon v4.3.10 on Linux, and I'm quite sure it's compiled with
> c-ares support.
>
> I have 12 new DNS servers that were added to Xymon about one month ago.
> All of my server entries in hosts.cfg have "testip".  The tasks.cfg runs
> xymonnet with "--dns-timeout=3".  The hosts entries look like so:
>
> 10.10.10.1 dnshost1.example.com    # testip dns=NS:example.com,SOA:example.com
>
> About a week ago, connectivity to all of these servers failed, and at the
> same time, the xymonnet run time jumped from less than 15 seconds to about
> 330 seconds, so about 315 seconds extra.  The xymonnet page says 295
> seconds is taken up by DNS tests.
>
> If the increase in time taken is about 315 and is entirely due to the 12
> servers failing, then each failed server is adding about 26 seconds to the
> total run time.
>
> I don't think this should be happening.  With two DNS checks per server
> and a 3-second timeout, the DNS checks should take about 6 seconds per
> server to time out, not 26.  If I run xymonnet with
> "--timing --no-update" and specify only one
> hostname, I can view the results and the timing.  This shows that the ping
> check gets reported after about 3 seconds, and then the DNS tests are
> executed and take 26 seconds total.
>
> My naive assumption was that when a server failed a ping (and didn't have
> "noclear" defined in hosts.cfg) then the network checks would be skipped.
> On re-reading the man page for hosts.cfg, it dawned on me that a failed
> ping simply suppresses failed test /results/, but doesn't stop the tests
> from being run.
>
> So the real problem is that the "--dns-timeout=3" is not being taken into
> consideration by xymonnet.  If I run xymonnet with "--debug" it tells me:
>
> 1900 2015-06-03 12:02:20 ares_search: tlookup='example.com', class=1, type=2
> 1900 2015-06-03 12:02:20 ares_search: tlookup='example.com', class=1, type=6
> 1900 2015-06-03 12:02:20 Processing 0 DNS lookups with ARES
> 1900 2015-06-03 12:02:46 Finished ARES queue after loop 423
>
> This is peculiar.  Why would it say "processing 0 DNS lookups" when there
> are two lookups to test?  Could this be because xymonnet hasn't actually
> been built with ARES support and I didn't know it?  Is there a good way to
> tell?  If I add "--no-ares" I get the same results, perhaps suggesting a
> lack of ARES support.  On the other hand, if I add "timeout:3" and
> "attempts:1" into resolv.conf, I also get the same results.  If I run "nm
> /path/to/xymonnet | grep gethostby" it returns "ares_gethostbyname".
>
> Just for fun, I compiled Xymon v4.3.21 and ran the xymonnet binary from
> there, with no change in behaviour.  I also tried removing the
> "--dns-timeout" option so that it defaults to 30 seconds, but still no
> change - 26 seconds for two DNS tests.
>
> So, I'm not really sure what the problem is, but xymonnet certainly isn't
> behaving as I would expect.
>
> Cheers
> Jeremy
>
>