<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <p>Hi,</p>
    <p>I remember looking into this a long time ago. The
      --dns-timeout setting does not quite work as expected, because
      C-ARES itself does not quite work as expected.</p>
    <p>C-ARES has timeout settings for queries, but it applies an
      exponential back-off between attempts, so the total wait cannot
      be made to match the exact timeout you specify in --dns-timeout.</p>
    <p>In fact, current 4.3.x versions have hard-coded settings for the
      C-ARES timeouts: it starts with a 2 second timeout and makes 4
      attempts, which adds up to a timeout of approximately 23 seconds
      for all DNS queries. This is in xymonnet/dns.c (look for "ARES
      timeout"). If you need really short timeouts, that is
      probably what you should change.
    </p>
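    <p>A rough sketch of why the back-off makes the total wait overshoot
      the first timeout: the function below sums the per-attempt
      timeouts, assuming each failed attempt simply doubles the next
      timeout. That doubling rule and the name worst_case_wait are
      assumptions made for illustration. This is not the code in
      xymonnet/dns.c, and the real C-ARES schedule may differ, so the
      result is not expected to reproduce the 23 seconds mentioned
      above.</p>
    <pre>
```c
/*
 * Worst-case total wait, in seconds, when every attempt times out.
 * ASSUMPTION: the timeout doubles after each failed attempt. This is an
 * illustrative model only, not the actual C-ARES retry schedule.
 */
unsigned int worst_case_wait(unsigned int first_timeout, unsigned int attempts)
{
    unsigned int total = 0;               /* seconds waited so far */
    unsigned int timeout = first_timeout; /* timeout of the current attempt */
    unsigned int i;

    for (i = 0; i != attempts; i++) {
        total += timeout; /* this attempt ran to its full timeout */
        timeout *= 2;     /* assumed back-off: double the next one */
    }
    return total;
}
```
</pre>
    <p>Under that assumption, worst_case_wait(2, 4) is 2 + 4 + 8 + 16 =
      30 seconds, while a single 3 second attempt, worst_case_wait(3, 1),
      is 3 seconds. Shortening the wait therefore means lowering the
      first timeout, the number of attempts, or both.</p>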
    <p><br>
    </p>
    <p>Regards,<br>
      Henrik<br>
    </p>
    <br>
    <div class="moz-cite-prefix">On 11-09-2017 05:52, Jeremy Laidman
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CACO=ejzNx8h95yL6PozO4vDW508t=QOzRFF87J5AkcoW6tdC3w@mail.gmail.com">
      <div dir="ltr">Hi
        <div><br>
        </div>
        <div>I'm reviving an old thread, because this is biting me
          again, so I wanted to know if anyone had any fresh ideas on
          this problem.</div>
        <div><br>
        </div>
        <div>Many of the servers I monitor are DNS servers, so the
          C-ARES library has a lot of queries to perform every 5
          minutes. In some cases, I want to ensure that a DNS service is
          down (and alert when not) so most of the time I can expect a
          timeout, leading to a long poll cycle. I'd really like to be
          able to drop the timeout to significantly less than the 23
          seconds it's taking now per server.</div>
        <div><br>
        </div>
        <div>Cheers</div>
        <div>Jeremy</div>
        <div><br>
        </div>
      </div>
      <div class="gmail_extra"><br>
        <div class="gmail_quote">On 3 June 2015 at 13:49, Jeremy Laidman
          <span dir="ltr">&lt;<a href="mailto:jlaidman@rebel-it.com.au"
              target="_blank" moz-do-not-send="true">jlaidman@rebel-it.com.au</a>&gt;</span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div dir="ltr">
              <div>OK, I'm a bit puzzled by this, and definitely pushing
                the envelope of my debugging and C coding skills.  The
                relevant code from xymonnet/dns.c is:<br>
                <br>
                168  tv.tv_sec = dnstimeout; tv.tv_usec = 0;<br>
                169  tvp = ares_timeout(channel, &amp;tv, &amp;tv);<br>
                <br>
              </div>
              I ran this through gdb, with "--dns-timeout=3" specified,
              setting a breakpoint at line 168.  I confirmed that
              dnstimeout is set to 3.  When I step one line, I should
              see tv.tv_sec set to 3 also, but it's set to 0.<br>
              <div><br>
              </div>
              <div>If I don't specify --dns-timeout at all, printing
                dnstimeout shows "30".  Again, after stepping to the
                next line, tv.tv_sec is still zero.<br>
                <br>
                Breakpoint 1, dns_ares_queue_run (channel=0x58b1c0) at
                dns.c:168<br>
                168                     tv.tv_sec = dnstimeout;
                tv.tv_usec = 0;<br>
                (gdb) p dnstimeout<br>
                $14 = 30<br>
                (gdb) n<br>
                169                     tvp = ares_timeout(channel,
                &amp;tv, &amp;tv);<br>
                (gdb) p tv<br>
                $15 = {tv_sec = 0, tv_usec = 0}<br>
                (gdb)<br>
                <br>
              </div>
              <div>So what gives here?<span class="HOEnZb"><font
                    color="#888888"><br>
                    <br>
                  </font></span></div>
              <span class="HOEnZb"><font color="#888888">
                  <div>J<br>
                    <br>
                  </div>
                </font></span></div>
            <div class="HOEnZb">
              <div class="h5">
                <div class="gmail_extra"><br>
                  <div class="gmail_quote">On 3 June 2015 at 13:08,
                    Jeremy Laidman <span dir="ltr">&lt;<a
                        href="mailto:jlaidman@rebel-it.com.au"
                        target="_blank" moz-do-not-send="true">jlaidman@rebel-it.com.au</a>&gt;</span>
                    wrote:<br>
                    <blockquote class="gmail_quote" style="margin:0 0 0
                      .8ex;border-left:1px #ccc solid;padding-left:1ex">
                      <div dir="ltr">
                        <div>
                          <div>
                            <div>
                              <div>
                                <div>
                                  <div>
                                    <div>
                                      <div>
                                        <div>Hi<br>
                                          <br>
                                        </div>
                                        <div>I'm running Xymon v4.3.10
                                          on Linux, and I'm quite sure
                                          it's compiled with c-ares
                                          support.<br>
                                          <br>
                                        </div>
                                        I have 12 new DNS servers that
                                        were added to Xymon about one
                                        month ago.  All of my server
                                        entries in hosts.cfg have
                                        "testip".  The tasks.cfg runs
                                        xymonnet with "--dns-timeout=3". 
                                        The hosts entries look like so:<br>
                                        <br>
                                        10.10.10.1 dnshost1.example.com
                                        # testip dns=NS:example.com,SOA:example.com<br>
                                        <br>
                                        About a week ago, connectivity
                                        to all of these servers failed,
                                        and at the same time, the
                                        xymonnet run time jumped from
                                        less than 15 seconds to about
                                        330 seconds, so about 315
                                        seconds extra.  The xymonnet
                                        page says 295 seconds is taken
                                        up by DNS tests.<br>
                                        <br>
                                      </div>
                                      If the increase in run time is
                                      about 315 seconds and is entirely
                                      due to the 12 servers failing, then
                                      each failed server is adding about
                                      26 seconds to the total run time.<br>
                                      <br>
                                    </div>
                                    I don't think this should be
                                    happening.  With two DNS checks per
                                    server and a 3 second timeout, the
                                    DNS checks should take about 6
                                    seconds per server to time out, not
                                    26.  If I run xymonnet
                                    with "--timing --no-update" and
                                    specify only one hostname, I can
                                    view the results and the timing. 
                                    This shows that the ping check gets
                                    reported after about 3 seconds, and
                                    then the DNS tests are executed and
                                    take 26 seconds total.<br>
                                    <br>
                                  </div>
                                  My naive assumption was that when a
                                  server failed a ping (and didn't have
                                  "noclear" defined in hosts.cfg) then
                                  the network checks would be skipped. 
                                  On re-reading the man page for
                                  hosts.cfg, it dawned on me that a
                                  failed ping simply suppresses failed
                                  test /results/, but doesn't stop the
                                  tests from being run.<br>
                                  <br>
                                </div>
                                So the real problem is that the
                                "--dns-timeout=3" is not being taken
                                into consideration by xymonnet.  If I
                                run xymonnet with "--debug" it tells me:<br>
                                <br>
                                1900 2015-06-03 12:02:20 ares_search:
                                tlookup='example.com', class=1, type=2<br>
                                1900 2015-06-03 12:02:20 ares_search:
                                tlookup='example.com', class=1, type=6<br>
                                1900 2015-06-03 12:02:20 Processing 0
                                DNS lookups with ARES<br>
                                1900 2015-06-03 12:02:46 Finished ARES
                                queue after loop 423<br>
                                <br>
                              </div>
                              This is peculiar.  Why would it say
                              "processing 0 DNS lookups" when there are
                              two lookups to test?  Could this be
                              because xymonnet hasn't actually been
                              built with ARES support and I didn't know
                              it?  Is there a good way to tell?  If I
                              add "--no-ares" I get the same results
                              perhaps suggesting a lack of ARES
                              support.  On the other hand, if I add
                              "timeout:3" and "attempts:1" into
                              resolv.conf, I also get the same results. 
                              If I run "nm /path/to/xymonnet | grep
                              gethostby" it returns
                              "ares_gethostbyname".<br>
                              <br>
                            </div>
                            <div>Just for fun, I compiled Xymon v4.3.21
                              and ran the xymonnet binary from there,
                              with no change in behaviour.  I also tried
                              removing the "--dns-timeout" option so
                              that it defaults to 30 seconds, but still
                              no change - 26 seconds for two DNS tests.<br>
                            </div>
                            <div><br>
                            </div>
                            So, I'm not really sure what the problem is,
                            but xymonnet certainly isn't behaving as I
                            would expect.<br>
                            <br>
                          </div>
                          Cheers<span
                            class="m_-4817356621273579713HOEnZb"><font
                              color="#888888"><br>
                            </font></span></div>
                        <span class="m_-4817356621273579713HOEnZb"><font
                            color="#888888">Jeremy<br>
                            <br>
                          </font></span></div>
                    </blockquote>
                  </div>
                  <br>
                </div>
              </div>
            </div>
          </blockquote>
        </div>
        <br>
      </div>
      <br>
      <br>
      <pre wrap="">_______________________________________________
Xymon mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Xymon@xymon.com">Xymon@xymon.com</a>
<a class="moz-txt-link-freetext" href="http://lists.xymon.com/mailman/listinfo/xymon">http://lists.xymon.com/mailman/listinfo/xymon</a>
</pre>
    </blockquote>
    <br>
  </body>
</html>