[Xymon] Antwort: [SPAM] Re: Xymon disruption every night!

Lukas Kohl lukas.kohl at ergodirekt.de
Tue Feb 16 11:11:27 CET 2016


Hi,
I know this is just a workaround, but maybe it helps you.
I run my Xymon machine with a local caching BIND daemon, which also
speeds up the DNS tests a lot.

1. yum install bind
2. Customize /etc/named.conf:
        options {
                // Listen on loopback only: this cache serves the Xymon host itself.
                listen-on port 53 { 127.0.0.1; };
                #listen-on-v6 port 53 { ::1; };
                directory       "/var/named";
                dump-file       "/var/named/data/cache_dump.db";
                statistics-file "/var/named/data/named_stats.txt";
                memstatistics-file "/var/named/data/named_mem_stats.txt";
                allow-query     { localhost; };
                #recursion yes;
                // foo1 and foo2 are placeholders: put the IP addresses of
                // your upstream DNS servers here.
                forwarders { foo1; foo2; };
                forward only;
                notify no;

                dnssec-enable no;
                dnssec-validation no;
                #dnssec-lookaside auto;

                /* Path to ISC DLV key */
                bindkeys-file "/etc/named.iscdlv.key";

                managed-keys-directory "/var/named/dynamic";
        };

        zone "." IN {
                type hint;
                file "named.ca";
        };

        include "/etc/named.rfc1912.zones";
        include "/etc/named.root.key";
3. Make sure the permissions on named.conf are 640.
4. Add the local cache to /etc/resolv.conf: nameserver 127.0.0.1
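
One way to check that the local cache is actually answering is to resolve
the same name several times and compare the timings: once named has the
answer cached, the later lookups should be noticeably faster than the first.
The following is only a minimal Python sketch, assuming lookups go through
the system resolver (socket.gethostbyname). The function name time_lookups
and its parameters are illustrative and not part of the setup above.

```python
import socket
import time


def time_lookups(name, repeats=2, resolve=socket.gethostbyname):
    """Resolve `name` `repeats` times; return a list of (seconds, result).

    `result` is the resolved address, or None if the lookup failed.
    `resolve` is injectable so the function can be tried without a network.
    """
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        try:
            result = resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            result = None
        timings.append((time.perf_counter() - start, result))
    return timings
```

Calling time_lookups("server03") on the Xymon machine (with a host name taken
from your own hosts.cfg) returns one tuple per attempt, so a slow or failed
first lookup followed by fast ones is easy to spot.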


Regards,

     Lukas Kohl
     ERGO Direkt Versicherungen
     Systembetrieb 2
     Karl-Martell-Straße 60
     90344 Nürnberg
     Deutschland
     Tel.: +49-911-148-2857




From:    L-M-J <linuxmasterjedi at free.fr>
To:      Xymon at xymon.com
Date:    16.02.2016 10:46
Subject: [SPAM] Re: [Xymon] Xymon disruption every night!
Sent by: "Xymon" <xymon-bounces at xymon.com>




Hi,

I'm still running into trouble every night between ~0h30 and ~2h40 :-(
1) I checked the backup on my physical Xymon server: it starts around 9pm
and runs for 4:45 min.
2) We cross-monitored the DNS server from another monitoring tool: no DNS
outage detected.
3) I monitored the Xymon server's network link state with "mii-tool" every
second: no trouble detected.
4) I pinged my Xymon servers from 2 different places on the network all
night long: no trouble detected.
5) There are no firewalls between my Xymon server and the monitored hosts.
6) Out of more than 500 hosts, only ~30 are in trouble every night, and
mostly the same ones.
7) The hosts are VMs, physical servers and public internet websites.

Here is what I've found in the xymond.log today :
2016-02-16 02:02:57 Flapping detected for www.foo1.com:http - 5 changes in 1708 seconds
2016-02-16 02:02:57 Flapping detected for www.foo2.com:http - 5 changes in 1708 seconds
2016-02-16 02:02:57 Flapping detected for www.microsoft.com:http - 5 changes in 1708 seconds
2016-02-16 02:06:14 Flapping detected for server01:http - 5 changes in 1678 seconds
2016-02-16 02:06:14 Flapping detected for server02:http - 5 changes in 1678 seconds
2016-02-16 02:06:29 Flapping detected for server03:conn - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server04:ldap - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server06:ssh - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server05:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server07:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server08:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server09:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for foo.bar1.com:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for foo.bar2.com:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for foo.bar3.fr:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server10:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server11-t:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server12:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server13:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server14:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server15:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server16:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server17:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server18:http - 5 changes in 1745 seconds
2016-02-16 02:07:21 Flapping detected for server19:http - 5 changes in 1745 seconds
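
To see at a glance which tests flap and whether the entries cluster around
the same moment, the log excerpt can be summarized with a few lines of
Python. This is only a sketch: it assumes each entry sits on a single line,
as in xymond.log itself (without the mail wrapping), and the names FLAP_RE
and summarize_flapping are made up for this example.

```python
import re
from collections import Counter

# Matches "Flapping detected" entries of the form shown in the excerpt.
FLAP_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"Flapping detected for (?P<host>[^:\s]+):(?P<test>\S+) - "
    r"(?P<changes>\d+) changes in (?P<seconds>\d+) seconds$"
)


def summarize_flapping(lines):
    """Count flapping entries per test name and per timestamp."""
    per_test = Counter()
    per_time = Counter()
    for line in lines:
        match = FLAP_RE.match(line.strip())
        if match is None:
            continue  # not a flapping entry, or still wrapped over two lines
        per_test[match["test"]] += 1
        per_time[match["time"]] += 1
    return per_test, per_time
```

For the 25 entries quoted here, 22 are http tests and 19 carry the timestamp
02:07:21, so most of the state changes were registered in the same second.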


Here is part of the configuration, with the errors displayed in the Xymon
web interface:
hosts.cfg: 0.0.0.0 server03 # conn NAME:"server03" DESCR:"VM FOO BAR"
Error: conn NOT ok : DNS lookup failed / Unable to resolve hostname server03
System unreachable for 2 poll periods (86 seconds)

Everything looks as if the DNS resolution failed.

hosts.cfg: 10.X.Y.188 server05 # conn tse NAME:"Server 05" DESCR:"My comment" http://server05/
Error: DNS error red http://server05/ - DNS error

- Why do I get a "DNS error" here? Yesterday I set the IP for this host to
work around the issue. The "conn" error has been gone since yesterday
evening, but the http error remains.
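
Judging by the two errors above, both kinds of entry can still depend on
name resolution: an address of 0.0.0.0 leaves the host to be looked up by
name, and a URL such as http://server05/ carries a host name of its own even
when the entry has an IP. A small Python sketch that flags such entries in
hosts.cfg follows. The function name dns_dependent_entries is hypothetical,
and the parsing is deliberately simplified (it ignores include directives
and any Xymon-specific URL syntax).

```python
import ipaddress
import re
from urllib.parse import urlsplit

# Plain http/https URLs up to the next whitespace or double quote.
URL_RE = re.compile(r"https?://[^\s\"]+")


def is_ip(text):
    """Return True if `text` is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(text)
        return True
    except ValueError:
        return False


def dns_dependent_entries(lines):
    """Yield (hostname, reason) for hosts.cfg lines that appear to need DNS."""
    for raw in lines:
        line = raw.strip()
        fields = line.split()
        # Host entries look like: <address> <hostname> # <tags...>
        if len(fields) < 2 or line.startswith("#"):
            continue
        address, hostname = fields[0], fields[1]
        if address == "0.0.0.0":
            yield hostname, "address is 0.0.0.0"
        for url in URL_RE.findall(line):
            url_host = urlsplit(url).hostname
            if url_host and not is_ip(url_host):
                yield hostname, f"URL {url} contains host name {url_host}"
```

Run over the two lines quoted above, it reports server03 for its 0.0.0.0
address and server05 for the host name inside its URL.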




On 29 January 2016 13:22:06 GMT+01:00, Becker Christian
<christian.becker at rhein-zeitung.net> wrote:
My intention was to figure out whether the network connection of the Xymon
server itself has a problem…
For example, if your Xymon server is hardware, then it has a wired network
interface that is connected to a network switch. That’s your link between
the Xymon server and all of your other VMs and physical servers.
From my side, if you only see problems on the Xymon server, I’d have a
look at this particular switch port or the cable infrastructure to the
Xymon server. Or could there be a firewall rule preventing the Xymon
server from accessing the DNS server?

By the way, do you have only one DNS server in /etc/resolv.conf? Did you
check the logs on your DNS server? Can you run a continuous ping to the
Xymon server to see if it loses any packets over 24 hours?
 
Regards
Christian
 
 
Christian Becker
IT-Services
 
Christian.Becker at rhein-zeitung.net
_________________________________
Mittelrhein-Verlag GmbH
August-Horch-Straße 28
D-56070 Koblenz
Verleger und Geschäftsführer: Walterpeter Twer
Reg.-Gericht Koblenz HRB 121
Finanzamt Koblenz Str.Nr. 22 65 10 285 2
www.rhein-zeitung.de
 
From: Xymon [mailto:xymon-bounces at xymon.com] On behalf of L-M-J
Sent: Friday, 29 January 2016 13:07
To: Xymon at xymon.com
Subject: Re: [Xymon] Xymon disruption every night!
 
The problems appear on VMs and physical servers, and on LAN and DMZ
equipment. I don't see a link between those devices :-(

On 29 January 2016 09:23:14 GMT+01:00, Becker Christian <
christian.becker at rhein-zeitung.net> wrote:
Hi L-M-J,
 
can you rule out that this behavior is caused by a network device such as
a switch or the default gateway?
 
Regards
Christian
 
 
From: Xymon [mailto:xymon-bounces at xymon.com] On behalf of L-M-J
Sent: Friday, 29 January 2016 08:57
To: Xymon at xymon.com
Subject: [Xymon] Xymon disruption every night!
 
Hi,

I've been running Xymon for 6 years (4.3.17 atm) on Debian 7.8
(3.2.0-4-amd64).
For 1 month now, every night between 0h30 and 2h am (+/- 30 min),
around 30 hosts become unreachable:

Fri Jan 29 01:16:38 2016 conn NOT ok : DNS lookup failed
Unable to resolve hostname foo.bar.local
System unreachable for 3 poll periods (170 seconds)
green 0.0.0.0 is alive (0.02 ms) [<- 127.0.0.1]


- I have around 500 monitored hosts, and it looks like the same hosts are
lost every single night.
- Those monitored hosts are not necessarily on the same network, nor
running the same OS.
- We cross-monitored the same hosts, and the other monitoring tool
does not report the DNS outage.
- I ran a DNS lookup every second on the Hobbit server for several days
and it never reported a DNS outage.
- I don't have any crontab installed on the server that could disturb
Xymon.
- Nothing strange in the Xymon logs or the server logs, no memory
leaks or CPU overload.
- The rest of the day, the Xymon server behaves normally.
- What did I do on the server 1 month ago? I don't know; no system
upgrade or the like.
- I had DNSMASQ acting as a cache; I disabled it: same issue.
- /etc/resolv.conf is quite light: "search bar.local", then on the next
line "nameserver IP.OF.OUR.DNS.SERVER1", just like on the other servers.

The issue could be anywhere: inside or outside the server, Xymon or
not... I have to confess I'm running out of ideas for finding the issue.
If anyone here has some leads, I would be thankful!

Have a nice day!






