[Xymon] XYMON Proxy Issue
Andy Smith
abs at shadymint.com
Mon May 5 11:52:06 CEST 2014
Gautier Begin wrote:
> Andy,
>
> I'm using Solaris 10.5 in a cluster zone configuration, for both the
> main and the proxy server. I also have a small proxy on Ubuntu Linux.
> XYMON version 4.3.12
>
> Now, my proxy under Solaris is working fine with ~900 targets. Here are
> the different steps I have done:
>
> *0- Use a tool to observe the behaviour of the network* on the system. I
> used netstat on the zone and lsof -i :1984 on the global zone (physical
> node of the cluster)
>
> Here is my Perl script, to be run in the zone (netstat):
>
> $total = 0 ;
> $big_total = 0 ;
> @netstat = ` netstat -naP tcp ` ;
> my %Con_Status ;         # per-state count of connections involving port 1984
> my %Con_Status_Total ;   # per-state count of all TCP connections
> foreach $ln (@netstat)
> {
>     chomp($ln) ;
>     @elts = split(/ +/,$ln) ;
>     # The last field of a connection line is its state (e.g. TIME_WAIT)
>     if (( $#elts > 5 ) && ( $ln =~ /[0-9]+.*[A-Z]+/))
>     {
>         $big_total++ ;
>         $Con_Status_Total{$elts[$#elts]}++ ;
>     }
>
>     if ( $ln =~ /\.1984 +/ )
>     {
>         $Con_Status{$elts[$#elts]}++ ;
>     }
> }
>
> print " State\t\tPort 1984\tTotal\n=======================================\n" ;
> foreach $Conn_State (sort keys %Con_Status_Total )
> {
>     unless ( exists($Con_Status{$Conn_State}) ) { $Con_Status{$Conn_State} = 0 ; }
>     if ( length($Conn_State) < 7 ) { $col = "\t\t" ; } else { $col = "\t" ; }
>     print " $Conn_State$col$Con_Status{$Conn_State}\t\t$Con_Status_Total{$Conn_State}\n" ;
>     $total = $total + $Con_Status{$Conn_State} ;
> }
> print "=======================================\n TOTAL\t\t$total\t\t$big_total\n" ;
>
>
>
> *1- Tune and configure how Solaris manages the network* using the ndd
> command:
>
> ndd -set /dev/tcp tcp_time_wait_interval 2000
> ndd -set /dev/tcp tcp_fin_wait_2_flush_interval 67500
> ndd -set /dev/tcp tcp_ip_abort_interval 300000
> ndd -set /dev/tcp tcp_keepalive_interval 7200000
> ndd -set /dev/tcp tcp_rexmit_interval_max 4000
> ndd -set /dev/tcp tcp_rexmit_interval_min 3000
> ndd -set /dev/tcp tcp_rexmit_interval_initial 3000
> ndd -set /dev/tcp tcp_smallest_anon_port 1024
>
> ndd -set /dev/tcp tcp_conn_req_max_q 2048
> ndd -set /dev/tcp tcp_conn_req_max_q0 4096
> ndd -set /dev/tcp tcp_slow_start_initial 4
>
> ndd -set /dev/tcp tcp_xmit_hiwat 262144
> ndd -set /dev/tcp tcp_recv_hiwat 262144
> ndd -set /dev/tcp tcp_max_buf 1048576
>
>
>
> *2- Modify the program xymonproxy.c*
>
> As I previously said, sockets are not well handled in this program
> (closure is not managed). Because I know very little about C
> programming, I just "arranged" the program, but it remains a dirty
> solution.
> => so_linger, setsockopt part
>
> I also modified line 973 and the following lines, because the verbose
> logging (the "select failed" message) was slowing down the proxy. The
> best fix would be to solve the underlying issue, but I didn't.
>
> # diff xymonproxy.c xymonproxy.c.ORIG
> 230d229
> < struct linger so_linger;
> 715,717d713
> < so_linger.l_onoff = 0;
> < so_linger.l_linger = 10;
> < setsockopt(cwalk->ssocket, SOL_SOCKET, SO_LINGER, &so_linger, sizeof(so_linger));
> 977,981c973,976
> < /* if (n < 0) { */
> < /* errprintf("select() %d/%d failed: %s\n", n, maxfd, strerror(errno)); */
> < /* } */
> < /* else if (n == 0) { */
> < if (n == 0) {
> ---
> > if (n < 0) {
> > errprintf("select() failed: %s\n", strerror(errno));
> > }
> > else if (n == 0) {
> 1001c996
> < else if ( n > 0 ) {
> ---
> > else {
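
For reference, the SO_LINGER part of that patch can be read in isolation.
The sketch below is only an illustration and not code from xymonproxy.c:
set_linger() and get_linger() are made-up helper names. As far as I
understand POSIX, with l_onoff set to 0 lingering is disabled and the
l_linger value is not used, so the patched values request the default
close() behaviour; a non-zero l_onoff makes close() wait up to l_linger
seconds for unsent data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not in xymonproxy.c): set SO_LINGER on a socket.
 * onoff == 0 disables lingering, so close() returns as quickly as it can.
 * onoff != 0 lets close() block for up to `seconds` while data drains. */
int set_linger(int fd, int onoff, int seconds)
{
    struct linger so_linger;

    memset(&so_linger, 0, sizeof(so_linger));
    so_linger.l_onoff = onoff;
    so_linger.l_linger = seconds;
    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &so_linger, sizeof(so_linger));
}

/* Hypothetical helper: read SO_LINGER back so the stored setting can be
 * inspected. Returns 0 on success, -1 on failure (errno is set). */
int get_linger(int fd, struct linger *out)
{
    socklen_t len = sizeof(*out);

    assert(out != NULL);
    return getsockopt(fd, SOL_SOCKET, SO_LINGER, out, &len);
}
```

Calling set_linger(fd, 0, 10) on a TCP socket reproduces the values in the
diff; reading the option back with get_linger() shows what the platform
actually stored.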
>
>
>
> *3- XYMON proxy conf*
>
> Because of the large number of targets:
>
> In the proxy's xymonserver.cfg, I set MAXMSGSPERCOMBO="500".
>
> In the main server's xymonserver.cfg, I set:
>
> MAXMSGSPERCOMBO="500"
>
> MAXLINE="5242880"
> MAXMSG_CLIENT="5242880"
> MAXMSG_DATA="5242880"
> MAXMSG_STACHG="5242880"
> MAXMSG_STATUS="5242880"
> MAXMSG_NOTES="5242880"
> MAXMSG_PAGE="5242880"
> MAXMSG_ENADIS="5242880"
> MAXMSG_CLICHG="5242880"
>
>
> This part is not really tuned (the figures are probably too large), but
> it works.
>
>
> Regards,
>
> Gautier BEGIN
>
> System Tools Team Lead
> CACEIS and APERAM accounts
> CSC Computer Sciences Luxembourg S.A.
> 12D Impasse Drosbach
> L-1882 Luxembourg
>
> Global Outsourcing Service | p:+352 24 834 276 | m:+352 621 229 172 |
> gbegin at csc.com | www.csc.com
>
>
>
>
>
> From: Andy Smith <abs at shadymint.com>
> To: xymon at xymon.com
> Date: 05/04/2014 02:50 PM
> Subject: Re: [Xymon] XYMON Proxy Issue
> Sent by: "Xymon" <xymon-bounces at xymon.com>
> ------------------------------------------------------------------------
>
>
>
> Hi,
>
> In February, Gautier reported this issue with xymonproxy on Solaris:
>
> http://lists.xymon.com/pipermail/xymon/2014-February/039160.html
>
> This week I came to upgrade a 4.2.3 installation on Solaris 9 and hit
> exactly the same issue as Gautier, but this time with the latest
> 4.3.17 code:
>
> 2014-05-04 13:05:36 xymonproxy version 4.3.17 starting
> 2014-05-04 13:20:41 Listening on 0.0.0.0:1984
> 2014-05-04 13:20:41 Sending to Xymon server(s) xx.xx.xx.xx:1984
> 2014-05-04 13:20:41 select() failed: Invalid argument
> 2014-05-04 13:20:41 select() failed: Invalid argument
> 2014-05-04 13:20:41 select() failed: Invalid argument
> 2014-05-04 13:20:41 select() failed: Invalid argument
> 2014-05-04 13:20:41 select() failed: Invalid argument
> 2014-05-04 13:20:41 select() failed: Invalid argument
> 2014-05-04 13:20:41 Too many select failures, aborting
> 2014-05-04 13:20:46 xymonproxy version 4.3.17 starting
>
> I do not suffer from the connections stuck in TIME_WAIT, only from the
> proxy constantly restarting every 15 minutes. Here is the truss output
> as it gasps and falls over:
>
> poll(0xFFBFF208, 1, 1000) = 0
> time() = 1399206937
> poll(0xFFBFF208, 1, 1000) = 0
> time() = 1399206938
> poll(0xFFBFF208, 1, 1000) = 0
> time() = 1399206939
> poll(0xFFBFF208, 1, 1000) = 0
> time() = 1399206940
> poll(0xFFBFF208, 1, 1000) = 0
> time() = 1399206941
> poll(0xFFBFF208, 1, 1000) = 0
> time() = 1399206942
> poll(0xFFBFF208, 1, 1000) = 1
> accept(3, 0x0003AC60, 0xFFBFF310, 1) = 4
> fcntl(4, F_SETFL, 0x00000080) = 0
> time() = 1399206942
> poll(0xFFBFF200, 2, 1000) = 1
> read(4, " s t a t u s + 4 5 c s".., 8185) = 140
> time() = 1399206942
> poll(0xFFBFF200, 2, 1000) = 1
> read(4, 0x00038CE2, 8045) = 0
> time() = 1399206942
> shutdown(4, 2, 1) = 0
> close(4) = 0
> poll(0xFFBFF208, 1, 1000) = 1
> accept(3, 0x0003ACD0, 0xFFBFF310, 1) = 4
> fcntl(4, F_SETFL, 0x00000080) = 0
> time() = 1399206942
> time() = 1399206942
> write(2, " 2 0 1 4 - 0 5 - 0 4 1".., 19) = 19
> write(2, " ", 1) = 1
> write(2, " s e l e c t ( ) f a i".., 34) = 34
> time() = 1399206942
> time() = 1399206942
> write(2, " 2 0 1 4 - 0 5 - 0 4 1".., 19) = 19
> write(2, " ", 1) = 1
> write(2, " s e l e c t ( ) f a i".., 34) = 34
> time() = 1399206942
> time() = 1399206942
> write(2, " 2 0 1 4 - 0 5 - 0 4 1".., 19) = 19
> write(2, " ", 1) = 1
> write(2, " s e l e c t ( ) f a i".., 34) = 34
> time() = 1399206942
> time() = 1399206942
> write(2, " 2 0 1 4 - 0 5 - 0 4 1".., 19) = 19
> write(2, " ", 1) = 1
> write(2, " s e l e c t ( ) f a i".., 34) = 34
> time() = 1399206942
> time() = 1399206942
> write(2, " 2 0 1 4 - 0 5 - 0 4 1".., 19) = 19
> write(2, " ", 1) = 1
> write(2, " s e l e c t ( ) f a i".., 34) = 34
> time() = 1399206942
> time() = 1399206942
> write(2, " 2 0 1 4 - 0 5 - 0 4 1".., 19) = 19
> write(2, " ", 1) = 1
> write(2, " s e l e c t ( ) f a i".., 34) = 34
> time() = 1399206942
> write(2, " 2 0 1 4 - 0 5 - 0 4 1".., 19) = 19
> write(2, " ", 1) = 1
> write(2, " T o o m a n y s e l".., 35) = 35
> _exit(1)
>
> So, question to Gautier, are you using Solaris 9 and have you managed to
> resolve this?
>
> Another question, for the rest of the list: this is actually the only
> proxy I have on Solaris; all the others are on Red Hat. Is anyone else
> using xymonproxy on Solaris and, if so, which version? For the time
> being I am running the old bbproxy until I get this fixed; the rest of
> 4.3.17 seems to be working OK.
>
> Thanks for any feedback.
> --
> Andy
Gautier,
My issue is not a matter of performance or resources (I have only 3
servers in this DMZ), but thanks for the complete information. It is
also a concern that this still happens with recent versions of Solaris:
I would be prepared to accept that Solaris 9 might behave incorrectly,
but I would have hoped that Solaris 10 had fixed this.
Maybe I will go back to the differences between the code for bbproxy at
4.2.3 and xymonproxy at 4.3.17 for a clue as to what is going on.
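
One thing that could be checked while comparing the two versions is which
arguments reach select(). POSIX documents EINVAL from select() for an nfds
value below 0 or above FD_SETSIZE, and for an invalid timeout. In the truss
output quoted above no poll() appears between the second accept() and the
error messages, which may mean the arguments were rejected in the library
before any system call was made. The sketch below is an illustration under
those assumptions and not xymonproxy code: check_select_args() and
try_select_timeout() are hypothetical helpers, and the 0..999999 range for
tv_usec is a conservative choice that some platforms do not enforce.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical pre-flight check (not in xymonproxy.c): return 0 if the
 * arguments look acceptable to select(), otherwise EINVAL. */
int check_select_args(int nfds, const struct timeval *tmo)
{
    /* nfds is the highest descriptor plus one and must fit in an fd_set. */
    if (nfds < 0 || nfds > FD_SETSIZE) return EINVAL;

    /* A NULL timeout means "block indefinitely", so there is nothing to check. */
    if (tmo != NULL) {
        if (tmo->tv_sec < 0) return EINVAL;
        /* Conservative range; stricter than some platforms require. */
        if (tmo->tv_usec < 0 || tmo->tv_usec >= 1000000) return EINVAL;
    }
    return 0;
}

/* Call select() with no descriptors and the given timeout. Returns 0 on
 * success, or the errno value that select() reported on failure. */
int try_select_timeout(long sec, long usec)
{
    struct timeval tmo;

    tmo.tv_sec = sec;
    tmo.tv_usec = usec;
    if (select(0, NULL, NULL, NULL, &tmo) < 0) {
        assert(errno != 0);   /* a failing select() always sets errno */
        return errno;
    }
    return 0;
}
```

Logging nfds and both timeout fields just before the real select() call,
and passing them through a check like this, would show whether one of them
is out of range at the moment the proxy starts failing.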
--
Andy