[Xymon] XYMON Proxy Issue

Gautier Begin gbegin at csc.com
Mon May 5 10:12:22 CEST 2014


Andy,

I'm using Solaris 10.5 in a cluster zone configuration for both the main 
server and the proxy server. I also have a small proxy running on Ubuntu 
Linux. The Xymon version is 4.3.12.

Now, my proxy under Solaris is working fine with ~900 targets. Here are 
the steps I took:

0- Use a tool to observe the behaviour of the network on the system. I 
used netstat in the zone and lsof -i :1984 in the global zone (the physical 
node of the cluster).

 Here is my Perl script, to be run in the zone (it uses netstat):

#!/usr/bin/perl
# Count TCP connections per state, for port 1984 and for all ports.
use strict ;
use warnings ;

my $total = 0 ;         # connections involving port 1984
my $big_total = 0 ;     # all TCP connections
my @netstat = `netstat -naP tcp` ;
my %Con_Status ;        # per-state count, port 1984 only
my %Con_Status_Total ;  # per-state count, all ports
foreach my $ln (@netstat)
{
        chomp($ln) ;
        my @elts = split(/ +/, $ln) ;
        my $state = $elts[$#elts] ;     # the state is the last column

        # Connection lines have more than six fields and end in a state name
        if (( $#elts > 5 ) && ( $ln =~ /[0-9]+.*[A-Z]+/ ))
        {
                $big_total++ ;
                $Con_Status_Total{$state}++ ;
        }

        # Lines with an address on the Xymon port
        if ( $ln =~ /\.1984 +/ )
        {
                $Con_Status{$state}++ ;
        }
}

print " State\t\tPort 1984\tTotal\n=======================================\n" ;
foreach my $Conn_State (sort keys %Con_Status_Total)
{
        $Con_Status{$Conn_State} = 0 unless exists($Con_Status{$Conn_State}) ;
        my $col = ( length($Conn_State) < 7 ) ? "\t\t" : "\t" ;
        print " $Conn_State$col$Con_Status{$Conn_State}\t\t$Con_Status_Total{$Conn_State}\n" ;
        $total += $Con_Status{$Conn_State} ;
}
print "=======================================\n TOTAL\t\t$total\t\t$big_total\n" ;



1- Tune and configure how Solaris manages the network using the ndd 
command:

ndd -set /dev/tcp tcp_time_wait_interval        2000
ndd -set /dev/tcp tcp_fin_wait_2_flush_interval 67500
ndd -set /dev/tcp tcp_ip_abort_interval         300000
ndd -set /dev/tcp tcp_keepalive_interval        7200000
ndd -set /dev/tcp tcp_rexmit_interval_max       4000
ndd -set /dev/tcp tcp_rexmit_interval_min       3000
ndd -set /dev/tcp tcp_rexmit_interval_initial   3000
ndd -set /dev/tcp tcp_smallest_anon_port        1024

ndd -set /dev/tcp tcp_conn_req_max_q    2048
ndd -set /dev/tcp tcp_conn_req_max_q0   4096
ndd -set /dev/tcp tcp_slow_start_initial        4

ndd -set /dev/tcp tcp_xmit_hiwat        262144
ndd -set /dev/tcp tcp_recv_hiwat        262144
ndd -set /dev/tcp tcp_max_buf   1048576



2- Modify the program xymonproxy.c

As I said previously, sockets are not handled well in this program 
(closing them is not managed). Because I know very little about C 
programming, I just patched the program, but it remains a dirty solution.
=> see the so_linger / setsockopt part of the diff below

I also modified line 973 and the following lines because the verbose 
logging (the "select failed" message) was slowing down the proxy. The best 
approach would be to fix the underlying issue, but I did not.
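
A less drastic alternative to commenting the message out would be to 
throttle it, so that a burst of failures produces one log line instead of 
hundreds. The following is only a minimal sketch under that assumption: 
struct log_throttle and throttle_allow() are hypothetical names and do not 
exist in xymonproxy.c.

```c
#include <time.h>

/* Hypothetical throttle state; none of these names exist in xymonproxy.c. */
struct log_throttle {
    int has_logged;           /* 0 until the first message is let through */
    time_t last_logged;       /* time the last message was let through */
    time_t interval;          /* minimum seconds between two messages */
    unsigned long suppressed; /* running count of messages held back */
};

/* Returns 1 if a message may be logged at time `now`, 0 if it should be
 * dropped because the previous one was logged less than `interval` ago. */
static int throttle_allow(struct log_throttle *t, time_t now)
{
    if (t->has_logged && now - t->last_logged < t->interval) {
        t->suppressed++;
        return 0;
    }
    t->has_logged = 1;
    t->last_logged = now;
    return 1;
}
```

In the select() error branch this could be used as 
"if (throttle_allow(&t, time(NULL))) errprintf(...);", which keeps the 
error visible in the log without the logging itself becoming the bottleneck.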

# diff xymonproxy.c xymonproxy.c.ORIG
230d229
<         struct linger so_linger;
715,717d713
<                                       so_linger.l_onoff = 0;
<                               so_linger.l_linger = 10;
<                               setsockopt(cwalk->ssocket, SOL_SOCKET, SO_LINGER, &so_linger, sizeof(so_linger));

977,981c973,976
< /*            if (n < 0) {                      */
< /*                    errprintf("select() %d/%d failed: %s\n", n, maxfd, strerror(errno));    */
< /*            }                      */
< /*            else if (n == 0) {                      */
<               if (n == 0) {
---
>               if (n < 0) {
>                       errprintf("select() failed: %s\n", strerror(errno));
>               }
>               else if (n == 0) {
1001c996
<               else if ( n > 0 ) {
---
>               else {
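
A note on the SO_LINGER part of the diff above, with a small sketch. With 
l_onoff set to 0, lingering is disabled and the l_linger value is ignored, 
which is the normal default for a socket. For the 10-second value to have 
any effect, l_onoff would have to be non-zero. The helpers set_linger() and 
get_linger() below are hypothetical and are not taken from xymonproxy.c; 
they only show how the option can be set and read back to check what the 
kernel actually stored.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper, not taken from xymonproxy.c: set SO_LINGER on fd.
 * onoff == 0: lingering is disabled and `seconds` is ignored; close()
 *             returns at once and the kernel finishes sending in the
 *             background.
 * onoff != 0: close() on a blocking socket may wait up to `seconds` for
 *             unsent data to drain.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
static int set_linger(int fd, int onoff, int seconds)
{
    struct linger lg;

    lg.l_onoff = onoff;
    lg.l_linger = seconds;
    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}

/* Read the current SO_LINGER setting back into *out.
 * Returns 0 on success, -1 on error. */
static int get_linger(int fd, struct linger *out)
{
    socklen_t len = sizeof(*out);

    return getsockopt(fd, SOL_SOCKET, SO_LINGER, out, &len);
}
```

Calling get_linger() right after the setsockopt() in the patched code would 
show whether the change differs from the default at all.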



3- XYMON proxy conf

Because of the large number of targets:

In xymonserver.cfg on the proxy, I set MAXMSGSPERCOMBO="500".

In xymonserver.cfg on the main server, I set

MAXMSGSPERCOMBO="500"

MAXLINE="5242880"
MAXMSG_CLIENT="5242880"
MAXMSG_DATA="5242880"
MAXMSG_STACHG="5242880"
MAXMSG_STATUS="5242880"
MAXMSG_NOTES="5242880"
MAXMSG_PAGE="5242880"
MAXMSG_ENADIS="5242880"
MAXMSG_CLICHG="5242880"


This part is not really tuned (the figures are probably too large), but it 
works.


Regards,

Gautier BEGIN

System Tools Team Lead
CACEIS and APERAM accounts
CSC Computer Sciences Luxembourg S.A.
12D Impasse Drosbach
L-1882 Luxembourg

Global Outsourcing Service | p:+352 24 834 276 | m:+352 621 229 172 | 
gbegin at csc.com | www.csc.com





From:   Andy Smith <abs at shadymint.com>
To:     xymon at xymon.com
Date:   05/04/2014 02:50 PM
Subject:        Re: [Xymon] XYMON Proxy Issue
Sent by:        "Xymon" <xymon-bounces at xymon.com>



Hi,

In February, Gautier reported this issue with xymonproxy on Solaris :-

http://lists.xymon.com/pipermail/xymon/2014-February/039160.html

This week I upgraded an installation of 4.2.3 on Solaris 9 and encountered 
exactly the same issue as Gautier, this time on the latest 4.3.17 
code :-

2014-05-04 13:05:36 xymonproxy version 4.3.17 starting
2014-05-04 13:20:41 Listening on 0.0.0.0:1984
2014-05-04 13:20:41 Sending to Xymon server(s) xx.xx.xx.xx:1984
2014-05-04 13:20:41 select() failed: Invalid argument
2014-05-04 13:20:41 select() failed: Invalid argument
2014-05-04 13:20:41 select() failed: Invalid argument
2014-05-04 13:20:41 select() failed: Invalid argument
2014-05-04 13:20:41 select() failed: Invalid argument
2014-05-04 13:20:41 select() failed: Invalid argument
2014-05-04 13:20:41 Too many select failures, aborting
2014-05-04 13:20:46 xymonproxy version 4.3.17 starting

I do not see the connections stuck in TIME_WAIT, just the proxy restarting 
every 15 minutes.  Here is the truss output as it falls 
over :-

poll(0xFFBFF208, 1, 1000)                       = 0
time()                                          = 1399206937
poll(0xFFBFF208, 1, 1000)                       = 0
time()                                          = 1399206938
poll(0xFFBFF208, 1, 1000)                       = 0
time()                                          = 1399206939
poll(0xFFBFF208, 1, 1000)                       = 0
time()                                          = 1399206940
poll(0xFFBFF208, 1, 1000)                       = 0
time()                                          = 1399206941
poll(0xFFBFF208, 1, 1000)                       = 0
time()                                          = 1399206942
poll(0xFFBFF208, 1, 1000)                       = 1
accept(3, 0x0003AC60, 0xFFBFF310, 1)            = 4
fcntl(4, F_SETFL, 0x00000080)                   = 0
time()                                          = 1399206942
poll(0xFFBFF200, 2, 1000)                       = 1
read(4, " s t a t u s + 4 5   c s".., 8185)     = 140
time()                                          = 1399206942
poll(0xFFBFF200, 2, 1000)                       = 1
read(4, 0x00038CE2, 8045)                       = 0
time()                                          = 1399206942
shutdown(4, 2, 1)                               = 0
close(4)                                        = 0
poll(0xFFBFF208, 1, 1000)                       = 1
accept(3, 0x0003ACD0, 0xFFBFF310, 1)            = 4
fcntl(4, F_SETFL, 0x00000080)                   = 0
time()                                          = 1399206942
time()                                          = 1399206942
write(2, " 2 0 1 4 - 0 5 - 0 4   1".., 19)      = 19
write(2, "  ", 1)                               = 1
write(2, " s e l e c t ( )   f a i".., 34)      = 34
time()                                          = 1399206942
time()                                          = 1399206942
write(2, " 2 0 1 4 - 0 5 - 0 4   1".., 19)      = 19
write(2, "  ", 1)                               = 1
write(2, " s e l e c t ( )   f a i".., 34)      = 34
time()                                          = 1399206942
time()                                          = 1399206942
write(2, " 2 0 1 4 - 0 5 - 0 4   1".., 19)      = 19
write(2, "  ", 1)                               = 1
write(2, " s e l e c t ( )   f a i".., 34)      = 34
time()                                          = 1399206942
time()                                          = 1399206942
write(2, " 2 0 1 4 - 0 5 - 0 4   1".., 19)      = 19
write(2, "  ", 1)                               = 1
write(2, " s e l e c t ( )   f a i".., 34)      = 34
time()                                          = 1399206942
time()                                          = 1399206942
write(2, " 2 0 1 4 - 0 5 - 0 4   1".., 19)      = 19
write(2, "  ", 1)                               = 1
write(2, " s e l e c t ( )   f a i".., 34)      = 34
time()                                          = 1399206942
time()                                          = 1399206942
write(2, " 2 0 1 4 - 0 5 - 0 4   1".., 19)      = 19
write(2, "  ", 1)                               = 1
write(2, " s e l e c t ( )   f a i".., 34)      = 34
time()                                          = 1399206942
write(2, " 2 0 1 4 - 0 5 - 0 4   1".., 19)      = 19
write(2, "  ", 1)                               = 1
write(2, " T o o   m a n y   s e l".., 35)      = 35
_exit(1)
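
One possible reading of the truss output above (an inference, not 
something confirmed here): after the second accept() there is no poll() 
call before the "select() failed" writes, which suggests the C library 
rejected the select() arguments before making any system call. POSIX lists 
two argument-related causes of EINVAL for select(): nfds below 0 or above 
FD_SETSIZE, and an invalid timeout. The sketch below is a hypothetical 
diagnostic, select_args_problem(), that is not part of xymonproxy.c; called 
just before select(), it would show which argument is at fault. The upper 
bound on tv_usec is an assumption about what a strict implementation 
enforces.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical diagnostic, not part of xymonproxy.c. Returns NULL when the
 * arguments look acceptable to select(), otherwise a short description of
 * the first problem found. The nfds checks follow the EINVAL conditions
 * listed by POSIX; the tv_usec upper bound is an assumed strict limit. */
static const char *select_args_problem(int nfds, const struct timeval *tv)
{
    if (nfds < 0)
        return "nfds is negative";
    if (nfds > FD_SETSIZE)
        return "nfds exceeds FD_SETSIZE";
    if (tv != NULL) {
        if (tv->tv_sec < 0)
            return "timeout tv_sec is negative";
        if (tv->tv_usec < 0 || tv->tv_usec >= 1000000)
            return "timeout tv_usec is outside 0..999999";
    }
    return NULL;
}
```

Logging the returned string together with the nfds and timeout values would 
narrow the failure down to either the descriptor count or the timeout.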

So, question to Gautier, are you using Solaris 9 and have you managed to 
resolve this?

Another question to the rest of the list: this is actually the only proxy 
I have on Solaris, all the others are on Red Hat. Is anyone else using 
xymonproxy on Solaris and, if so, which version?  For the time being, I am 
running the old bbproxy until I get this fixed; the rest of 4.3.17 seems 
to be working OK.

Thanks for any feedback.
-- 
Andy
