[Xymon] Scaling

Olivier AUDRY olivier at audry.fr
Thu Apr 11 19:29:37 CEST 2013


Hello,

I'm impressed by your 2k incoming messages per second. We only get 600,
and we have a lot of gaps in our trends.

I suspect the added latency comes either from xymonproxy or from our
huge, legacy extra-rrd.pl.

We don't see any significant load or iowait.

I'm not sure it could be a network issue. So if you have any ideas :)

oau

On Thursday, 11 April 2013 at 17:18 +0000, cleaver at terabithia.org wrote:
> > On Wed, Apr 10, 2013 at 5:51 PM, White, Bruce <bewhite at fellowes.com>
> > wrote:
> >
> >> Over 1000 devices monitored here and only real issue is rrd keeping
> >> up. I have been told an ssd for the rrd files will solve this issue.
> >>
> >
> >
> > ~2000 hosts and that will double or triple in the next few weeks. I really
> > don't see any IO issues in the slightest.
> > 6 x 15k RPM SCSI drives in Raid 5 on a Dell PowerEdge 2950 with 8 gigs of
> > ram and the thing is snoring (LA: 0.25)
> >
> > Regards,
> > Cami
> 
> 
> We're currently processing ~2K incoming messages a second on a single
> xymond instance. This is a pretty beefy box, but it's also handling lots
> of other concurrent monitoring tasks that we're slowly moving over to
> xymon... including a non-fping-enabled Icinga install >.<
> 
> ]# xymon localhost "xymondboard test=info fields=hostname" | wc -l
> 42459
> 
> (Not all of those are full hosts; some are application nodes with statuses
> being generated server-side out of client-side jvm stats or the like.)
> 
> 
> At these levels it's important to ensure you're using whatever NUMA
> capabilities your system has properly, since message passing is basically
> just shoveling incoming TCP data around within memory. Also, you might
> want to tweak net.ipv4.ip_local_port_range and enable
> net.ipv4.tcp_tw_reuse and/or net.ipv4.tcp_tw_recycle on Linux to eke more
> simultaneous testing out of xymonnet.
> 
> 
> One of the beauties of Xymon's architecture is the ability to cleanly
> disconnect the components... Xymongen can run on some other box,
> xymond_locator can be used to send rrd data off somewhere if IO becomes an
> issue, xymonnet pollers can be distributed, and xymonproxy can be used as
> needed to aggregate and smooth out incoming status reports, etc.
> 
> There are lots of different mechanisms for "scaling" efficiently depending
> on your particular needs, but I'd bet that on decently modern server
> hardware you'll probably want to scale for HA purposes long before you
> actually /need/ the additional power.
> 
> 
> HTH,
> 
> -jc
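
On the net.ipv4.ip_local_port_range suggestion quoted above, here is a rough sketch of the arithmetic behind it. It assumes the worst case where every outgoing connection goes to the same destination address and port, and that Linux keeps a closed socket in TIME_WAIT for 60 seconds. The function names and the two example ranges are only illustrative, not recommended values. Note also that net.ipv4.tcp_tw_recycle was removed in Linux 4.12 and is known to cause trouble for clients behind NAT, so on current kernels only tcp_tw_reuse remains an option.

```python
# Rough estimate only: how many new outgoing connections per second one
# source IP can sustain toward a single destination IP:port before the
# ephemeral port range is used up by sockets lingering in TIME_WAIT.

TIME_WAIT_SECONDS = 60  # assumption: Linux TIME_WAIT duration of 60 s


def parse_port_range(text):
    """Parse the two-number format of net.ipv4.ip_local_port_range."""
    low, high = (int(field) for field in text.split())
    if not 0 < low <= high <= 65535:
        raise ValueError(f"invalid port range: {text!r}")
    return low, high


def max_connect_rate(low, high, time_wait=TIME_WAIT_SECONDS):
    """Connections/second sustainable without reusing TIME_WAIT ports."""
    ports = high - low + 1
    return ports / time_wait


if __name__ == "__main__":
    # Example values, not recommendations: a common default and a wider range.
    for setting in ("32768 60999", "1024 65535"):
        low, high = parse_port_range(setting)
        print(f"{setting}: {high - low + 1} ports, "
              f"~{max_connect_rate(low, high):.0f} connections/s")
```

The current setting can be read with `sysctl net.ipv4.ip_local_port_range` and its output passed to `parse_port_range`. If the estimate is well above the rate at which the poller actually opens connections, the port range is probably not the bottleneck.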



