[Xymon] Scaling
Clark, Sean
sean.clark at twcable.com
Tue Apr 16 17:04:15 CEST 2013
[Sorry to respond so late, I am catching up on emails]
I monitor about 43,000 devices split across 8 instances.
It runs on ancient hardware: Sun X4200s with 2 CPUs and 8 GB of RAM each.
I moved the RRDs to a different host, and xymongen and the histfiles are
also handled outside of stock Xymon.
The only issue I have run into (which I suspect will be fixed by beefier
hardware) is that once I get to around 5,000 hosts, if Xymon crashes, the
IPC/shared memory segments are not cleaned up right away and it goes into
a continual restart loop. Henrik posted a restart procedure to the list
earlier that kills the leftover processes and removes those segments, so I
haven't had issues since (I am still tracking down what causes the crash).
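For what it's worth, a minimal sketch of that kind of cleanup, assuming the
stale objects are System V shared memory segments and semaphores owned by a
"xymon" user (the exact procedure Henrik posted may differ):

    # make sure no leftover Xymon processes are still holding the segments
    pkill -u xymon
    # list shared memory segments and semaphores that remain
    ipcs -m ; ipcs -s
    # remove each stale object by the id shown in the ipcs output
    ipcrm -m <shmid>
    ipcrm -s <semid>
    # then restart Xymon the way your installation normally starts it
    # (init script or xymonlaunch)
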
On 4/11/13 4:23 PM, "Olivier AUDRY" <olivier at audry.fr> wrote:
>Great, many thanks for your time. I will check this.
>
>> but there are only so many hours in the
>> day and there's other low-hanging fruit at the moment :)
>
>so true :)
>
>On Thursday, 11 April 2013 at 20:12 +0000, cleaver at terabithia.org wrote:
>> > On Thursday, 11 April 2013 at 20:40 +0200, Olivier AUDRY wrote:
>>
>> > hello
>> >
>> > As I understand it, I should run Xymon on a single node to improve
>> > memory access latency. Right?
>> >
>> --snip--
>> >> numactl --hardware
>> >> available: 2 nodes (0-1)
>> >> node 0 size: 12097 MB
>> >> node 0 free: 594 MB
>> >> node 1 size: 12120 MB
>> >> node 1 free: 12 MB
>> >> node distances:
>> >> node   0   1
>> >>   0:  10  20
>> >>
>> >>
>> >> Even so, I've got 24 CPUs, with multiple cores and hyperthreading. Is
>> >> that correct?
>>
>> That seems odd; almost like hyperthreading is disabled? You should see
>> "node 0 cpus: ..." above each size. I'm running RHEL 6.4; it's possible
>> things have changed in that output over time if you're on a different
>> system.
>>
>>
>> >>
>> >> As I can see, my two nodes are full. Not good at all, I guess.
>> >>
>> >> My policy is the default one. Perhaps you can advise a specific
>> >> policy for a Xymon setup?
>> >>
>> >> numactl --show
>> >> policy: default
>> >> preferred node: current
>> >> physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
>> >> cpubind: 0 1
>> >> nodebind: 0 1
>> >> membind: 0 1
>>
>> Generally speaking, yeah, use numactl in front of xymonlaunch to ensure
>> the entire process tree gets assigned to a single node. But it really
>> depends on your workload (can everything fit in that node?) and what
>> else is going on on the box. If you have something which analyzes xymond
>> data in a large dump, then does heavy munging on it and sends it back,
>> it might be better to have that on a different node than (say) the
>> xymond_* worker modules.
>>
>> 'numastat -s -z -p xymon' is your friend
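(For illustration, a rough sketch of that kind of node pinning and the
follow-up check; the node number and the "xymon" process name pattern are
assumptions about a typical setup, not something taken from this thread:)

    # start the whole Xymon process tree with CPUs and memory
    # restricted to NUMA node 0 (node 0 is just an example)
    numactl --cpunodebind=0 --membind=0 xymonlaunch

    # later, see how memory of the xymon processes is spread across nodes
    numastat -s -z -p xymon
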
>>
>> The RH Performance Tuning and Resource Management guides are definitely
>> useful reading as well. I'm sure there's plenty of cgroup stuff that
>> could be helpful if/when the time came, but there are only so many hours
>> in the day and there's other low-hanging fruit at the moment :)
>>
>> I'd definitely start with running the 'numad' service and seeing what it
>> does over time; it really could be all that you need.
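(Again only a sketch: on RHEL 6, which is mentioned above, numad ships as a
regular SysV service, so enabling it is roughly:)

    # install and enable the automatic NUMA affinity daemon (RHEL 6 style)
    yum install numad
    service numad start
    chkconfig numad on
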
>>
>> HTH,
>>
>> -jc
>>
>>
>