[Xymon] XyMon System Requirements
Henrik Størner
henrik at hswn.dk
Thu Dec 22 09:23:03 CET 2011
Hi Matthew,
On 22-12-2011 00:27, Matthew Neumark wrote:
> Currently I'm managing a XyMon server which consists of around 5,000
> devices. We are looking to keep continually adding more and more devices
> to it as time goes on. The issue is our system is currently always using
> max system resources we keep allocating to the server. BTW devmon seems
> to be the highest system resource hog.
> Stats:
> About 5,000 devices
> XyMon 4.3.0-0-beta2
> DevMon 0.2
> 4 CPU(s)
> 16 GB RAM
> Suse Linux Enterprise 10 (32-Bit)
> 300GB Enterprise SAN Storage - Fiber Channel - (3 Years of archived data
> stored)
> Would it do me any good to give the system more resources? CPU(s) or RAM?
> What is the experience that other users have with monitoring this many
> devices?
> What system configurations are you using to support this many monitored
> hosts?
Your installation is about the same size as the one I have at work. I
recently upgraded it because it could no longer keep up with the load,
and based on that I would say that your hardware specs should be more
than adequate to handle the number of devices.
The only real difference between your system and mine is that I changed
to SSD disks for storing the RRD-files (graphs) - I don't know how your
FC disks compare with SSDs, but I could certainly see a significant
effect of that change; when stopping Xymon it used to take 15-20 minutes
for xymond_rrd to flush all of the cached RRD updates to disk, but after
changing to the SSD disks it only takes a few seconds. The interesting
thing of course is how long they will last, since the number of write
operations is limited on these devices; I plan to replace them once a
year to be on the safe side.
Have a look at your vmstat1 graph for the Xymon server (on the "trends"
status page), and see how much time is being spent in I/O wait state -
if that is in the 20-25% range, then you probably have a problem with
I/O bandwidth, and adding an SSD disk could help. (I say 20-25% because
as far as I know, Linux sends all I/O operations through one CPU, so if
you have 4 CPUs and one of them is fully busy doing I/O, then it will
show up in vmstat as I/O wait taking up 1/4 of the time.)
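If the vmstat1 graph is not at hand, roughly the same number can be read straight from the kernel. The following is only a sketch, assuming a Linux host where /proc/stat is readable; the function names are made up for illustration and are not part of Xymon. It samples the aggregate "cpu" line twice and reports the share of time spent in I/O wait, together with the "one CPU fully busy" threshold described above (100 divided by the number of CPUs, so 25% on a 4-CPU box).

```python
import time


def cpu_times(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat-style text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(v) for v in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Percentage of elapsed CPU time spent in I/O wait between two samples."""
    # Field order: user nice system idle iowait irq softirq steal [guest ...]
    # Only the first eight are summed; guest time is already counted in user.
    deltas = [b - a for a, b in zip(before, after)][:8]
    total = sum(deltas)
    return 100.0 * deltas[4] / total if total else 0.0


def saturation_threshold(ncpus):
    """I/O wait share that corresponds to one CPU being fully busy with I/O."""
    return 100.0 / ncpus


def sample_iowait(interval=5.0, path="/proc/stat"):
    """Take two samples 'interval' seconds apart and return the I/O wait percentage."""
    with open(path) as f:
        before = cpu_times(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = cpu_times(f.read())
    return iowait_percent(before, after)
```

Calling sample_iowait() on the Xymon server and comparing the result with saturation_threshold(4) gives a quick spot check; the graph on the "trends" page is still the better guide, since it shows the trend over time rather than a single interval.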
Is there any swap being used? (Check the "free" output). I wouldn't
expect that there is much swap going on with 16 GB of RAM. So more RAM
probably will not help.
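For a scripted version of the same check, the swap figures that "free" prints come from /proc/meminfo on Linux. A minimal sketch, assuming the usual SwapTotal and SwapFree entries are present; the function name is hypothetical:

```python
def swap_used_kib(meminfo_text):
    """Return swap in use (KiB), computed from /proc/meminfo-style text."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            # The first token after the colon is the numeric value.
            values[key.strip()] = int(parts[0])
    return values["SwapTotal"] - values["SwapFree"]


def read_swap_used_kib(path="/proc/meminfo"):
    """Read the live figure from the running system."""
    with open(path) as f:
        return swap_used_kib(f.read())
```

A result at or near zero means the box is not swapping, in which case adding RAM is unlikely to change much.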
I've never used Devmon, so I don't know how much of a "hog" it is. If it
really is the one using all of the resources, a solution (or
workaround, really) might be to split the Devmon load between more
servers - you can still have them report their data to the same Xymon
server, you will only move the running of Devmon to a different node.
Just for the record, my current system is an HP DL380 G7, 2 dual-core
2.4 GHz CPUs, 24 GB RAM, 6x300 GB SAS 10K disks in a RAID-1
configuration, and 2 64 GB SSD disks in RAID-1. It is currently handling
about 4200 servers with clients installed, and an additional 3000
entries in hosts.cfg for network devices, websites etc. All in all 50000
statuses being tracked. On average, the CPU load is 6% busy. To be fair,
I must add that I have most of the network tests running on another node
(for ease of firewall setup, mostly) and that node is 15% busy.
Regards,
Henrik