
Re: [hobbit] RHEL5 and status-board not available bug?



On Tue, 10 Feb 2009 16:39:35 +0100, Henrik wrote:

>I'm not completely sure if you believe there is a bug in Xymon,
>or in the Linux kernel of your RHEL system ... 

I think it is in Hobbit. I have news about it, which I describe
further down.

>3000 hosts is a fairly large setup. I assume you're doing data
>collection for graphs for all of these servers, and that you're
>running version 4.2.x of Xymon.

Correct. I'll try 4.3 in the lab next week, now that I know how
the "bug" works.

>I would guess that your problems - at least in part - stem from 
>the amount of I/O you're doing for updating all of the RRD-files.

No, that is completely ruled out: I already tried disabling all the
ext tests. I also tried moving the data to local SCSI disks, and
iostat indicates a really low I/O wait.
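
For anyone who wants to cross-check the iostat figure, here is a
minimal sketch in Python of how an I/O wait percentage can be derived
from two samples of the first "cpu" line in /proc/stat on Linux. The
function name iowait_percent is made up for this example, and the
sample numbers are invented, not measurements from these servers.

```python
def iowait_percent(before, after):
    """Share of CPU time spent in iowait between two 'cpu' lines of /proc/stat."""
    # Field order after the "cpu" label:
    # user nice system idle iowait irq softirq steal [guest guest_nice]
    a = [int(x) for x in before.split()[1:]]
    b = [int(x) for x in after.split()[1:]]
    delta = [y - x for x, y in zip(a, b)]
    # Sum only user..steal: guest time is already counted inside user/nice.
    total = sum(delta[:8])
    return 100.0 * delta[4] / total if total else 0.0


# Invented sample lines, as if read a few seconds apart.
first = "cpu 100 0 50 800 50 0 0 0"
second = "cpu 200 0 100 1600 100 0 0 0"
print(iowait_percent(first, second))  # 5.0
```

On a real machine the two strings would be the first line of
/proc/stat read at two different moments.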

> If the problem persists, then some other explanation must be found.

It certainly must... it's a big problem to see 3000 hosts turning
purple, then green, then purple again :)

>I don't know how the change in "POSIX threading" plays into this.
>Hobbit is not a threaded application, it is plain and simple 
>single-task application all the way through. It may have some
>meaning in relation to NFS.

Oops... it is not multithreaded? I'm not a programmer, but... how can
it keep up with 3000 hosts sending data without multithreading?
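
As a general illustration of how one process can follow many hosts
without threads: a single loop can ask the kernel which sockets have
data ready and serve them one after another. The sketch below shows
that general technique in Python. It is not Hobbit's actual code, and
the function name collect_reports and the "one message per connection"
convention are assumptions made for this example.

```python
import selectors
import socket


def collect_reports(listener, expected):
    """Read one message from each of `expected` clients in a single thread."""
    sel = selectors.DefaultSelector()
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ)
    buffers = {}   # client socket -> bytes received so far
    reports = []   # completed messages
    while len(reports) < expected:
        # Wait until at least one socket is readable, then handle each in turn.
        for key, _events in sel.select():
            sock = key.fileobj
            if sock is listener:
                conn, _addr = listener.accept()
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ)
                buffers[conn] = b""
            else:
                data = sock.recv(4096)
                if data:
                    buffers[sock] += data
                else:
                    # Empty read: the client closed, so its message is complete.
                    sel.unregister(sock)
                    sock.close()
                    reports.append(buffers.pop(sock))
    sel.unregister(listener)
    sel.close()
    return reports
```

The caller passes in a socket that is already bound and listening; no
thread is created at any point, yet all pending clients are served.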

Anyway, here is the news: the problem occurs only on RHEL5 on the
x86_64 architecture, with all kinds of 2.6 kernels.
On RHEL5 on x86 (32-bit) the bug does not appear.
I would like to try Fedora on my notebook... I'll let you know.
For us the best solution is to reinstall everything in 32-bit, and I'm
already working on it (the first server is already up, and Hobbit
is now working correctly with just this "little" change).

The problem also exists in our Hobbit lab (again 64-bit) when
stressing Hobbit with more than 20 "virtual hosts".
Be sure of one thing: it is not a hardware or bottleneck-related
problem. We had a bottleneck before, on an old machine with really
high I/O wait, but with these two new servers we do not.

Is there anyone on the x86_64 architecture with similar problems?
And if someone has a Red Hat Developer support license, the RH
support team already told me that they can work on it.

Have a nice evening.
