RHEL5 and status-board not available bug?

Flyzone Micky flyzone at technologist.com
Tue Feb 10 08:35:24 CET 2009


Well... we think it's a big bug, where 'we' means me and Red Hat support.
Of course I'm speaking of Linux and not of the Solaris bug,
and my kernel parameters are OK.

I moved from RHEL 4.5 with kernel 2.6.9-55 to RHEL 5.3 with
kernel 2.6.18-128, with bonded (active-passive) gigabit Ethernet
and the Xymon data stored on NFS in a Veritas cluster.
The Xymon server handles 3000 hosts and about 17093 status messages.
The problem is the timeout: the Hobbit status page goes all green,
and the pages are sometimes slow to load or give a "Status not
available" error.

Speaking with Red Hat premium support, I sent them a trace of the
error (about 40MB gzipped...) and according to them the cause is a bug
in thread management: in RHEL5 it is no longer possible to use
the old POSIX implementation of threading, only
the Linux threading "version". Of course I have lost some of the
details... sorry, but I'm not a programmer.
They completely rule out a problem with the NFS share: the throughput
of Xymon is a stable 30KB/s or so, while network tests indicate a
capacity of 50-78MB/s. However, I had to modify the mount options
to avoid many setattr calls.

As a workaround I have modified the sendmessage call in the lib folder
so that the send is retried once after a timeout:
        /* Retry once after a timeout; note that usleep(5) pauses for only 5 microseconds */
        if (res == BB_ETIMEOUT) {
                usleep(5);
                res = sendtomany((recipient ? recipient : bbdisp), xgetenv("BBDISPLAYS"), msg, respfd, respstr, fullresponse, timeout);
        }
This of course increases the busy time, but the "all systems green"
problem has not come back.
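
For anyone who wants to experiment with the same idea, here is a
minimal standalone sketch of a bounded retry with a growing delay.
It is not Xymon code: send_with_retry, send_fn, sleep_ms, flaky_send
and the SEND_* codes are hypothetical names made up for illustration,
and the real call is the sendtomany() line shown above. Since usleep(5)
waits only 5 microseconds, the sketch takes its delay in milliseconds.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for nanosleep() */
#endif
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical result codes for this sketch only; Xymon defines its
 * own codes (such as BB_ETIMEOUT) in its own headers. */
enum { SEND_OK = 0, SEND_ETIMEOUT = 1, SEND_EOTHER = 2 };

/* Hypothetical callback type: performs one send attempt. */
typedef int (*send_fn)(void *ctx);

/* Sleep for the given number of milliseconds. */
void sleep_ms(long ms)
{
    struct timespec ts;
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}

/* Call send() once, then retry up to max_retries times while it keeps
 * timing out, doubling the delay before each retry. Any result other
 * than a timeout is returned immediately. */
int send_with_retry(send_fn send, void *ctx, int max_retries, long first_delay_ms)
{
    assert(send != NULL);

    long delay = first_delay_ms;
    int res = send(ctx);

    for (int attempt = 0; res == SEND_ETIMEOUT && attempt < max_retries; attempt++) {
        sleep_ms(delay);
        delay *= 2;
        res = send(ctx);
    }
    return res;
}

/* Demo sender: times out while the counter in ctx is positive,
 * decrementing it each time, then succeeds. */
int flaky_send(void *ctx)
{
    int *timeouts_left = ctx;
    if (*timeouts_left > 0) {
        (*timeouts_left)--;
        return SEND_ETIMEOUT;
    }
    return SEND_OK;
}
```

With a counter n = 2, send_with_retry(flaky_send, &n, 3, 5) would time
out twice and succeed on the second retry, after waiting 5 ms and then
10 ms. Capping the number of retries keeps the extra busy time bounded.
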
I'm running Xymon 4.2.0 with the allinone patch; Xymon 4.2.2
doesn't seem to have any changes related to this problem, but I'll
try it in the next few days.
Another issue: when shutting down Xymon I always need to clean
everything up with ipcrm, because the segments are still present.
Nothing more in the logs, just the status-board not available.
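
On the leftover segments: removing a System V shared memory segment
from a program is done with shmctl() and IPC_RMID, which is the
operation behind ipcrm -m. A minimal sketch, assuming the stale
segment id has already been read from ipcs -m output;
remove_shm_segment is a hypothetical helper name, not part of Xymon:

```c
#include <assert.h>
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Mark one System V shared memory segment for removal.
 * The kernel destroys it once no process is attached any more.
 * Returns 0 on success, -1 on error (errno is set by shmctl). */
int remove_shm_segment(int shmid)
{
    assert(shmid >= 0);   /* valid segment ids are non-negative */

    if (shmctl(shmid, IPC_RMID, NULL) == -1) {
        perror("shmctl(IPC_RMID)");
        return -1;
    }
    return 0;
}
```

Leftover semaphore arrays would need the analogous semctl() call with
IPC_RMID. The caller must own the segment or be root.
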

If someone has already run into this issue (it doesn't seem to appear
in past posts), please give me a tip...
Ah, here are my kernel parameters:

------ Shared Memory Limits --------
max number of segments = 8192
max seg size (kbytes) = 67108864
max total shared memory (kbytes) = 17179869184
min seg size (bytes) = 1

------ Semaphore Limits --------
max number of arrays = 128
max semaphores per array = 250
max semaphores system wide = 32000
max ops per semop call = 100
semaphore max value = 32767

------ Messages: Limits --------
max queues system wide = 16
max size of message (bytes) = 65536
default max size of queue (bytes) = 65536

Thanks in advance.
