[Xymon] Flushing Stale messages?
    cleaver at terabithia.org 
    Fri Mar 15 18:19:27 CET 2013
    
    
  
Yeah, that generally means your pipe has backed up too much.
"Rate of messages" is a good metric to keep track of (visible at 5m
intervals in the xymond status report). If you're getting 3000 messages
every 300 seconds, you've got 0.1s on average to process each incoming
message, and less than that during the expected spikes, which is when the
buffers run over.
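To make that arithmetic concrete, here is a minimal Python sketch. The
function name is just for illustration, and the numbers are the ones from
the example above, not measurements from your system.

```python
def per_message_budget(messages: int, interval_seconds: float) -> float:
    """Average seconds available per message if the consumer is to keep up."""
    if messages <= 0:
        raise ValueError("messages must be positive")
    return interval_seconds / messages


# 3000 messages per 300-second reporting interval, as in the example above.
budget = per_message_budget(3000, 300)
print(f"{budget:.3f} s per message on average")
```

This is only the average; if the parser takes longer than that per message
during a burst, the backlog grows until the burst is over.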
Depending on what you're doing, smoothing out how often you're getting
messages to reduce spikes will help, as will filtering at xymond_channel
if you're only interested in a subset, along with (obviously) trying to
make the message processor more efficient.
Eventually, it could lead to forking off the handling (if you can do it
efficiently and have cores to spare), or using an async queue somewhere.
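As a rough sketch of the async queue idea, in Python: one loop reads the
channel input as fast as it can and hands each line to a worker thread
through a bounded queue, so slow processing doesn't block the read side.
The names drain and handle are made up for this example, it works on single
lines rather than complete channel messages, and dropping on a full queue
is just one possible policy.

```python
import queue
import threading

_STOP = object()  # sentinel telling the worker thread to exit


def drain(stream, handle, maxsize=10000):
    """Read lines from stream and pass them to handle() on a worker thread.

    Returns a (handled, dropped) tuple of counts.
    """
    q = queue.Queue(maxsize=maxsize)
    handled = 0

    def worker():
        nonlocal handled
        while True:
            item = q.get()
            if item is _STOP:
                break
            handle(item)
            handled += 1

    t = threading.Thread(target=worker, daemon=True)
    t.start()

    dropped = 0
    for line in stream:
        try:
            # Never block the reader: if the queue is full, count a drop.
            q.put_nowait(line.rstrip("\n"))
        except queue.Full:
            dropped += 1

    q.put(_STOP)  # blocks until there is room, then lets the worker finish
    t.join()
    return handled, dropped
```

Called as drain(sys.stdin, my_handler), the dropped count at least tells
you how far behind the handler is, instead of leaving that to the stale
message log.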
On the second part, that's interesting... Can you provide a sample msg
with a null?
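In the meantime, the parser can guard against an empty timestamp rather
than crash on it. A minimal Python sketch, assuming the first line of each
message is pipe-delimited with the timestamp in the second field (that
layout is my assumption here, so check it against what your parser actually
receives; parse_timestamp is a made-up helper name):

```python
from typing import Optional


def parse_timestamp(header: str) -> Optional[float]:
    """Return the timestamp from a pipe-delimited header, or None if unusable."""
    fields = header.split("|")
    if len(fields) < 2:
        return None  # no timestamp field at all
    raw = fields[1].strip()
    if not raw:
        return None  # empty ("null") timestamp
    try:
        return float(raw)
    except ValueError:
        return None  # present but not numeric
```

Logging the raw header whenever this returns None would also give you the
sample message to post.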
Regards,
-jc
--- Original Message ---
I'll answer that myself: yes, that means whatever is there can't process
the channel fast enough.
So, I'll have to go back to my older parser, which is getting this:
Core was generated by `xymond_mysql
--pidfile=/var/log/xymon/xymond_history.pid'.
Program terminated with signal 11, Segmentation fault.
#0  0x08049de1 in addnetpeer (peername=0x4f8ca0 "") at xymond_channel.c:140
140	xymond_channel.c: No such file or directory.
in xymond_channel.c
(gdb) where
#0  0x08049de1 in addnetpeer (peername=0x4f8ca0 "") at xymond_channel.c:140
#1  0x00511e9c in ?? ()
#2  0x004f8ca0 in ?? () from /lib/ld-linux.so.2
#3  0x08057190 in stackfgets (buffer=0x80497b0, extraincl=0x2 <Address 0x2
out of bounds>) at stackio.c:434
#4  0x080496c1 in _start ()
Which is getting a null timestamp for some items on stachg channel :/
From: Sean Clark <sean.clark at twcable.com>
Date: Friday, March 15, 2013 11:21 AM
To: "xymon at xymon.com" <xymon at xymon.com>
Subject: [Xymon] Flushing Stale messages?
I have a channel parser that looks at items in the 'stachg' channel.
It looks like it's working for me (it parses and does stuff properly).
However, my log is filling up with this:
2013-03-15 11:08:29 Flushed 4 stale messages for 0.0.0.0:0
2013-03-15 11:08:30 Flushed 4 stale messages for 0.0.0.0:0
2013-03-15 11:08:31 Flushed 3 stale messages for 0.0.0.0:0
2013-03-15 11:08:32 Flushed 6 stale messages for 0.0.0.0:0
2013-03-15 11:08:33 Flushed 2 stale messages for 0.0.0.0:0
2013-03-15 11:08:34 Flushed 2 stale messages for 0.0.0.0:0
2013-03-15 11:08:35 Flushed 3 stale messages for 0.0.0.0:0
2013-03-15 11:08:36 Flushed 3 stale messages for 0.0.0.0:0
2013-03-15 11:08:37 Flushed 4 stale messages for 0.0.0.0:0
2013-03-15 11:08:38 Flushed 4 stale messages for 0.0.0.0:0
Is this telling me my parser cannot handle the channel in a timely manner,
and the messages are growing "stale" and I am dropping things?
-Sean
    
    