[Xymon] how to learn what is crashing my rrd handler?
John Thurston
john.thurston at alaska.gov
Wed Aug 27 19:27:23 CEST 2014
On 8/26/2014 11:16 AM, J.C. Cleaver wrote:
> On Tue, August 26, 2014 11:04 am, John Thurston wrote:
>> I'm having difficulty with my RRD handlers crashing and leaving gaps in
>> my databases.
>>
>> I mentioned this back in April, 2014 but received no responses:
>> http://lists.xymon.com/pipermail/xymon/2014-April/039547.html
- snip -
>> I _suspect_ it is another client sending me empty messages, but how do I
>> find it now that I have several hundred clients sending "data" messages?
>> --
> As an initial step, run xymond_rrd in --debug mode... You can send the pid
> a -USR2 signal to toggle this setting without bouncing the process itself
> (be sure you're signalling xymond_rrd and not its xymond_channel parent).
Ahh. I figured out what I was doing wrong. I was placing --debug in the
wrong place on the line. When I put it at the end, I get debug output
from the 'data' handler rather than the parent.
When I do so, the log contains:
> 2014-08-27 09:02:07 Peer not up, flushing message queue
> 2014-08-27 09:02:08 Peer not up, flushing message queue
> 2014-08-27 09:02:09 Peer not up, flushing message queue
> 2014-08-27 09:02:10 Peer not up, flushing message queue
> 2014-08-27 09:02:10 Peer not up, flushing message queue
> 2014-08-27 09:02:16 Peer not up, flushing message queue
> 2014-08-27 09:02:17 Peer not up, flushing message queue
> 4595 2014-08-27 09:02:19 Opening file /opt/xymon/server/etc/rrddefinitions.cfg
> 4595 2014-08-27 09:02:19 Want msg 1, startpos 0, fillpos 0, endpos -1, usedbytes=0, bufleft=528383
> 4595 2014-08-27 09:02:19 Got 230 bytes
> 4595 2014-08-27 09:02:19 xymond_rrd: Got message 2103 @@data#2103/soapsgdc02.soa.alaska.gov|1409158937.159587|10.210.36.22||soapsgdc02.soa.alaska.gov|trends||ETS/MsgDir
> 4595 2014-08-27 09:02:19 startpos 230, fillpos 230, endpos -1
> 4595 2014-08-27 09:02:19 Transport setup is:
> 4595 2014-08-27 09:02:19 xymondportnumber = 1984
> 4595 2014-08-27 09:02:19 xymonproxyhost = NONE
> 4595 2014-08-27 09:02:19 xymonproxyport = 0
> 4595 2014-08-27 09:02:19 Recipient listed as '146.63.81.42'
> 4595 2014-08-27 09:02:19 Standard protocol on port 1984
> 4595 2014-08-27 09:02:19 Will connect to address 146.63.81.42 port 1984
> 4595 2014-08-27 09:02:19 Connect status is 0
> 4595 2014-08-27 09:02:19 Sent 16 bytes
> 4595 2014-08-27 09:02:19 Read 8192 bytes
> 4595 2014-08-27 09:02:19 Read 32767 bytes
> 4595 2014-08-27 09:02:19 Read 28440 bytes
> 4595 2014-08-27 09:02:19 Closing connection
> 4595 2014-08-27 09:02:19 Opening file /opt/xymon/server/etc/analysis.cfg
> 4595 2014-08-27 09:02:19 Opening file /opt/xymon/server/etc/protocols.cfg
> 4595 2014-08-27 09:02:19 Opening file /opt/xymon/server/etc/protocols.d/soaprotocols.cfg
> 4595 2014-08-27 09:02:19 Service list dump
> 4595 2014-08-27 09:02:19 Name : ftp
> 4595 2014-08-27 09:02:19 Sendtext: quit\r\n
> 4595 2014-08-27 09:02:19 Sendlen : 6
> 4595 2014-08-27 09:02:19 Exp.text: 220
> 4595 2014-08-27 09:02:19 Exp.len : 3
> 4595 2014-08-27 09:02:19 Exp.ofs : 0
> 4595 2014-08-27 09:02:19 Flags : 1
> 4595 2014-08-27 09:02:19 Port : 21
> 4595 2014-08-27 09:02:19 Name : ftps
> 4595 2014-08-27 09:02:19 Sendtext: quit\r\n
> 4595 2014-08-27 09:02:19 Sendlen : 6
> 4595 2014-08-27 09:02:19 Exp.text: 220
> 4595 2014-08-27 09:02:19 Exp.len : 3
> 4595 2014-08-27 09:02:19 Exp.ofs : 0
> 4595 2014-08-27 09:02:19 Flags : 5
> 4595 2014-08-27 09:02:19 Port : 990
> 4595 2014-08-27 09:02:19 Name : ssh
> 4595 2014-08-27 09:02:19 Sendtext: SSH-2.0-OpenSSH_4.1\r\n
> 4595 2014-08-27 09:02:19 Sendlen : 21
> 4595 2014-08-27 09:02:19 Exp.text: SSH
> 4595 2014-08-27 09:02:19 Exp.len : 3
> 4595 2014-08-27 09:02:19 Exp.ofs : 0
> 4595 2014-08-27 09:02:19 Flags : 1
> 4595 2014-08-27 09:02:19 Port : 22
> 4595 2014-08-27 09:02:19 Name : ssh1
> 4595 2014-08-27 09:02:19 Sendtext: SSH-2.0-OpenSSH_4.1\r\n
> 4595 2014-08-27 09:02:19 Sendlen : 21
> 4595 2014-08-27 09:02:19 Exp.text: SSH
> 4595 2014-08-27 09:02:19 Exp.len : 3
> 4595 2014-08-27 09:02:19 Exp.ofs : 0
> 4595 2014-08-27 09:02:19 Flags : 1
> 4595 2014-08-27 09:02:19 Port : 22
> 4595 2014-08-27 09:02:19 Name : ssh2
> 4595 2014-08-27 09:02:19 Sendtext: SSH-2.0-OpenSSH_4.1\r\n
> 4595 2014-08-27 09:02:19 Sendlen : 21
> 4595 2014-08-27 09:02:19 Exp.text: SSH
> 4595 2014-08-27 09:02:19 Exp.len : 3
> 4595 2014-08-27 09:02:19 Exp.ofs : 0
> 4595 2014-08-27 09:02:19 Flags : 1
> 4595 2014-08-27 09:02:19 Port : 22
> 4595 2014-08-27 09:02:19 Name : telnet
> 4595 2014-08-27 09:02:19 2014-08-27 09:02:21 Child process 4595 died: Signal 6
> 2014-08-27 09:02:24 Peer at 0.0.0.0:0 failed: Broken pipe
> 2014-08-27 09:02:24 Peer not up, flushing message queue
> 2014-08-27 09:02:24 Peer not up, flushing message queue
and the stack from the core file (using pstack)
> fee5ebd4 _lwp_kill (6, 0, 0, fee3e0f0, ffffffff, 6) + 8
> fedd29f0 abort (0, 1, 6666c, ffb04, feed5518, 0) + 110
> 0003bb94 sigsegv_handler (b, 0, ffbfb588, 1, 0, 544a8) + 30
> fee5b00c __sighndlr (b, 0, ffbfb588, 3bb64, 0, 1) + c
> fee4f6bc call_user_handler (b, 0, 0, 0, fed32a00, ffbfb588) + 3b8
> fee4f8a4 sigacthandler (b, 0, ffbfb588, 20, 0, 0) + 60
> --- called from signal handler with signal 11 (SIGSEGV) ---
> fedc2d50 strlen (53e37, ffbfc804, ffbfbdf9, 0, 0, 0) + 50
> fee319d4 vfprintf (71990, 53e28, ffbfc800, 0, a0afc, fee314d4) + ec
> 0002f24c dbgprintf (53e28, 0, e9768, 6f800, 71990, 6d000) + a0
> 0003347c dump_tcp_services (53e88, 53ea0, 53eb8, c0, a0, 67f98) + a0
> 00033d70 init_tcp_services (168a78, 620, 67f98, 54060, 600, 168430) + 848
> 0002f858 rrd_setup (98906, 6d000, 6d000, 80808080, 6d000, 0) + 164
> 0002fc4c find_xymon_rrd (988f4, 492e8, 53fe08c6, 53fe08c6, 988c2, 2e) + 4
> 00048cb0 main (98907, ffbfdba4, 988fc, 68800, 3, 49528) + 728
> 00015d2c _start (0, 0, 0, 0, 0, 0) + 5c
Which, if I'm reading it correctly, makes me think the application tried
to read off the end of a string.
--
Do things because you should, not just because you can.
John Thurston 907-465-8591
John.Thurston at alaska.gov
Enterprise Technology Services
Department of Administration
State of Alaska
More information about the Xymon
mailing list