[Xymon] Bug? Dropping host doesn't remove hostdata

John Horne john.horne at plymouth.ac.uk
Mon Jul 22 19:57:14 CEST 2013


On Mon, 2013-07-22 at 12:21 +0100, John Horne wrote:
>
> I've started xymond_hostdata with the '--debug' option, but am not
> seeing the '@@drophost' command being received. The main xymond log sees
> the actual 'drop jhvm2' command though ('jhvm2' is the test host name).
> So I am currently testing the above bit of code to see that it is
> actually being reached.
> 
Well, I'm a bit stumped by this.

I have added several dbgprintf statements (each beginning with 'JH:') to
both xymond.c and xymond_hostdata.c. I also modified tasks.cfg so that
xymond and xymond_hostdata are started with the '--debug' option.

The xymond log shows the command being received and posted to the xymon
channels. However, hostdata.log does not show it being received.

From xymond.log:

====================================
29942 2013-07-22 18:32:59 -> do_message/1 (10 bytes): drop jhvm2
29942 2013-07-22 18:32:59 -> update_statistics
29942 2013-07-22 18:32:59 <- update_statistics
29942 2013-07-22 18:32:59 -> oksender
29942 2013-07-22 18:32:59 <- oksender(1-a)
29942 2013-07-22 18:32:59 -> handle_dropnrename
29942 2013-07-22 18:32:59 JH: In handle_dropnrename: host is jhvm2
29942 2013-07-22 18:32:59 JH: About to call posttochannel: statuschn
29942 2013-07-22 18:32:59 -> posttochannel
29942 2013-07-22 18:32:59 JH: In posttochannel: readymsg
29942 2013-07-22 18:32:59 JH: In posttochannel: command is: @@drophost#195/*|1374514379.136454|141.163.66.133|jhvm2
29942 2013-07-22 18:32:59 Posting message 195 to 1 readers
29942 2013-07-22 18:32:59 <- posttochannel
29942 2013-07-22 18:32:59 JH: About to call posttochannel: stachgchn
29942 2013-07-22 18:32:59 -> posttochannel
29942 2013-07-22 18:32:59 JH: In posttochannel: readymsg
29942 2013-07-22 18:32:59 JH: In posttochannel: command is: @@drophost#195/*|1374514379.136550|141.163.66.133|jhvm2
29942 2013-07-22 18:32:59 Posting message 195 to 1 readers
29942 2013-07-22 18:32:59 <- posttochannel
29942 2013-07-22 18:32:59 JH: About to call posttochannel: pagechn
29942 2013-07-22 18:32:59 -> posttochannel
29942 2013-07-22 18:32:59 JH: In posttochannel: readymsg
29942 2013-07-22 18:32:59 JH: In posttochannel: command is: @@drophost#1/*|1374514379.136624|141.163.66.133|jhvm2
29942 2013-07-22 18:32:59 Posting message 1 to 1 readers
29942 2013-07-22 18:32:59 <- posttochannel
29942 2013-07-22 18:32:59 JH: About to call posttochannel: datachn
29942 2013-07-22 18:32:59 -> posttochannel
29942 2013-07-22 18:32:59 JH: In posttochannel: readymsg
29942 2013-07-22 18:32:59 JH: In posttochannel: command is: @@drophost#22/*|1374514379.136739|141.163.66.133|jhvm2
29942 2013-07-22 18:32:59 Posting message 22 to 1 readers
29942 2013-07-22 18:32:59 <- posttochannel
29942 2013-07-22 18:32:59 JH: About to call posttochannel: noteschn
29942 2013-07-22 18:32:59 -> posttochannel
29942 2013-07-22 18:32:59 Dropping message - no readers
29942 2013-07-22 18:32:59 JH: About to call posttochannel: enadischn
29942 2013-07-22 18:32:59 -> posttochannel
29942 2013-07-22 18:32:59 Dropping message - no readers
29942 2013-07-22 18:32:59 JH: About to call posttochannel: clientchn
29942 2013-07-22 18:32:59 -> posttochannel
29942 2013-07-22 18:32:59 JH: In posttochannel: readymsg
29942 2013-07-22 18:32:59 JH: In posttochannel: command is: @@drophost#4/*|1374514379.136890|141.163.66.133|jhvm2
29942 2013-07-22 18:32:59 Posting message 4 to 1 readers
29942 2013-07-22 18:32:59 <- posttochannel
====================================

Basically this shows 'handle_dropnrename' being called with the host
name 'jhvm2'. It then calls 'posttochannel' for each channel in turn.
If a channel has no readers the message is dropped; otherwise the
'@@drophost... jhvm2' command is posted to it.
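
For anyone wanting to check a longer log the same way, here is a rough
Python sketch that summarises which channels had the message posted and
which dropped it. It relies on the 'JH: About to call posttochannel'
lines, which come from my own added dbgprintf statements and are not in
a stock xymond, so treat the line formats as assumptions taken from the
excerpt above. The function name 'summarise_posts' is just illustrative.

```python
import re

def summarise_posts(log_text):
    """Map each channel name to 'posted' or 'dropped'.

    Assumes the locally added 'JH: About to call posttochannel: <chn>'
    debug line precedes the outcome line for that channel.
    """
    result = {}
    current = None
    for line in log_text.splitlines():
        m = re.search(r"About to call posttochannel: (\w+)", line)
        if m:
            current = m.group(1)
            continue
        if current is None:
            continue
        if "Dropping message - no readers" in line:
            result[current] = "dropped"
        elif re.search(r"Posting message \d+ to \d+ readers", line):
            result[current] = "posted"
    return result

# Two channels copied from the xymond.log excerpt
sample = """\
29942 2013-07-22 18:32:59 JH: About to call posttochannel: datachn
29942 2013-07-22 18:32:59 -> posttochannel
29942 2013-07-22 18:32:59 Posting message 22 to 1 readers
29942 2013-07-22 18:32:59 <- posttochannel
29942 2013-07-22 18:32:59 JH: About to call posttochannel: noteschn
29942 2013-07-22 18:32:59 -> posttochannel
29942 2013-07-22 18:32:59 Dropping message - no readers
"""

print(summarise_posts(sample))
```

Run against the full excerpt, this should report the message as posted
on every channel except noteschn and enadischn.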


From hostdata.log:

====================================
21100 2013-07-22 18:07:59 JH: xymond_hostdata starting: clientlogdir is: /home/xymon/data/hostdata
21100 2013-07-22 18:07:59 Want msg 1, startpos 0, fillpos 0, endpos -1, usedbytes=0, bufleft=2101247
21100 2013-07-22 18:07:59 Got 44 bytes
21100 2013-07-22 18:07:59 xymond_hostdata: Got message 1 @@shutdown#1/*|1374512879.482598|xymond|
21100 2013-07-22 18:07:59 startpos 44, fillpos 44, endpos -1
2013-07-22 18:31:16 Peer not up, flushing message queue
====================================



So the command is accepted by xymond and sent on, but not received by
xymond_hostdata.
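
The receiving side can be checked in a similar way. The sketch below
lists the message types a worker logged as received, using the
'Got message N @@type#' pattern seen in hostdata.log. The pattern is
inferred from that single line (with the mail-wrapped line rejoined),
so it is an assumption rather than a documented format, and
'received_types' is a made-up helper name.

```python
import re

def received_types(log_text):
    """Return (sequence number, message type) for each 'Got message' entry."""
    pairs = re.findall(r"Got message (\d+) @@(\w+)#", log_text)
    return [(int(seq), msgtype) for seq, msgtype in pairs]

# Lines copied from the hostdata.log excerpt, wrapped line rejoined
hostdata_log = """\
21100 2013-07-22 18:07:59 Got 44 bytes
21100 2013-07-22 18:07:59 xymond_hostdata: Got message 1 @@shutdown#1/*|1374512879.482598|xymond|
21100 2013-07-22 18:07:59 startpos 44, fillpos 44, endpos -1
"""

msgs = received_types(hostdata_log)
saw_drophost = any(msgtype == "drophost" for _, msgtype in msgs)
print(msgs, saw_drophost)
```

For the excerpt above only the '@@shutdown' message turns up, and no
'@@drophost' at all.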

Unfortunately (?) this IPC is controlled by semaphores, so finding out
why xymond_hostdata does not pick up the message may be difficult.





John.

-- 
John Horne, Plymouth University, UK
Tel: +44 (0)1752 587287    Fax: +44 (0)1752 587001



