
Re: [hobbit] Xymon-4.3.0-beta1: hobbit_rrd data msgs truncated



Dominique Frise wrote:
Dominique Frise wrote:
Hi,

We track "surgemail" processes using the following rule in hobbit-clients.cfg:

HOST=xyz
    PROC ./surgemail min=0 TRACK=surgemail

The ps listing in msg.xyz.txt reports 315 "./surgemail" processes, while the rrd graph only shows ~30 processes.

Here is the latest corresponding dataset from the processes.surgemail.rrd file (dumped after flushing the cache by stopping Xymon):

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE rrd SYSTEM "http://oss.oetiker.ch/rrdtool/rrdtool.dtd">
<!-- Round Robin Database Dump --><rrd> <version> 0003 </version>
        <step> 300 </step> <!-- Seconds -->
<lastupdate> 1239775972 </lastupdate> <!-- 2009-04-15 08:12:52 CEST -->

        <ds>
                <name> count </name>
                <type> GAUGE </type>
                <minimal_heartbeat> 600 </minimal_heartbeat>
                <min> 0.0000000000e+00 </min>
                <max> NaN </max>

                <!-- PDP Status -->
                <last_ds> 30 </last_ds>
                <value> 5.1600000000e+03 </value>
                <unknown_sec> 0 </unknown_sec>
        </ds>

<!-- Round Robin Archives -->   <rra>

We tried letting Xymon recreate a fresh rrd, without success.
The same configuration worked with Hobbit-4.2.0 and RRDtool 1.2.19 (the same RRDtool version we use now).

The RRD code has changed quite a bit since 4.2.0, and I can't really tell which part of it is involved, so I don't know where to start debugging.
Any help appreciated!

Dominique


This is a more general problem.
The data messages passed to hobbitd_rrd are truncated.

Debugging showed that the messages leave hobbitd intact but are read incorrectly by hobbitd_channel.

Below is the debug output of hobbitd and hobbitd_channel, with extra printf lines added to dump the messages.

------ hobbitd.log --------
2009-04-17 16:22:21 <- do_message/1
2009-04-17 16:22:21 -> do_message/1 (86 bytes): data blind.ifstat
2009-04-17 16:22:21 -> update_statistics
2009-04-17 16:22:21 <- update_statistics
2009-04-17 16:22:21 -> oksender
2009-04-17 16:22:21 <- oksender(1-a)
2009-04-17 16:22:21 ->handle_data
2009-04-17 16:22:21 -> posttochannel
2009-04-17 16:22:21 Posting message 2 to 1 readers
2009-04-17 16:22:21 <- posttochannel
2009-04-17 16:22:21 <-handle_data
2009-04-17 16:22:21 msg: data blind.ifstat
solaris
bge:0:bge0:obytes64     267829127
bge:0:bge0:rbytes64     1208836563
2009-04-17 16:22:21 <- do_message/1
2009-04-17 16:22:21 -> do_message/1 (104 bytes): data blind.vmstat
2009-04-17 16:22:21 -> update_statistics
2009-04-17 16:22:21 <- update_statistics
2009-04-17 16:22:21 -> oksender
2009-04-17 16:22:21 <- oksender(1-a)
2009-04-17 16:22:21 ->handle_data
2009-04-17 16:22:21 -> posttochannel
2009-04-17 16:22:21 Posting message 3 to 1 readers
2009-04-17 16:22:21 <- posttochannel
2009-04-17 16:22:21 <-handle_data
2009-04-17 16:22:21 msg: data blind.vmstat
solaris
0 0 0 11938312 10700752 3 19 0 0 0 0 0 2 2 2 0 343 2099 1006 1 2 97
2009-04-17 16:22:21 <- do_message/1
2009-04-17 16:22:21 -> do_message/1 (1315 bytes): data blind.iostatdisk


------- rrd-data.log --------
2009-04-17 16:22:21 Peer not up, flushing message queue
2009-04-17 16:22:21 Connecting to peer 0.0.0.0:0
2009-04-17 16:22:21 Peer is UP
2009-04-17 16:22:21 inbuf: @@data#2/blind|1239978141.731166|130.223.27.23||blind|ifstat|sunos|intraDevServ,adminSys
data blind.ifstat
solaris
bge:0:bge0:obytes64     267829127
bge:0:bge0:rbytes64     12088365
@@

2009-04-17 16:22:21 inbuf: @@data#3/blind|1239978141.731938|130.223.27.23||blind|vmstat|sunos|intraDevServ,adminSys
data blind.vmstat
solaris
 0 0 0 11938312 10700752 3 19 0 0  0  0  0  2  2  2  0  343 2099 1006  1  2
@@


The last values of the ifstat and vmstat messages (1208836563 and 97) become 12088365 and NULL respectively: in both cases the final two characters are missing.
I hope Henrik can help us solve this issue.

Dominique




I finally found the issue in hobbitd.c: the buffer for the channel message is allocated two bytes too short.
The patch (hobbitd.patch) is attached.

Installation
------------
Place the patch in the top-level Xymon directory and apply it with:
# patch -p0 < hobbitd.patch
# gmake
Then copy the rebuilt hobbitd binary to the bin directory of your installation.


Dominique
--- hobbitd/hobbitd.c.dist	Mon Apr 20 15:51:44 2009
+++ hobbitd/hobbitd.c	Mon Apr 20 19:25:06 2009
@@ -1312,7 +1312,7 @@
 	if (msg) buflen += strlen(msg); else dbgprintf("  msg is NULL\n");
 	if (classname) buflen += strlen(classname);
 	if (pagepath) buflen += strlen(pagepath);
-	buflen += 4;
+	buflen += 6;
 
 	chnbuf = (char *)malloc(buflen);
 	snprintf(chnbuf, buflen, "%s|%s|%s|%s|%s\n%s",