[Xymon] Xymon client only reports once

Mark mark at carnildo.com
Wed Feb 15 08:16:33 CET 2012


On Tuesday 14 February 2012 11:59:13 am you wrote:
> On Mon, February 13, 2012 8:13 pm, Mark wrote:
> > I've installed Xymon on my home network as a test for a possible
> > installation at work.  It's working fine on three out of the four
> > systems, but on the fourth, the client only reports its status to the
> > server once, immediately after being started.
> >
> > The problem computer is a Pentium MMX with 48MB RAM, running Gentoo
> > Linux.
> >
> > It looks as if the client is getting hung in the process of sending the
> > second report: "ps aux" shows a sleeping "xymonlaunch" process, and the
> > XYMONTMP directory contains a "xymon_vmstat" file with a timestamp five
> > minutes after the successful update.
> >
> > I could probably work around this with a cron job to restart the client
> > every five minutes, but I'd rather fix it properly.  Any suggestions on
> > what might be going wrong, or other things I could look at?
> >
> > Thanks,
> > Mark Wagner
>
> The vmstat file there sounds normal... Can you run xymonlaunch with
> --debug and see what it's reporting back? Also, strace what it's doing
> when the next expected run occurs?
>
> For testing purposes you can bring the interval down to 30s or so. The
> only change you should notice is having multiple backgrounded vmstat
> processes going at once in a round-robin fashion.

When I run xymonlaunch from the command line with the "--no-daemon" and
"--debug" options, there's no output to the terminal.

clientlaunch.log:
2012-02-14 21:48:23 xymonlaunch starting
2012-02-14 21:48:23 Loading tasklist configuration from ./etc/clientlaunch.cfg
15337 2012-02-14 21:48:23 Opening file ./etc/clientlaunch.cfg
15337 2012-02-14 21:48:23 
15337 2012-02-14 21:48:23 Starting tasklist scan
15337 2012-02-14 21:48:23 About to start task client
15338 2012-02-14 21:48:23 client -> Loading environment from /home/xymon/client/etc/xymonclient.cfg area
15338 2012-02-14 21:48:23 Opening file /home/xymon/client/etc/xymonclient.cfg
15338 2012-02-14 21:48:23 client -> Assigning stdout/stderr to log '/home/xymon/client/logs/xymonclient.log'
15337 2012-02-14 21:48:28 
15337 2012-02-14 21:48:28 Starting tasklist scan
15337 2012-02-14 21:48:28 Task client active with PID 15338
15337 2012-02-14 21:48:32 
15337 2012-02-14 21:48:32 Starting tasklist scan

The last two lines then repeat every five seconds until I kill the client.
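
To check the scan timing without eyeballing the log, something like the following Python sketch could be used.  It assumes the line layout seen in the excerpt above (an optional PID, a "YYYY-MM-DD HH:MM:SS" timestamp, then the message).  The names parse_line, scan_gaps and task_starts are my own and not part of Xymon.

```python
import re
from datetime import datetime

# Optional PID, timestamp, then the (possibly empty) message.
LINE_RE = re.compile(
    r"^(?:(\d+) )?(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) ?(.*)$"
)

def parse_line(line):
    """Return (pid_or_None, datetime, message), or None if the line
    does not start with a timestamp (e.g. a wrapped continuation)."""
    m = LINE_RE.match(line.rstrip("\n"))
    if not m:
        return None
    pid, stamp, msg = m.groups()
    when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    return (int(pid) if pid else None, when, msg.strip())

def scan_gaps(lines):
    """Seconds between consecutive 'Starting tasklist scan' entries."""
    entries = (parse_line(line) for line in lines)
    times = [e[1] for e in entries
             if e and e[2] == "Starting tasklist scan"]
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]

def task_starts(lines):
    """Messages of every 'About to start task ...' entry, in order."""
    entries = (parse_line(line) for line in lines)
    return [e[2] for e in entries
            if e and e[2].startswith("About to start task")]
```

Fed the clientlaunch.log excerpt above, scan_gaps reports gaps of 5 and 4 seconds, and task_starts finds a single "About to start task client" entry.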

xymonclient.log:
15338 2012-02-14 21:48:23 client -> Running '/home/xymon/client/bin/xymonclient.sh', XYMONHOME=/home/xymon/client

That one line is the only entry.

strace shows xymonlaunch forking off a new process (the "About to start task
client" entry in clientlaunch.log).  The child then execs xymonclient.sh,
which gathers data, sends it off, and exits.  The parent process, meanwhile,
has the following strace output repeating every five seconds, with only the
timestamps changing:

15337 21:56:03 wait4(-1, 0xbffff41c, WNOHANG, NULL) = -1 ECHILD (No child processes)
15337 21:56:03 time(NULL)               = 1329285363
15337 21:56:03 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2819, ...}) = 0
15337 21:56:03 getpid()                 = 15337
15337 21:56:03 write(1, "15337 2012-02-14 21:56:03 \n", 27) = 27
15337 21:56:03 time(NULL)               = 1329285363
15337 21:56:03 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2819, ...}) = 0
15337 21:56:03 getpid()                 = 15337
15337 21:56:03 write(1, "15337 2012-02-14 21:56:03 Starting tasklist scan\n", 49) = 49
15337 21:56:03 time(NULL)               = 1329285363
15337 21:56:03 rt_sigprocmask(SIG_BLOCK, [CHLD], [RTMIN], 8) = 0
15337 21:56:03 rt_sigaction(SIGCHLD, NULL, {0x804a950, [], SA_RESTORER, 0x4005d6f8}, 8) = 0
15337 21:56:03 rt_sigprocmask(SIG_SETMASK, [RTMIN], NULL, 8) = 0
15337 21:56:03 nanosleep({5, 0}, 0xbffff224) = 0
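
Going only by the strace output above, the parent's loop seems to be: reap any exited children without blocking, log the scan, start whatever is due, then sleep five seconds.  Here is a rough Python sketch of that general pattern.  It is not xymonlaunch's actual code; Task, reap_children, start_task and scan are hypothetical names, and the scheduling rule (start a task when it is not running and its interval has elapsed) is my assumption.

```python
import os

class Task:
    """One entry in a hypothetical task list."""
    def __init__(self, name, argv, interval):
        self.name = name
        self.argv = argv
        self.interval = interval   # seconds between starts
        self.pid = None            # PID while believed running, else None
        self.last_start = None     # time of the most recent start

def reap_children():
    """Collect exited children without blocking.

    Returns (reaped, no_children): reaped is a list of (pid, status),
    and no_children is True when the kernel reports ECHILD, the same
    result wait4(-1, ..., WNOHANG) gives in the trace above.
    """
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:   # errno ECHILD: no child processes
            return reaped, True
        if pid == 0:                # children exist, none has exited yet
            return reaped, False
        reaped.append((pid, status))

def start_task(argv):
    """Fork, exec argv in the child, and return the child's PID."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)           # only reached if exec failed
    return pid

def scan(tasks, now):
    """One pass over the task list; returns names of tasks started."""
    reaped, _ = reap_children()
    finished = {pid for pid, _ in reaped}
    started = []
    for task in tasks:
        if task.pid in finished:
            task.pid = None         # exit observed, task may run again
        due = (task.last_start is None
               or now >= task.last_start + task.interval)
        if task.pid is None and due:
            task.pid = start_task(task.argv)
            task.last_start = now
            started.append(task.name)
    return started
```

A driver would call scan(tasks, time.time()) and then time.sleep(5) in a loop.  One property of a loop written this way: if a child's exit is collected somewhere other than the scan (for example in a SIGCHLD handler), the scan only ever sees ECHILD, the task stays marked as running, and it is never started again.  I don't know whether that is what is happening in xymonlaunch.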

There's no change at the 30-second mark (when the client task should be 
gathering the next set of data), and the only action at the five-minute mark 
is vmstat waking up.

For comparison, running strace on a working system shows the parent process 
creating a new task client process right when it should.  One thing that may 
or may not be relevant: although the log output on both systems has the 
entry "Starting tasklist scan", the working client doesn't actually start 
stat()-ing "clientlaunch.cfg" until after the *second* successful run of the 
task client; the non-working system never does stat() it.

The strace logs from both machines are available if anyone thinks they might 
be useful in figuring out what's happening, but since they're about 2.5MB 
combined, I don't want to send them to the whole list.
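
In case anyone wants to compare the two traces without reading all 2.5MB, a small Python sketch along these lines could summarize them.  It assumes each call starts a line as "PID HH:MM:SS syscall(", as in the excerpt above; wrapped continuation lines and signal or "resumed" lines do not match and are skipped.  The names syscall_counts and only_in are my own.

```python
import re
from collections import Counter

# "PID HH:MM:SS name(" at the start of a line.
CALL_RE = re.compile(r"^(\d+) (\d{2}:\d{2}:\d{2}) (\w+)\(")

def syscall_counts(lines, pid=None):
    """Count system calls by name, optionally for a single PID."""
    counts = Counter()
    for line in lines:
        m = CALL_RE.match(line)
        if m and (pid is None or int(m.group(1)) == pid):
            counts[m.group(3)] += 1
    return counts

def only_in(a, b):
    """Sorted names of calls that appear in counts a but never in b."""
    return sorted(name for name in a if name not in b)
```

Running syscall_counts over each trace for the parent PID and then only_in(working, broken) would list the calls the working parent makes that the broken one never does.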

-- 
Mark Wagner


