[Xymon] Xymon client only reports once
Mark
mark at carnildo.com
Wed Feb 15 08:16:33 CET 2012
On Tuesday 14 February 2012 11:59:13 am you wrote:
> On Mon, February 13, 2012 8:13 pm, Mark wrote:
> > I've installed Xymon on my home network as a test for a possible
> > installation at work. It's working fine on three out of the four
> > systems, but on the fourth, the client only reports its status to the
> > server once, immediately after being started.
> >
> > The problem computer is a Pentium MMX with 48MB RAM, running Gentoo
> > Linux.
> >
> > It looks as if the client is getting hung in the process of sending the
> > second report: "ps aux" shows a sleeping "xymonlaunch" process, and the
> > XYMONTMP directory contains a "xymon_vmstat" file with a timestamp five
> > minutes after the successful update.
> >
> > I could probably work around this with a cron job to restart the client
> > every five minutes, but I'd rather fix it properly. Any suggestions on
> > what might be going wrong, or other things I could look at?
> >
> > Thanks,
> > Mark Wagner
>
> The vmstat file there sounds normal... Can you run xymonlaunch with
> --debug and see what it's reporting back? Also, strace what it's doing
> when the next expected run occurs?
>
> For testing purposes you can bring the interval down to 30s or so. The
> only change you should notice is having multiple backgrounded vmstat
> processes going at once in a round-robin fashion.
When I run xymonlaunch from the command line with the "--no-daemon" and
"--debug" options, nothing is printed to the terminal.
clientlaunch.log:
2012-02-14 21:48:23 xymonlaunch starting
2012-02-14 21:48:23 Loading tasklist configuration from ./etc/clientlaunch.cfg
15337 2012-02-14 21:48:23 Opening file ./etc/clientlaunch.cfg
15337 2012-02-14 21:48:23
15337 2012-02-14 21:48:23 Starting tasklist scan
15337 2012-02-14 21:48:23 About to start task client
15338 2012-02-14 21:48:23 client -> Loading environment from /home/xymon/client/etc/xymonclient.cfg area
15338 2012-02-14 21:48:23 Opening file /home/xymon/client/etc/xymonclient.cfg
15338 2012-02-14 21:48:23 client -> Assigning stdout/stderr to log '/home/xymon/client/logs/xymonclient.log'
15337 2012-02-14 21:48:28
15337 2012-02-14 21:48:28 Starting tasklist scan
15337 2012-02-14 21:48:28 Task client active with PID 15338
15337 2012-02-14 21:48:32
15337 2012-02-14 21:48:32 Starting tasklist scan
The last two lines then repeat every five seconds until I kill the client.
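For readers unfamiliar with what a "tasklist scan" does, here is a rough Python sketch of an interval-based scan. It is only an illustration under my own assumptions, not xymonlaunch's actual code: the Task fields and the start callback are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Task:
    """Hypothetical task record; field names are assumptions."""
    name: str
    interval: int                 # seconds between runs
    pid: Optional[int] = None     # set while the task is considered active
    last_start: float = 0.0       # time the task was last started


def scan_tasklist(tasks: List[Task], now: float,
                  start: Callable[[Task], int]) -> List[str]:
    """Start every task that is idle and due; return the names started."""
    started = []
    for task in tasks:
        if task.pid is not None:
            # Still marked active: a scanner like this would not start it again.
            continue
        if now - task.last_start >= task.interval:
            task.pid = start(task)
            task.last_start = now
            started.append(task.name)
    return started
```

In a scheme like this, a scan that logs "Starting tasklist scan" but starts nothing means every task is either still marked active or not yet due.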
xymonclient.log:
15338 2012-02-14 21:48:23 client -> Running '/home/xymon/client/bin/xymonclient.sh', XYMONHOME=/home/xymon/client
That one line is the only entry.
strace shows xymonlaunch forking off a new process (the "About to start task
client" entry in clientlaunch.log). The child then execs xymonclient.sh,
which gathers data, sends it off, and exits. The main process, meanwhile,
has the following strace output repeating every five seconds with suitable
changes to timestamps:
15337 21:56:03 wait4(-1, 0xbffff41c, WNOHANG, NULL) = -1 ECHILD (No child processes)
15337 21:56:03 time(NULL) = 1329285363
15337 21:56:03 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2819, ...}) = 0
15337 21:56:03 getpid() = 15337
15337 21:56:03 write(1, "15337 2012-02-14 21:56:03 \n", 27) = 27
15337 21:56:03 time(NULL) = 1329285363
15337 21:56:03 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2819, ...}) = 0
15337 21:56:03 getpid() = 15337
15337 21:56:03 write(1, "15337 2012-02-14 21:56:03 Starting tasklist scan\n", 49) = 49
15337 21:56:03 time(NULL) = 1329285363
15337 21:56:03 rt_sigprocmask(SIG_BLOCK, [CHLD], [RTMIN], 8) = 0
15337 21:56:03 rt_sigaction(SIGCHLD, NULL, {0x804a950, [], SA_RESTORER, 0x4005d6f8}, 8) = 0
15337 21:56:03 rt_sigprocmask(SIG_SETMASK, [RTMIN], NULL, 8) = 0
15337 21:56:03 nanosleep({5, 0}, 0xbffff224) = 0
There's no change at the 30-second mark (when the client task should be
gathering the next set of data), and the only action at the five-minute mark
is vmstat waking up.
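For context on the wait4() line in that trace: a non-blocking wait that fails with ECHILD means the process has no children left to wait for at that moment. The Python sketch below shows the same pattern. It is a generic illustration, not xymonlaunch's code, and reap_children is a name I made up.

```python
import os


def _exit_code(status):
    """Decode a wait status: exit code, or negative signal number."""
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)


def reap_children():
    """Reap all exited children without blocking.

    Returns (reaped, has_children): reaped is a list of (pid, exit_code)
    pairs, and has_children is False once the kernel reports ECHILD.
    """
    reaped = []
    while True:
        try:
            # Equivalent to wait4(-1, &status, WNOHANG, NULL) in the trace.
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            # ECHILD: there are no child processes at all.
            return reaped, False
        if pid == 0:
            # Children exist, but none has exited yet.
            return reaped, True
        reaped.append((pid, _exit_code(status)))
```

Seeing ECHILD on every pass, as in the trace, would correspond to reap_children() returning ([], False) each time: nothing is running, and nothing is left to collect.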
For comparison, running strace on a working system shows the main process
creating a new task client process right when it should. One thing that may
or may not be relevant: although the log output on both systems has the
entry "Starting tasklist scan", the working client doesn't actually start
stat()-ing "clientlaunch.cfg" until after the *second* successful run of the
task client; the non-working system never stat()s it.
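Since the full traces are large, a small filter makes this kind of comparison easier. The Python sketch below assumes the "PID HH:MM:SS syscall(...)" line layout shown above and pulls out process-creation calls plus stat() calls on a watched file. The function name and the choice of syscalls are my own assumptions.

```python
import re

# Matches lines such as: 15337 21:56:03 wait4(-1, ...) = -1 ECHILD ...
LINE_RE = re.compile(
    r"^(?P<pid>\d+)\s+(?P<time>\d{2}:\d{2}:\d{2})\s+(?P<call>\w+)\((?P<rest>.*)$"
)

# Syscalls that indicate a new process being created or executed.
PROCESS_CALLS = {"fork", "vfork", "clone", "execve"}


def interesting_events(lines, watch_path="clientlaunch.cfg"):
    """Return (time, pid, syscall) for process creation and watched stat calls."""
    events = []
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # wrapped continuation lines, signal notes, and so on
        call = match.group("call")
        is_watched_stat = call.startswith("stat") and watch_path in match.group("rest")
        if call in PROCESS_CALLS or is_watched_stat:
            events.append((match.group("time"), int(match.group("pid")), call))
    return events
```

Running it over each machine's trace (for example with open("trace.txt") as f: interesting_events(f), where trace.txt is a placeholder name) gives two short lists that can be compared side by side.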
The strace logs from both machines are available if anyone thinks they might
be useful in figuring out what's happening, but since they're about 2.5MB
combined, I don't want to send them to the whole list.
--
Mark Wagner