
hobbitmon 4.2.3 client on esx 3.5.0 works first time only - then goes purple



Hi,
I have hobbitmon clients running on various servers, and this week I turned my attention to some ESX servers. The hobbitmon client configure/make went fine, and after running "runclient.sh start" the status columns on the hobbitmon server display page went green (with the expected data reported). However, 20 to 30 minutes later those statuses went purple, and the timestamp on the data showed that the client had run only the first time.
The ...client/logs show nothing untoward, there are no core files in ...client/tmp, and I have opened the ESX firewall port!

Client start-up

[xymon (at) esx112 logs]$ ../runclient.sh start
Hobbit client for linux started on esx112
[xymon (at) esx112 logs]$ date
Wed Jul 22 11:32:04 BST 2009

Initial data reported (green smileys) (server: http://W.X.Y.Z/xymon-cgi/bb-hostsvc.sh?HOST=esx112&SERVICE=cpu )


Wed Jul 22 11:31:56 BST 2009 up: 113 days, 2 users, 72 procs, load=0.02
System clock is -220 seconds off


However, a little over 30 minutes later those smileys went purple and the data timestamp was unchanged from the above.

The logs show little

log/hobbitclient.log # no entries even for the start-up

log/clientlaunch.log

2009-07-22 11:31:56 hobbitlaunch starting
2009-07-22 11:31:56 Loading tasklist configuration from /home/xymon/client/etc/clientlaunch.cfg
2009-07-22 11:31:56 Task client started with PID 31241

And there is nothing in: /var/log/messages

And no core files:

[root (at) esx112 xymon]# pwd
/home/xymon
[root (at) esx112 xymon]# find . -name "*core*"
[root (at) esx112 xymon]#

Looking at xymon processes I notice that there is a difference between a working linux client and this failed esx client.

Working SLES10-SP2 client:

xymon     5155     1  0 May28 ?        00:00:02 /home/xymon/client/bin/hobbitlaunch --config=/home/xymon/client/etc/clientlaunch.cfg --log=/home/xymon/client/logs/clientlaunch.log --pidfile=/home/xymon/client/logs/clientlaunch.sles10-SP2-domU-1.pid
xymon    18347     1  0 12:11 ?        00:00:00 sh -c vmstat 300 2 1>/home/xymon/client/tmp/hobbit_vmstat.sles10-SP2-domU-1.18313 2>&1; mv /home/xymon/client/tmp/hobbit_vmstat.sles10-SP2-domU-1.18313 /home/xymon/client/tmp/hobbit_vmstat.sles10-SP2-domU-1
xymon    18349 18347  0 12:11 ?        00:00:00 vmstat 300 2

Failed esx 3.5.0 client:

[root (at) esx112 xymon]# ps -ef|grep "^xymon"
xymon    30803 30801  0 09:38 ?        00:00:00 sshd: xymon (at) pts/0
xymon    30804 30803  0 09:38 pts/0    00:00:00 -bash
xymon    31240     1  0 11:31 ?        00:00:00 /home/xymon/client/bin/hobbitlaunch --verbose --config=/home/xymon/client/etc/clientlaunch.cfg --log=/home/xymon/client/logs/clientlaunch.log --pidfile=/home/xymon/client/logs/clientlaunch.esx112.pid

So the missing vmstat process seems to be a clue. I therefore ran hobbitclient-linux.sh by hand: no errors were reported in the output and the return code was 0.
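
One way to follow up on the vmstat difference is sketched below. It is only a rough check under stated assumptions, not part of the hobbit client: the function name check_vmstat is made up for illustration, and the hobbit_vmstat.* file name pattern is taken from the ps output of the working SLES client above. It reports whether vmstat can be found on the current user's PATH and whether any vmstat output file exists under the client's tmp directory. Neither result proves a cause; it only narrows down where to look.

```shell
#!/bin/sh
# Hypothetical diagnostic helper (not shipped with the hobbit client).
# Usage: check_vmstat CLIENT_HOME
check_vmstat() {
    client_home=$1

    # Can the shell resolve vmstat from this user's PATH?
    if command -v vmstat >/dev/null 2>&1; then
        echo "vmstat: found at $(command -v vmstat)"
    else
        echo "vmstat: not found on PATH"
    fi

    # Has a vmstat collector left any output file behind?
    # The hobbit_vmstat.* pattern is assumed from the working client's ps output.
    found=no
    for f in "$client_home"/tmp/hobbit_vmstat.*; do
        if [ -e "$f" ]; then
            found=yes
            break
        fi
    done
    echo "vmstat output file: $found"
}
```

It would be run as the xymon user rather than root, since the PATH may differ between the two, for example: check_vmstat /home/xymon/client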

Any ideas where I go next?

TIA & regards
steve overy  | support analyst | EMEA Client Support Centre | Global Outsourcing and Infrastructure Services
Unisys Limited  |  Fox Milne  |  Tongwell Street, Milton Keynes, MK15 0YS
Registered in England Company No. 103709
Registered Office: Bakers Court, Bakers Road, Uxbridge, UB8 1RG
Phone: +44 (0) 1908 212306, net 741 2306
Mobile: +44 (0) 7808 391673, net 839 1673
Email: steve.overy (at) unisys.com