sorry for the constant revision (was: re: purple haze)
Last email for a while, i promise; i'm chainsmoking packets at this
point. But i found this:
---
2005-11-01 14:14:20 TCP tests completed normally
2005-11-01 14:14:20 Execution of 'fping -Ae' failed with error-code 99
2005-11-01 14:14:20 Sending results for service conn
---
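A long --debug run buries lines like the one above, so a small filter can help. This is only a sketch: the function name exec_failures is made up, and the pattern assumes the failure message always has the exact wording shown in the excerpt.

```shell
# Hypothetical helper: read bbtest-net debug output on stdin and print
# "<error-code> <command>" for every external command that failed.
# Assumes the message wording matches the log excerpt above exactly.
exec_failures() {
    sed -n "s/.*Execution of '\(.*\)' failed with error-code \([0-9][0-9]*\).*/\2 \1/p"
}
```

It could be fed from a run such as `./bbtest-net --no-update --debug 2>&1 | exec_failures`; the `2>&1` is there on the assumption that debug output may go to stderr.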
Okay, it can't find fping. But...
---
hobbit@randomaccess ~/server/bin $ more ../etc/hobbitserver.cfg |grep fping
# Make sure the path includes the directories where you have fping, mail and (optionally) ntpdate installed,
FPING="/usr/sbin/fping" # Path and options for the 'fping' program.
hobbit@randomaccess ~/server/bin $ /usr/sbin/fping -Ae brassai
10.10.10.15 is alive (0.15 ms)
hobbit@randomaccess ~/server/bin $
---
So it should be finding fping just fine, and fping is working.
The path is in hobbitserver.cfg:
---
# Make sure the path includes the directories where you have fping, mail and (optionally) ntpdate installed,
# as well as the BBHOME/bin directory where all of the Hobbit programs reside.
PATH="/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin:/home/hobbit/server/bin"
...
# For bbtest-net
...
FPING="/usr/sbin/fping" # Path and options for the 'fping' program.
---
and
---
[bbnet]
ENVFILE /home/hobbit/server/etc/hobbitserver.cfg
---
So, by all the above: fping is functional, it is accessible by the
'hobbit' user, it can reach the clients, it is in the PATH, it is
defined in the ENVFILE bbnet is using.
So what's gone wrong??
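One thing the checks above don't cover is running fping with *only* the environment from the ENVFILE, instead of the login shell's. Note that the log names the command as 'fping -Ae' while the config gives the full path; whether that matters is not established here. The sketch below makes two assumptions: that hobbitserver.cfg holds plain NAME="value" lines /bin/sh can source, and that run_in_env (a made-up name) is a fair stand-in for what bbnet sees.

```shell
# Hypothetical helper: run a command with ONLY the variables defined in an
# env file. Assumes the file is plain NAME="value" lines that sh can source.
run_in_env() {
    envfile=$1
    shift
    # env -i starts from an empty environment, so nothing inherited from
    # the login shell (PATH etc.) can mask a problem in the file itself.
    env -i /bin/sh -c '
        set -a        # export every variable the file defines
        . "$1"        # load the env file
        shift
        exec "$@"     # run the command; its exit status is passed through
    ' sh "$envfile" "$@"
}
```

For example: `run_in_env /home/hobbit/server/etc/hobbitserver.cfg sh -c '$FPING -Ae brassai'; echo "exit status: $?"`. The unquoted $FPING is deliberate, so any options stored in the variable are split into separate arguments.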
Rob Munsch wrote:
Since ssh, ldap, and dns are tests run from the serverside (cpu etc
remaining green indicates the clients are running and communicating
OK, right?), i ran
./bbtest-net --concurrency=50 --checkresponse --no-update --timing --debug
Now, i can ping and ssh to all clients from server just fine. But i
see this:
---
2005-11-01 14:14:20 Adding to combo msg: status brassai.conn red <!-- [flags:ordAstILe] --> Tue Nov 1 14:14:20 2005 conn NOT ok
status brassai.conn red <!-- [flags:ordAstILe] --> Tue Nov 1 14:14:20 2005 conn NOT ok
Service conn on brassai is not OK : Host does not respond to ping
System unreachable for 3 poll periods (56 seconds)
---
Aha. Since the ping test fails, why test other net services? So now
it makes sense: the net tests are not being run, hence the purple.
Of course, i don't know why the nettest is suddenly unable to ping
anything. It is getting the right IPs internally:
---
2005-11-01 14:14:20 Got DNS result for host doisneau : 10.x.x.x
2005-11-01 14:14:20 Got DNS result for host brassai : 10.x.x.x
2005-11-01 14:14:20 Got DNS result for host moadib : 10.x.x.x
---
and i thought cranking the concurrency way down might help, but
apparently it doesn't.
So, i'm glad i found the cause... now i just need to find out the
cause's cause. o_O
--
Rob Munsch
Systems Analyst, Solutions for Progress
http://www.solutionsforprogress.com