[hobbit] Inexplicable purple on running services
Rob Munsch
rmunsch at solutionsforprogress.com
Tue Nov 1 20:21:22 CET 2005
Since ssh, ldap, and dns are tests run from the server side (cpu etc.
remaining green indicates the clients are running and communicating OK,
right?), I ran
./bbtest-net --concurrency=50 --checkresponse --no-update --timing --debug
Now, I can ping and ssh to all clients from the server just fine. But I
see this:
---
2005-11-01 14:14:20 Adding to combo msg: status brassai.conn red <!--
[flags:ordAstILe] --> Tue Nov 1 14:14:20 2005 conn NOT ok
status brassai.conn red <!-- [flags:ordAstILe] --> Tue Nov 1 14:14:20
2005 conn NOT ok
Service conn on brassai is not OK : Host does not respond to ping
System unreachable for 3 poll periods (56 seconds)
---
Aha. Since the ping test fails, why would it test the other net services
on that host? So now it makes sense: the remaining net tests are not
being run, so their statuses go stale, hence the purple.
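To see at a glance which hosts are affected, the debug output can be filtered for the failing conn lines. What follows is a minimal Python sketch, assuming the output was saved to a file and that the lines look like the excerpt above. The function name hosts_failing_conn is made up for illustration and is not part of Hobbit.

```python
import re

# Matches lines like:
#   Service conn on brassai is not OK : Host does not respond to ping
# The pattern is inferred from the debug excerpt, not from Hobbit's source.
CONN_FAIL = re.compile(r"^Service conn on (\S+) is not OK\s*:\s*(.*)$")


def hosts_failing_conn(lines):
    """Return {hostname: reason} for every host whose conn test failed."""
    failed = {}
    for line in lines:
        match = CONN_FAIL.match(line.strip())
        if match:
            failed[match.group(1)] = match.group(2).strip()
    return failed
```

Passing it an open file handle for the saved debug output should work, since it only iterates over lines. Any host it reports is one where the other net tests would be expected to go purple.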
Of course, I don't know why bbtest-net is suddenly unable to ping
anything. It is getting the right IPs internally:
---
2005-11-01 14:14:20 Got DNS result for host doisneau : 10.x.x.x
2005-11-01 14:14:20 Got DNS result for host brassai : 10.x.x.x
2005-11-01 14:14:20 Got DNS result for host moadib : 10.x.x.x
---
and I thought cranking the concurrency way down might help, but
apparently it doesn't.
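One way to double-check the addresses the network tester resolved is to pull them out of the debug output and compare them with what you expect. This is a Python sketch, assuming the "Got DNS result" line format shown above. dns_results is a hypothetical helper name, not a Hobbit tool.

```python
import re

# Matches lines like:
#   2005-11-01 14:14:20 Got DNS result for host brassai : 10.x.x.x
# The format is taken from the debug excerpt.
DNS_RESULT = re.compile(r"Got DNS result for host (\S+)\s*:\s*(\S+)")


def dns_results(lines):
    """Return {hostname: address} as reported in the debug output."""
    resolved = {}
    for line in lines:
        match = DNS_RESULT.search(line)
        if match:
            resolved[match.group(1)] = match.group(2)
    return resolved
```

Each address in the result can then be pinged by hand from the server, to check whether the failure is specific to the network tester.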
So, I'm glad I found the cause... now I just need to find out the
cause's cause. o_O
Rob Munsch wrote:
> There are no entries in the network log since 10/28. Hobbit is running
> on the server, and the Hobbit client is running on each monitored host.
>
> CPU, Memory, Disk and Procs all remain green!
> SSH, ldaps, and dns on the clients are purple.
>
> On the hobbit server itself, bbd is purple. Everything else is green.
> Network connectivity from all clients to the server is functional.
>
> I don't get it...
>
> Henrik Stoerner wrote:
>
>> On Mon, Oct 31, 2005 at 05:32:44PM -0500, Rob Munsch wrote:
>>
>>> Consider the below. Approx. 25 minutes ago, across all monitored
>>> systems, all net monitored services - ssh, ldaps and dns - went to
>>> purple. They are still up, running, and just fine in every
>>> respect. The status message is even the same as when it was showing
>>> green. But now every ssh, ldaps and dns light is purple.
>>>
>>
>> Purple is an indication that some part of your monitoring system
>> has stopped.
>>
>> All of the purple ones are network services? Then it sounds as if
>> your network tests have stopped running. Check the
>> ~hobbit/server/logs/bb-network.log file for any errors.
>>
>> Regards,
>> Henrik
--
Rob Munsch
Systems Analyst, Solutions for Progress
http://www.solutionsforprogress.com