[hobbit] Inexplicable purple on running services

Rob Munsch rmunsch at solutionsforprogress.com
Tue Nov 1 20:21:22 CET 2005


Since ssh, ldap, and dns are tests run from the server side (cpu etc. 
remaining green indicates the clients are running and communicating OK, 
right?), I ran

./bbtest-net --concurrency=50 --checkresponse --no-update --timing --debug

Now, I can ping and ssh to all clients from the server just fine.  But I 
see this:

---
2005-11-01 14:14:20 Adding to combo msg: status brassai.conn red <!-- [flags:ordAstILe] --> Tue Nov  1 14:14:20 2005 conn NOT ok
status brassai.conn red <!-- [flags:ordAstILe] --> Tue Nov  1 14:14:20 2005 conn NOT ok

Service conn on brassai is not OK : Host does not respond to ping

System unreachable for 3 poll periods (56 seconds)
---

Aha.  Since the ping test fails, why test other net services?  So now it 
makes sense; the net tests are not being run, hence the purple.
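To spell out that reasoning: a test that stops sending updates keeps its last status message but eventually shows purple, which matches what the ssh, ldaps and dns columns are doing. A minimal sketch of the idea, not Hobbit's actual code; the function name and the 30-minute validity window are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Assumption for illustration: how long a status stays valid without a
# fresh report. The real value depends on the server's configuration.
VALIDITY = timedelta(minutes=30)


def displayed_color(last_color, last_update, now, validity=VALIDITY):
    """Return the color to show for a status (hypothetical helper).

    A status that has not been refreshed within `validity` is shown as
    purple, whatever color it last reported.
    """
    if now - last_update > validity:
        return "purple"
    return last_color


now = datetime(2005, 11, 1, 14, 14, 20)
# Client-side test, still reporting every few minutes: keeps its color.
print(displayed_color("green", now - timedelta(minutes=5), now))
# Network test whose last report is days old: goes purple.
print(displayed_color("green", datetime(2005, 10, 28, 12, 0, 0), now))
```

The client-side columns (cpu, disk and so on) stay green because the clients keep reporting; only the columns fed by the network tester go stale.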

Of course, I don't know why the nettest is suddenly unable to ping 
anything.  It is getting the right IPs internally:

---
2005-11-01 14:14:20 Got DNS result for host doisneau : 10.x.x.x
2005-11-01 14:14:20 Got DNS result for host brassai : 10.x.x.x
2005-11-01 14:14:20 Got DNS result for host moadib : 10.x.x.x
---

and I thought cranking the concurrency way down might help, but 
apparently it doesn't.

So, I'm glad I found the cause... now I just need to find out the 
cause's cause.  o_O
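To see how widespread the ping failures are, the saved debug output could be filtered for conn status lines. A rough sketch, assuming the "status <host>.<test> <color>" layout seen in the output quoted above and hostnames without dots; the function name is made up for this example.

```python
import re

# Matches status lines as they appear in the debug output quoted above,
# e.g. "status brassai.conn red <!-- ... -->". Assumes the simple
# "status <host>.<test> <color>" layout and hostnames without dots.
STATUS_RE = re.compile(r"^status (?P<host>[^.\s]+)\.(?P<test>\S+) (?P<color>\w+)")


def failing_conn_hosts(lines):
    """Return the sorted hosts whose conn status is not green."""
    hosts = set()
    for line in lines:
        m = STATUS_RE.match(line)
        if m and m.group("test") == "conn" and m.group("color") != "green":
            hosts.add(m.group("host"))
    return sorted(hosts)


sample = [
    "2005-11-01 14:14:20 Adding to combo msg: status brassai.conn red <!-- ",
    "status brassai.conn red <!-- [flags:ordAstILe] --> Tue Nov  1 14:14:20 ",
    "Service conn on brassai is not OK : Host does not respond to ping",
]
print(failing_conn_hosts(sample))  # ['brassai']
```

If every monitored host shows up in that list, the problem is more likely on the server side (how the ping test itself is run) than on any one client.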

Rob Munsch wrote:

> There are no entries in the network log since 10/28.  Hobbit is running 
> on the server, and the clients are running on the various clients.
>
> CPU, Memory, Disk and Procs all remain green!
> SSH, ldaps, and dns on the clients are purple.
>
> On the hobbit server itself, bbd is purple.  Everything else is green.
> Network connectivity between all clients and the server is functional.
>
> I don't get it...
>
> Henrik Stoerner wrote:
>
>> On Mon, Oct 31, 2005 at 05:32:44PM -0500, Rob Munsch wrote:
>>
>>> Consider the below.  Approx. 25 minutes ago, across all monitored 
>>> systems, all net monitored services - ssh, ldaps and dns - went to 
>>> purple.  They are still up, running, and just fine in every 
>>> respect.  The status message is even the same as when it was showing 
>>> green.  But now every ssh, ldaps and dns light is purple.
>>
>> Purple is an indication that some part of your monitoring system
>> has stopped.
>>
>> All of the purple ones are network services? Then it sounds as if
>> your network tests have stopped running. Check the
>> ~hobbit/server/logs/bb-network.log file for any errors.
>>
>>
>> Regards,
>> Henrik


-- 
Rob Munsch
Systems Analyst, Solutions for Progress
http://www.solutionsforprogress.com



