[hobbit] Test goes purple randomly

Samuel Cai Samuel.Cai at ehealth-china.com
Fri Oct 24 03:47:51 CEST 2008


I found an earlier email on this topic:

From: Ralph Mitchell [mailto:ralphmitchell at gmail.com] 
Sent: Tuesday, September 16, 2008 12:44 PM
To: hobbit at hswn.dk
Subject: Re: [hobbit] Tricky bug in Purple status determination

 

When a report comes in to Hobbit, the default "time to live" for the
report is 30 mins.  As long as another report comes in within that time,
the timer is reset.  If there's no report, that column goes purple.

If your test is reporting every 30 mins, there's a good chance it'll
exhibit the behaviour you describe.
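
The race described above can be illustrated with a small sketch. It uses a deliberately simplified model: a report is valid for exactly its lifetime after it arrives, and the column counts as purple from expiry until the next report. The function name and the sample arrival times are hypothetical, and this is not how hobbitd is actually implemented.

```python
def purple_gaps(arrivals, lifetime):
    """Return (start, end) gaps during which no valid report existed.

    arrivals: report arrival times in seconds, sorted ascending.
    lifetime: validity of each report in seconds.
    Simplified model for illustration only, not hobbitd's real logic.
    """
    gaps = []
    for previous, current in zip(arrivals, arrivals[1:]):
        expires = previous + lifetime
        if current > expires:
            # The previous report expired before the next one arrived.
            gaps.append((expires, current))
    return gaps


# A test scheduled every 30 minutes whose second run reports 2 seconds late
# (hypothetical timings).
arrivals = [0, 1802, 3600]
print(purple_gaps(arrivals, 30 * 60))  # default 30-minute lifetime
print(purple_gaps(arrivals, 35 * 60))  # lifetime a bit longer than the interval
```

With the 30-minute lifetime the model shows a 2-second gap after the first report; with a 35-minute lifetime there is none.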

What you should do is alter the test script to use the "status+LIFETIME"
format, where LIFETIME is the life span of the report, as described in
the bb man page, and make the lifetime a bit longer than the test
interval.
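
As a rough sketch of what such a message could look like, the helper below builds a "status+LIFETIME" line for a test that runs every 30 minutes. The helper name, host name, and test name are hypothetical. The message layout (LIFETIME in minutes by default, and dots in the host name written as commas) reflects my reading of the bb man page, so check it against your installed version before relying on it.

```python
def build_status_message(hostname, testname, color, text, lifetime_minutes=None):
    """Build a Hobbit status message string (illustrative helper only).

    Assumed layout, per my reading of the bb man page:
        status[+LIFETIME] HOSTNAME.TESTNAME COLOR <additional text>
    where dots in the host name are written as commas.
    """
    host = hostname.replace(".", ",")
    if lifetime_minutes is None:
        verb = "status"  # server applies its default lifetime
    else:
        verb = f"status+{lifetime_minutes}"
    return f"{verb} {host}.{testname} {color} {text}"


# Test runs every 30 minutes, so ask for a 35-minute lifetime.
message = build_status_message("www.example.com", "mytest", "green", "all ok", 35)
print(message)
```

The resulting string is what the test script would hand to the bb client instead of a plain "status" message.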

Ralph Mitchell



On Mon, Sep 15, 2008 at 9:43 PM, Samuel Cai
<Samuel.Cai at ehealth-china.com> wrote:

Hi,

Recently we found a weird problem in the history of one of our
monitored tests: there were a lot of purple statuses, each with a
duration of "none" or 1 second. The thing we were monitoring was running
fine, and the problem has been there ever since we started using Hobbit
(more than half a year ago), which rules out an error in the thing being
monitored.

The test is a script defined in hobbitlaunch.cfg on the Hobbit server;
it runs every 30m.

I checked the log: the purple status was set by hobbitd. I then checked
the hobbitd source code and found that it checks for purple status every
30m (correct me if I'm wrong, since I only know a little C). So my guess
is that, due to some program issue, there were a few milliseconds of
difference between hobbitd's determination and the script's update,
which resulted in a very short purple status.

After I changed the interval to 25m, the weird problem went away.

 

Thanks,

Samuel Cai


-----Original Message-----
From: Wayne Gemmell [mailto:wayne at flashmedia.co.za] 
Sent: Thursday, October 23, 2008 3:34 PM
To: hobbit at hswn.dk
Subject: [hobbit] Test goes purple randomly

Hiya

I have a custom script that goes purple randomly for less than 5
seconds. Could this be because hobbit is not getting a response within
the interval it expects (in this case 30 min)? I have a few
time-consuming custom scripts that run, and my suspicion is that they
don't all complete in the allotted time, so hobbit assumes there is no
response. Any input on this?

This is what the html log says.

Date                        Status   Duration
Wed Oct 22 21:28:37 2008    green    11:57:35
Wed Oct 22 21:28:37 2008    purple   none



-- 
Regards
Wayne 
