Re: [hobbit] Tricky bug in Purple status determination
- To: hobbit (at) hswn.dk
- Subject: Re: [hobbit] Tricky bug in Purple status determination
- From: "Ralph Mitchell" <ralphmitchell (at) gmail.com>
- Date: Mon, 15 Sep 2008 23:43:45 -0500
- References: <980B46CCEFAE3F4A836BBF54E4DBCB160C6D158E (at) SJEXVS01.ehi.ehealth.com>
When a report comes in to Hobbit, the default "time to live" for the report
is 30 mins. As long as another report comes in within that time, the timer
is reset. If there's no report, that column goes purple.
If your test is reporting every 30 mins, the previous report can expire just
before the next one arrives, so there's a good chance it'll exhibit the
behaviour you describe.
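
To illustrate the timing, here is a rough Python sketch. It is a simplified model, not how hobbitd actually schedules its purple checks: it assumes a report expires exactly LIFETIME seconds after it arrives, and the arrival times (with one second of jitter) are made up for the example.

```python
def purple_windows(arrivals, lifetime):
    """Return (start, duration) in seconds for each stretch where the
    previous report had already expired before the next one arrived."""
    windows = []
    for prev, nxt in zip(arrivals, arrivals[1:]):
        expires = prev + lifetime
        if nxt > expires:
            windows.append((expires, nxt - expires))
    return windows


MINUTE = 60

# Hypothetical arrival times: a 30-minute test that is sometimes 1 second late.
arrivals = [0, 1801, 3600, 5401]

# Lifetime equal to the test interval: brief purple windows appear.
print(purple_windows(arrivals, 30 * MINUTE))  # [(1800, 1), (5400, 1)]

# Lifetime a bit longer than the interval: no purple at all.
print(purple_windows(arrivals, 35 * MINUTE))  # []
```

With the lifetime equal to the interval, any small delay leaves a one-second gap, which matches the very short purple entries in the history.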
What you should do is alter the test script to use the "status+LIFETIME"
format, where LIFETIME is the life span of the report, as described in the
bb man page, and make the lifetime a bit longer than the test interval.
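
As a minimal sketch of what such a message might look like, the Python below builds a "status+LIFETIME" line. The helper name, host name, test name and message text are all hypothetical, and it assumes the usual conventions as I recall them from the bb man page (a bare LIFETIME number is in minutes, and dots in the host name are replaced by commas), so check the man page before relying on it.

```python
def build_status(host, test, color, text, lifetime_minutes=None):
    """Build a Hobbit status message, optionally with an explicit lifetime.

    Assumption: a bare LIFETIME number is in minutes, and dots in the
    host name are written as commas in the HOSTNAME.TESTNAME field.
    """
    hostname = host.replace(".", ",")
    if lifetime_minutes is None:
        command = "status"
    else:
        command = f"status+{lifetime_minutes}"
    return f"{command} {hostname}.{test} {color} {text}"


# A test that runs every 30 minutes, given a 35-minute lifetime.
message = build_status("www.example.com", "mytest", "green", "all ok", 35)
print(message)  # status+35 www,example,com.mytest green all ok
```

The resulting string is what the test script would hand to the bb client in place of its current plain "status" message.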
Ralph Mitchell
On Mon, Sep 15, 2008 at 9:43 PM, Samuel Cai <Samuel.Cai (at) ehealth-china.com> wrote:
> Hi,
>
>
>
> Recently we found a weird problem in the history of one of our monitors:
> there were a lot of purple statuses, each with a duration of "none" or
> 1 second. The thing we were monitoring was running fine, and the problem
> has been there since we started using Hobbit (more than half a year ago),
> so that rules out an error in the monitored thing itself.
>
> The monitor is a script defined in hobbitlaunch.cfg on the Hobbit server;
> it runs every 30m.
>
> I checked the log: the purple status was set by hobbitd. I then checked
> the hobbitd source code and found that it checks for purple status every
> 30m (correct me if I'm wrong, since I only know a little C). So I guess
> that, due to some timing issue, there were a few milliseconds' difference
> between hobbitd's determination and the script's update, which results in
> a very short purple status.
>
>
>
> After I changed the interval to 25m, the weird problem was gone.
>
>
>
> Thanks,
>
> Samuel Cai
>