Tricky bug in Purple status determination
- To: <hobbit (at) hswn.dk>
- Subject: Tricky bug in Purple status determination
- From: "Samuel Cai" <Samuel.Cai (at) ehealth-china.com>
- Date: Mon, 15 Sep 2008 19:43:37 -0700
- Thread-index: AckXpgVIxYG7UPBmROGoSaWxOciD6g==
- Thread-topic: Tricky bug in Purple status determination
Hi,
Recently we found a weird problem in the history of one of our monitors:
there were a lot of purple statuses, each with a duration of "none" or
1 second. The thing we were monitoring was running fine, and the problem
has been there since we started using Hobbit (more than half a year ago),
so that rules out an error in the monitored thing itself.
This monitor is a script defined in hobbitlaunch.cfg on the Hobbit
server; it runs every 30m.
I checked the log: the purple status was set by hobbitd. I then checked
the hobbitd source code and found that it checks for purple status every
30m (correct me if I'm wrong, since I only know a little C). So my guess
is that, because of timing, there is a difference of some milliseconds
between hobbitd's determination and the script's update, which results
in a very short purple status.
After I changed the script's interval to 25m, the weird problem went away.
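
To illustrate the guess above, here is a small sketch in Python. It is not hobbitd's actual logic. It assumes a status turns purple as soon as it is older than a 30m validity period, and that each script run reports with a small delay. The function name purple_gaps and the delay values are made up for illustration.

```python
def purple_gaps(interval_ms, validity_ms, delays_ms):
    """Return how long (in ms) the status would be stale between updates.

    Update k is assumed to arrive at k * interval_ms + delays_ms[k].
    The status is treated as purple once it is older than validity_ms.
    """
    arrivals = [k * interval_ms + d for k, d in enumerate(delays_ms)]
    gaps = []
    for prev, nxt in zip(arrivals, arrivals[1:]):
        stale_for = (nxt - prev) - validity_ms
        if stale_for > 0:
            gaps.append(stale_for)
    return gaps


MINUTE_MS = 60 * 1000

# Hypothetical per-run delays in ms (script runtime, scheduling jitter).
delays = [200, 900, 400, 1100, 300]

# Interval equal to the assumed validity: any late update leaves a short gap.
print(purple_gaps(30 * MINUTE_MS, 30 * MINUTE_MS, delays))  # [700, 700]

# Interval shorter than the assumed validity: updates always arrive in time.
print(purple_gaps(25 * MINUTE_MS, 30 * MINUTE_MS, delays))  # []
```

With the interval equal to the validity period, every update that arrives slightly later than the previous one leaves a sub-second gap, which would match the "none" or 1 second purple entries. With a 25m interval there is a 5m margin, so the gaps disappear.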
Thanks,
Samuel Cai