[hobbit] Hobbit 4.0.4 released - Alert Script Issue
Peter Welter
peter.welter at gmail.com
Sat Aug 27 12:10:08 CEST 2005
Hello Henrik,
> two things we can do:
>
> 1) Add "--checkpoint-file=$BBTMP/alert.chk --checkpoint-interval=600" to
> the hobbitd_alert command in hobbitlaunch.cfg. That way it will
> remember all active alerts when you restart Hobbit.
I'll do that ASAP (this coming Monday). That will certainly resolve this issue.
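For reference, this is roughly how I expect the changed entry in hobbitlaunch.cfg to look. Only the two --checkpoint options come from Henrik's mail; the section name, the ENVFILE path and the rest of the CMD line are my assumptions about a typical install, so keep whatever your own file already has and just append the options:

```
# Sketch only: section name and paths are assumed, not taken from a real file.
[bbpage]
        ENVFILE /path/to/hobbitserver.cfg
        NEEDS hobbitd
        CMD hobbitd_channel --channel=page --log=$BBSERVERLOGS/page.log hobbitd_alert --checkpoint-file=$BBTMP/alert.chk --checkpoint-interval=600
```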
> 2) When a new alert was first seen (also after a restart of Hobbit), the
> duration was reset to 0 - instead of using the information Hobbit
> already had about when the status change occurred. I've changed this
> in the code, so that it picks up the duration of the alert from the
> timestamp we keep for when the last status change happened.
OK, but that useful addition is for upcoming releases.
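To make sure I understand the change in 2), here is a minimal Python sketch of the two ways of computing an alert's duration. It is not Hobbit's actual code; the function names, the timestamps and the 20-minute threshold are all made up for illustration.

```python
# Illustrative sketch only; names and values are hypothetical, not Hobbit's code.

def duration_reset_on_first_sight(now, first_seen):
    """Old behaviour: the clock starts when the alert module first sees the alert."""
    return now - first_seen


def duration_from_status_change(now, last_change):
    """New behaviour: the clock starts at the last recorded status change."""
    return now - last_change


def pager_due(duration_seconds, threshold_minutes=20):
    """True once the alert has lasted longer than the threshold."""
    return duration_seconds > threshold_minutes * 60


# Status went red at t=1000 s; Hobbit was restarted at t=2500 s and the
# alert module sees the (still red) status again right away.
last_change = 1000
restart = 2500
now = restart

old = duration_reset_on_first_sight(now, first_seen=restart)
new = duration_from_status_change(now, last_change=last_change)

print(old, pager_due(old))
print(new, pager_due(new))
```

With the old calculation the restart makes the alert look brand new, so a rule that waits 20 minutes starts counting from zero again; with the new one the 25 minutes that have already passed still count.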
However, I think I found out why the entire problem showed up in the
first place. I had an alert config that first sent mail when an event
occurred and, if that was not dealt with, ran a pager script 20
minutes later. After an evening of applying (OS) patches, a reboot,
etc., it did not work anymore. Eventually I thought that it had to do
with an alert-config modification, which resulted in this
email conversation.
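The kind of rule I mean looks roughly like the sketch below. It is not my real configuration: the host name, mail address, script path and pager number are placeholders, and it assumes the MAIL, SCRIPT and DURATION keywords as described in the hobbit-alerts.cfg man page.

```
# Sketch only: all names, paths and numbers are placeholders.
HOST=www.example.com
        # Mail as soon as the event occurs.
        MAIL admin@example.com
        # Run the pager script once the event has lasted more than 20 minutes.
        SCRIPT /usr/local/bin/pager.sh 5551234 DURATION>20
```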
As suggested, I checked the alerttrace.log, but could not find a
reason why this problem happened (I changed the pager script to mail,
but with no result). It *does* work fine when *all* the alerts are
processed at the same time!
Exploring the mailing list and the Changes file for each version, I think
it comes down to a known bug in Hobbit that is to be fixed in
4.1.2; see my mail from August 19th, 11:42.
Since we are running 4.0.4, I am wondering what the wise thing to do is.
The workaround works fine for now (we are a 24*7 university), so I am
thinking of waiting until 4.1.2 reaches stable status, since 4.1.1
does not fix this particular bug.
Regards, Peter