
Re: [hobbit] Alert Rules - DURATION not working



Henrik,

Thank you so much for replying. I caused a yellow alarm for procs on host rsoimpm1, and I am expecting the rule to fire after 15 minutes. Here is what I see in the log file, in more detail:

2005-02-01 15:17:29 hobbitd_alert: Got message 37 @@page#37|1107271049.602362|166.34.57.239|rsoimpm1|procs|166.34.57.239|1107272849|yellow|green|1107271049|CAY/pmservers|947420
2005-02-01 15:17:29 Got page message from rsoimpm1:procs
2005-02-01 15:17:29 Alert status changed from 0 to 1
2005-02-01 15:17:29 criteriamatch rsoimpm1:procs %.*:(NULL):(NULL)
2005-02-01 15:17:29 pcre_exec returned 1
2005-02-01 15:17:29 Checking explicit color setting 10000000020 against 4 gives 1
2005-02-01 15:17:29 Found a first matching rule
2005-02-01 15:17:29 criteriamatch rsoimpm1:procs (NULL):(NULL):(NULL)
2005-02-01 15:17:29 event start: 1107271049, failed minduration 0<900
2005-02-01 15:17:29 criteriamatch rsoimpm1:procs (NULL):(NULL):(NULL)
2005-02-01 15:17:29 event start: 1107271049, failed minduration 0<39225600
2005-02-01 15:17:29 criteriamatch rsoimpm1:procs (NULL):(NULL):(NULL)
2005-02-01 15:17:29 Checking explicit color setting 10000000040 against 4 gives 0
2005-02-01 15:17:29 No more secondary matching rule
2005-02-01 15:17:29 1 alerts to go
2005-02-01 15:17:29 Compiling regex .*
2005-02-01 15:17:29 criteriamatch rsoimpm1:procs %.*:(NULL):(NULL)
2005-02-01 15:17:29 pcre_exec returned 1
2005-02-01 15:17:29 Checking explicit color setting 10000000020 against 4 gives 1
2005-02-01 15:17:29 Found a first matching rule
2005-02-01 15:17:29 criteriamatch rsoimpm1:procs (NULL):(NULL):(NULL)
2005-02-01 15:17:29 event start: 1107271049, failed minduration 0<900
2005-02-01 15:17:29 criteriamatch rsoimpm1:procs (NULL):(NULL):(NULL)
2005-02-01 15:17:29 event start: 1107271049, failed minduration 0<39225600
2005-02-01 15:17:29 criteriamatch rsoimpm1:procs (NULL):(NULL):(NULL)
2005-02-01 15:17:29 send_alert rsoimpm1:procs state 0
2005-02-01 15:17:29 Checking explicit color setting 10000000040 against 4 gives 0
2005-02-01 15:17:29 criteriamatch rsoimpm1:procs %.*:(NULL):(NULL)
2005-02-01 15:17:29 No more secondary matching rule
2005-02-01 15:17:29 pcre_exec returned 1
2005-02-01 15:17:29 Checking explicit color setting 10000000020 against 4 gives 1
2005-02-01 15:17:29 Found a first matching rule
2005-02-01 15:17:29 criteriamatch rsoimpm1:procs (NULL):(NULL):(NULL)
2005-02-01 15:17:29 event start: 1107271049, failed minduration 0<900
2005-02-01 15:17:29 criteriamatch rsoimpm1:procs (NULL):(NULL):(NULL)
2005-02-01 15:17:29 event start: 1107271049, failed minduration 0<39225600
2005-02-01 15:17:29 criteriamatch rsoimpm1:procs (NULL):(NULL):(NULL)
2005-02-01 15:17:29 Checking explicit color setting 10000000040 against 4 gives 0
2005-02-01 15:17:29 No more secondary matching rule


I caused the yellow alarm at 15:17; so far, so good. The alert status changed, the criteria, regex, and color all matched, and a rule was found. The minduration check then fails because the event has lasted 0 seconds, which is less than the required 900 seconds (15 minutes). Note that I added to the debug print statement in the source code.
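
For reference, the check that the "failed minduration 0<900" lines appear to describe can be sketched as below. This is only an illustration based on the log output, not the hobbitd_alert source; the function name passes_minduration is made up for the example.

```python
def passes_minduration(event_start: int, now: int, minduration: int) -> bool:
    """Return True once the event has lasted at least `minduration` seconds.

    Hypothetical helper; it mirrors the "failed minduration 0<900" log
    lines, where the check fails while elapsed < minduration.
    """
    return (now - event_start) >= minduration


event_start = 1107271049  # "event start" value taken from the log

# At the moment the status turns yellow, both thresholds from the log fail.
print(passes_minduration(event_start, event_start, 900))        # False
print(passes_minduration(event_start, event_start, 39225600))   # False

# Fifteen minutes later the 900-second threshold would pass.
print(passes_minduration(event_start, event_start + 900, 900))  # True

# The second threshold is 454 days long, so it would not pass any time soon.
print(39225600 // 86400)  # 454
```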

2005-02-01 15:22:29 hobbitd_alert: Got message 58 @@page#58|1107271349.301483|166.34.57.239|rsoimpm1|procs|166.34.57.239|1107273149|yellow|yellow|1107271049|CAY/pmservers|947420
2005-02-01 15:22:29 Got page message from rsoimpm1:procs
2005-02-01 15:22:29 0 alerts to go


2005-02-01 15:27:29 hobbitd_alert: Got message 79 @@page#79|1107271649.155212|166.34.57.239|rsoimpm1|procs|166.34.57.239|1107273449|yellow|yellow|1107271049|CAY/pmservers|947420
2005-02-01 15:27:29 Got page message from rsoimpm1:procs
2005-02-01 15:27:29 0 alerts to go


2005-02-01 15:32:28 hobbitd_alert: Got message 101 @@page#101|1107271948.980583|166.34.57.239|rsoimpm1|procs|166.34.57.239|1107273748|yellow|yellow|1107271049|CAY/pmservers|947420
2005-02-01 15:32:28 Got page message from rsoimpm1:procs
2005-02-01 15:32:28 0 alerts to go


2005-02-01 15:37:28 hobbitd_alert: Got message 123 @@page#123|1107272248.884069|166.34.57.239|rsoimpm1|procs|166.34.57.239|1107274048|yellow|yellow|1107271049|CAY/pmservers|947420
2005-02-01 15:37:28 Got page message from rsoimpm1:procs
2005-02-01 15:37:28 0 alerts to go
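
To make the timing easier to follow, here is a rough sketch that computes how long the event had lasted when each page message arrived. It assumes that the second pipe-separated field is the message timestamp and that the tenth field is the event start (it equals the "event start: 1107271049" value in the log); that field interpretation is a guess from the log output, not something taken from the Hobbit documentation.

```python
def elapsed_seconds(page_message: str) -> int:
    """Seconds between the assumed event-start field and the message timestamp."""
    fields = page_message.split("|")
    sent_at = int(float(fields[1]))  # assumed: time the message was sent
    event_start = int(fields[9])     # assumed: time the status last changed
    return sent_at - event_start


# The five page messages from the log, with the wrapped lines joined.
messages = [
    "@@page#37|1107271049.602362|166.34.57.239|rsoimpm1|procs|166.34.57.239|1107272849|yellow|green|1107271049|CAY/pmservers|947420",
    "@@page#58|1107271349.301483|166.34.57.239|rsoimpm1|procs|166.34.57.239|1107273149|yellow|yellow|1107271049|CAY/pmservers|947420",
    "@@page#79|1107271649.155212|166.34.57.239|rsoimpm1|procs|166.34.57.239|1107273449|yellow|yellow|1107271049|CAY/pmservers|947420",
    "@@page#101|1107271948.980583|166.34.57.239|rsoimpm1|procs|166.34.57.239|1107273748|yellow|yellow|1107271049|CAY/pmservers|947420",
    "@@page#123|1107272248.884069|166.34.57.239|rsoimpm1|procs|166.34.57.239|1107274048|yellow|yellow|1107271049|CAY/pmservers|947420",
]

for message in messages:
    print(elapsed_seconds(message))
# prints 0, 300, 600, 899, 1199 (one value per line)
```

Under those assumptions, the 15:37:28 message is the first one that arrives more than 900 seconds after the event start, so that is where a 15-minute rule would be expected to fire.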


So it looks like nothing happens afterwards? Hopefully, I got all the relevant parts of the log file; I didn't want the posting to be too long. Any ideas?


~David Gore


Henrik Stoerner wrote:
On Tue, Feb 01, 2005 at 01:02:58AM +0000, David Gore wrote:

As you can see from the output below, a DURATION of '15m' translates to 653760.


I'll look into that
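
For comparison, this is a minimal sketch of the translation one would expect for a value such as '15m'. It assumes that a bare number means minutes and that the suffixes m, h, d and w mean minutes, hours, days and weeks; both the unit table and the function name duration_to_minutes are assumptions made for the example, not taken from the Hobbit source.

```python
import re

# Assumed unit table: minutes per suffix (a bare number is treated as minutes).
_MINUTES_PER_UNIT = {"": 1, "m": 1, "h": 60, "d": 1440, "w": 10080}


def duration_to_minutes(text: str) -> int:
    """Convert a value such as '15', '15m' or '8h' into minutes."""
    match = re.fullmatch(r"(\d+)([mhdw]?)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    value, unit = match.groups()
    return int(value) * _MINUTES_PER_UNIT[unit]


print(duration_to_minutes("15m"))  # 15
print(duration_to_minutes("15"))   # 15
print(duration_to_minutes("8h"))   # 480
```

With that reading, '15m' would come out as 15 minutes (900 seconds) rather than 653760.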


Either we have something configured wrong or DURATION is broken?


HOST=% COLOR=yellow
       MAIL somebody (at) somehost.com REPEAT=8h DURATION>15
       MAIL anybody (at) anyhost.com REPEAT=8h DURATION>15m


"HOST=%" is definitely wrong. "HOST=%.*" is what you want.


Henrik
