Use Hobbit in operation center with critical systems view
Gräub Roland
roland.graeub at rtc.ch
Mon Oct 15 09:29:53 CEST 2007
Hi all
We are planning a full switch from our "production" systems management tool to Hobbit. Most systems currently have both clients installed, and the plan is to use only Hobbit.
In our environment, the Operation Center always makes a call when an alert shows up on their Event Console, and then acknowledges the alert. After that, the alert is no longer visible to the operators.
At the moment the Operation Center does not pay attention to Hobbit, but of course lots of other people in our organization use Hobbit and are happy with this great tool.
With the critical systems view, I think Hobbit offers an ideal view for our Operation Center.
The following questions/thoughts came up when we looked closer:
Acknowledge;
If the operators acknowledge an alert in the critical systems view, the acknowledgement is fixed for the given time, even if the status changes.
When a problem is fixed and then goes red/yellow again, it will not show up in the critical view until the acknowledgement time has expired.
There should be an option to acknowledge an alert until the status changes (like "disable until OK").
The Host-ack option seems to be broken: on my system only one test is acknowledged even though the Host-ack checkbox is selected.
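
To make the requested behaviour concrete, here is a minimal sketch in Python of the difference between a fixed-time acknowledgement and an acknowledgement that is dropped on a status change. This is not Hobbit code; the names Ack, next_ack, is_visible and the until_status_change flag are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from typing import Optional

ALERT_COLORS = ("red", "yellow")


@dataclass
class Ack:
    expires_at: float          # time (epoch seconds) at which the ack runs out
    acked_color: str           # color of the status when it was acknowledged
    until_status_change: bool  # hypothetical option, not an existing Hobbit setting


def next_ack(ack: Optional[Ack], new_color: str) -> Optional[Ack]:
    """Return the ack that is still in force after the status reports new_color."""
    if ack is None:
        return None
    if ack.until_status_change and new_color != ack.acked_color:
        return None  # the status changed, so the ack is dropped
    return ack       # a fixed-time ack survives status changes


def is_visible(color: str, now: float, ack: Optional[Ack]) -> bool:
    """Return True if the status should be shown in the critical view."""
    if color not in ALERT_COLORS:
        return False
    return ack is None or now >= ack.expires_at
```

With until_status_change set, a sequence red (acked), green, red makes the second red visible again immediately. With a fixed-time ack, the second red stays hidden until expires_at, which is the behaviour described above.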
Log;
A log/report for the critical view is missing. A report with information about the alerts and the acknowledgements made in the critical systems view would be helpful.
Definition (Edit Critical Systems);
The easiest way for us: create standard definitions and add hosts to these templates. This works fine.
But I miss a connection between alerts and the critical view definition, something like an option in hobbit-alerts.cfg to define that a rule is also valid for the critical view.
For example, send an email when an alert shows up in the critical view, with all the possibilities from hobbit-alerts.cfg.
Special case: messages missed or noticed late by the Operation Center;
Currently, some applications/scripts send alerts to the Console View, and the Operation Center makes an alert call for each event.
A problem in Hobbit/BB is that when a status that is already red changes, the Operation Center does not notice until the acknowledgement time runs out and they make the alert call again.
This can happen, for example, in the disk status test (a second filesystem goes red) or with nested tests/logfiles. With the Event Console they get two messages (one for each filesystem).
Is this a topic for anyone else? How is this handled in your organisation?
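
As a rough illustration of what we would need, the following Python sketch compares the items that were red when the status was acknowledged with the items that are red now. It is not part of Hobbit; the function name new_red_items and the idea of storing the acknowledged items are assumptions made for this example.

```python
from typing import Set


def new_red_items(acked_items: Set[str], current_items: Set[str]) -> Set[str]:
    """Return the red items that appeared after the status was acknowledged."""
    return current_items - acked_items


# Hypothetical example: /var was red when the operator acknowledged the disk
# status, and /home turned red afterwards.
acked = {"/var"}
current = {"/var", "/home"}
needs_new_call = bool(new_red_items(acked, current))
```

A non-empty result would mean the operators should see the status again and make a new alert call, even though the overall color stayed red.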
Regards,
Roland