[Xymon] Fwd: Green status
Steven Carr
sjcarr at gmail.com
Tue Aug 14 20:33:15 CEST 2012
You could also flag the nodes with the "dialup" flag, which allows the
nodes to go down without Xymon complaining that they are down. You will
still have to write your own server-side script that determines whether a
host is down when it shouldn't be and then raises an alert for it. Xymon
can't do that out of the box, and I'm not sure any other monitoring system
can either; most monitoring solutions expect nodes to be up 100% of the
time and only make exceptions for scheduled downtime and the like.
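
A minimal sketch of such a server-side script, in Python, might look like
the following. It assumes the scheduler state for each node has already
been collected (the thread mentions pbsnodes reporting the state "power"
for nodes the scheduler shut down) and that reachability has been checked
separately. The test name "nodestate", the function names, and the choice
of green for intentionally powered-down nodes are illustrative assumptions,
not anything Xymon defines. The message follows the "status" format
described in the xymon(1) man page; on 4.2.x installs the client binary
is "bb" rather than "xymon".

```python
import subprocess

# State that pbsnodes reports for scheduler-powered-down nodes (per the thread).
POWERED_DOWN_STATE = "power"


def node_color(scheduler_state, reachable):
    """Return "red" only when a node is down and the scheduler did not power it off."""
    if reachable:
        return "green"
    if scheduler_state == POWERED_DOWN_STATE:
        # Intentional power-down: report green, as the original request asked.
        return "green"
    return "red"


def status_message(host, test, color, text):
    """Build a Xymon status message; dots in the hostname become commas."""
    return "status {}.{} {} {}".format(host.replace(".", ","), test, color, text)


def send_to_xymon(server, message, binary="xymon"):
    """Hand the message to the Xymon client binary (assumed to be on PATH)."""
    subprocess.run([binary, server, message], check=True)
```

For example, node_color("free", False) gives "red", while
node_color("power", False) stays "green". The resulting message would be
passed to send_to_xymon() along with the address of the Xymon server.
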
Steve
On 14 August 2012 19:13, pankaj dorlikar <pankaj.dorlikar at gmail.com> wrote:
> Hi,
>
> Thanks for the reply. We have clients installed on all the nodes. At any
> point in time, the nodes on which no job is running will be powered
> down. If a new job comes in, these nodes will be powered up, and other
> nodes that are not running any job will go down.
>
>
> On 8/14/12, Steven Carr <sjcarr at gmail.com> wrote:
> > How are you monitoring the nodes? Do you have a Xymon client on each of
> > the nodes, or are you doing a simple "ping" check to the node?
> >
> > If you are just doing a simple ping check then, off the top of my head, I
> > would make all nodes "noconn" in hosts.cfg so Xymon doesn't ping them
> > anymore, and write a script that uses the data you have to ping the
> > nodes and work out whether each node should be up or not. If a node is
> > down when it shouldn't be, trigger a red alarm for that node.
> >
> > Steve
> >
> >
> >
> > On 14 August 2012 12:05, pankaj dorlikar <pankaj.dorlikar at gmail.com> wrote:
> >
> >> ---------- Forwarded message ----------
> >> From: pankaj dorlikar <pankaj.dorlikar at gmail.com>
> >> Date: Tue, 14 Aug 2012 16:34:00 +0530
> >> Subject: Re: [Xymon] Green status
> >> To: Ryan Novosielski <novosirj at umdnj.edu>
> >>
> >> Hi,
> >>
> >> Thank you for the reply.
> >> But at any point in time, only some of the nodes will be down and all
> >> the other nodes will be up. If the server itself goes down, the
> >> monitoring of the rest of the working nodes will be affected.
> >>
> >> On 8/14/12, Ryan Novosielski <novosirj at umdnj.edu> wrote:
> >> > -----BEGIN PGP SIGNED MESSAGE-----
> >> > Hash: SHA1
> >> >
> >> > What he is saying is that if there is an event that takes place where
> >> > you can execute a script at the time it happens, you can disable the
> >> > server by using the main binary's "disable" function. This binary used
> >> > to be called "bb" but is now called "xymon" -- take a look at its man
> >> > page to see how to send a disable message.
> >> >
> >> > On 08/14/2012 03:55 AM, pankaj dorlikar wrote:
> >> >> Hi,
> >> >>
> >> >> Thank you for providing pointers and important clues.
> >> >>
> >> >> 1) Query regarding the "server-side test": We can determine which
> >> >> nodes are down as per the scheduler's instructions. But how will
> >> >> this information help in setting the blue/green color for those
> >> >> nodes on the Xymon web page? I mean, how do we send this data to
> >> >> the Xymon server? Also, will it cover all the tests?
> >> >>
> >> >> 2) How can a client send a "disable" command to the server?
> >> >>
> >> >> thank you
> >> >>
> >> >> -pankaj
> >> >>
> >> >>
> >> >> On 8/14/12, cleaver at terabithia.org <cleaver at terabithia.org> wrote:
> >> >>>> We are using xymon-4.2.2 on rhel 5.2 server and more than 200
> >> >>>> clients (HPC Cluster nodes).
> >> >>>>
> >> >>>> Our requirement is :
> >> >>>>
> >> >>>> -> If a node is powered down by the scheduler to save power,
> >> >>>> Xymon should show its state as green, and likewise for the
> >> >>>> other tests of the same node.
> >> >>>>
> >> >>>> Nodes powered down by the scheduler can be identified with the
> >> >>>> pbsnodes command, which shows their state as "power".
> >> >>>>
> >> >>>> -> If a node goes down for any reason other than being powered
> >> >>>> down by the scheduler, it should show red like normal
> >> >>>> clients.
> >> >>>>
> >> >>>
> >> >>>
> >> >>> Assuming your scheduler can have shell script hooks attached to
> >> >>> events, I'd add something to send a "disable" command before it
> >> >>> brings a node down, and then re-enable as it comes back up. If
> >> >>> the nodes are being powered down without state being saved (eg,
> >> >>> not suspending/resuming themselves), then just disable "until
> >> >>> OK", otherwise I'd use some arbitrary future value.
> >> >>>
> >> >>> Relevant tests will be blue (not green, as requested), but that
> >> >>> will be handled as a non-event for SLA purposes.
> >> >>>
> >> >>> Separately, it might be a good idea to have a separate
> >> >>> server-side test that sends node state about each node to xymon
> >> >>> independent of the node itself. That test is a fine place to put
> >> >>> logic as well.
> >> >>>
> >> >>>
> >> >>> HTH,
> >> >>>
> >> >>> -jc
> >> >>>
> >> >>>
> >> >>
> >> >>
> >> >
> >> >
> >> > - --
> >> > - ---- _ _ _ _ ___ _ _ _
> >> > |Y#| | | |\/| | \ |\ | | |Ryan Novosielski - Sr. Systems Programmer
> >> > |$&| |__| | | |__/ | \| _| |novosirj at umdnj.edu - 973/972.0922(2-0922)
> >> > \__/ Univ. of Med. and Dent.|IST/EI-Academic Svcs. - ADMC 450, Newark
> >> > -----BEGIN PGP SIGNATURE-----
> >> > Version: GnuPG v1.4.11 (GNU/Linux)
> >> > Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
> >> >
> >> > iEYEARECAAYFAlAqFsEACgkQmb+gadEcsb53xACfVP9x3ThR0zKtrYFVfVhHzJoI
> >> > JNQAoLUaRTt3AcQmrhoArknmclS7WkPw
> >> > =jBNe
> >> > -----END PGP SIGNATURE-----
> >> >
> >>
> >>
> >> --
> >> Pankaj V. Dorlikar
> >>
> >
>
>
> --
> Pankaj V. Dorlikar
>