[hobbit] stale alerts

Gore, David W (David) david.gore at verizonbusiness.com
Thu Nov 15 13:46:32 CET 2007


I am running 4.2.0 with the allinone patch.  I do have a suspicion that
external server side scripts could contribute to the stale alerts.  In
my case, specifically a script that ssh's to a remote host and executes
a shell script on the remote host.  It may also happen in conjunction
with these errors in the logs, which someone else recently reported:
 
page.log.1:2007-11-05 16:20:53 hobbitd_alert: Got message 14425, expected 14424
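
To check whether these sequence errors line up in time with the stale alerts, the logs can be scanned for that message. The following is a minimal Python sketch, assuming the line format is exactly as quoted above; the function name find_sequence_gaps and the regular expression are illustrative only and are not part of Hobbit.

```python
import re

# Pattern derived from the single log line quoted above. Whether other
# Hobbit versions or workers use the same wording is an assumption.
SEQ_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<worker>\S+): Got message (?P<got>\d+), expected (?P<expected>\d+)"
)


def find_sequence_gaps(lines):
    """Return (timestamp, worker, got - expected) for each sequence error."""
    gaps = []
    for line in lines:
        match = SEQ_RE.search(line)
        if match:
            got = int(match.group("got"))
            expected = int(match.group("expected"))
            gaps.append((match.group("ts"), match.group("worker"), got - expected))
    return gaps


if __name__ == "__main__":
    sample = [
        "page.log.1:2007-11-05 16:20:53 hobbitd_alert: Got message 14425, expected 14424",
        "2007-11-14 10:26:13 Stale alert for host-name:procs dropped",
    ]
    for ts, worker, skipped in find_sequence_gaps(sample):
        print(f"{ts} {worker} skipped {skipped} message(s)")
```

Comparing the timestamps it reports with the times of the repeated yellow notifications would show whether the two problems coincide.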

As soon as the issue with tooltips is resolved, I will be moving our
primary hobbit to the snapshot release.
 
~David


________________________________

	From: Josh Luthman [mailto:josh at imaginenetworksllc.com] 
	Sent: Wednesday, November 14, 2007 17:18
	To: hobbit at hswn.dk
	Subject: Re: [hobbit] stale alerts
	
	
	This has never happened to me - are the two of you using the
	4.2.0 release?
	
	Josh
	 
	 
	
	On 11/14/07, Gore, David W (David) <david.gore at verizonbusiness.com> wrote:


		> -----Original Message-----
		> From: Gary Baluha [mailto: gumby3203 at gmail.com]
		> Sent: Wednesday, November 14, 2007 16:30
		> To: hobbit at hswn.dk
		> Subject: Re: [hobbit] stale alerts
		> 
		> Order of events:
		> 1am: alert went yellow, email was sent out
		> 1:15am: alert recovered
		> 2am, and each additional hour: email was sent out saying
		> alert was yellow (it was actually showing green)
		> 10:30a: I restart Hobbit and get the "stale alert" message,
		> and it finally stops sending alerts.  Recovery email was
		> never sent out, even though it is in the alert rules.
		>
		> On Nov 14, 2007 11:22 AM, Josh Luthman 
		> <josh at imaginenetworksllc.com> wrote:
		> > You're saying it went yellow, then green.  The log tells you it
		> > sent an alert when it was yellow.
		> >
		> > I'm not sure I'm seeing the problem here =/  It sent an alert first
		> > when it was yellow and another when it switched to green to inform
		> > you it recovered, correct?
		> >
		> >
		> >
		> > On 11/14/07, Gary Baluha <gumby3203 at gmail.com> wrote:
		> > > Yes, the alert history shows it went yellow and then 5 minutes
		> > > later recovered.  The web page is showing everything correctly.
		> > > However, when I check the notifications.log file, I can see that
		> > > it was still sending alerts about it being yellow, even though it
		> > > was definitely green.
		> > >
		> > > On Nov 14, 2007 10:38 AM, Josh Luthman <josh at imaginenetworksllc.com> wrote:
		> > > > Click on the host's test and click on history - was it red at all?
		> > > >
		> > > > Are the WWW pages updating?  Look in the top right corner of the
		> > > > page once you click on the host's test link.
		> > > >
		> > > >
		> > > >
		> > > >  On 11/14/07, Gary Baluha <gumby3203 at gmail.com> wrote:
		> > > > >
		> > > > > I noticed this morning that our Hobbit server was sending out
		> > > > > alerts for the process check for a machine that was actually
		> > > > > shown as green on the Hobbit web page.  I checked on the
		> > > > > monitored machine, and the alert was indeed green, yet the
		> > > > > server was still sending out emails as though it were in a
		> > > > > yellow state.  I restarted the Hobbit client on the monitored
		> > > > > machine, and then restarted the Hobbit server on the server.
		> > > > > After doing this, I noticed the following in the page.log
		> > > > > file:
		> > > > >
		> > > > > 2007-11-14 10:26:13 Stale alert for host-name:procs dropped
		> > > > >
		> > > > > (I changed the actual host name to "host-name" to protect the
		> > > > > innocent)
		> > > > > What exactly does this mean?  Before I restarted the Hobbit
		> > > > > server process, I manually edited the alert.chk temp file and
		> > > > > removed the erroneous alert, but that didn't correct the
		> > > > > problem.  It was only after I restarted the Hobbit server
		> > > > > process that it cleared the alert.
		> > > > > Is this a bug in the 4.2.0 code, or is there something else
		> > > > > going on here?
		>
		
		Gary,
		
		We get those too, along with leftover semaphores and shared memory
		segments when we stop the hobbit server.  The snapshot seems much
		better, but with tooltips and host descriptions pushing our display to
		the far right of the screen we cannot use it right now.
		
		~David
		
		Ps. Just to let you know it's not just your setup.
		
		
		
		




	-- 
	Josh Luthman
	Office: 937-552-2340
	Direct: 937-552-2343
	1100 Wayne St
	Suite 1337
	Troy, OH 45373
	
	Those who don't understand UNIX are condemned to reinvent it, poorly.
	--- Henry Spencer
