[hobbit] hobbitd_rrd is not looking good

Henrik Stoerner henrik at hswn.dk
Fri Nov 11 14:10:04 CET 2005


On Fri, Nov 11, 2005 at 03:58:20AM -0500, Bob Ababurko wrote:
> I have no real idea what I broke, but maybe someone can tell me.  I made 
> a change in my bb-services file as I was trying to define a different 
> smtp service that expected a different return value than default smtp.  
> I called it something atypical and things started to turn red and when I 
> clicked on the red faces, there were Internal Server Error messages.  

Changing bb-services should not break things like that, so I'm pretty
sure this is a coincidence. Or at least - you're not to blame for
hobbitd crashing :-)

> About that same time, the hobbitd and hobbitd_rrd turned red.  I 
> immediately changed the values back to where they were (removed the 
> additional smtp definition as well as removed the reference in 
> bb-hosts) and then restarted hobbit.  The hobbitd_rrd does not seem to 
> be coming back. 
> 
> So right now, hobbitd_rrd is purple and when you click to get more 
> information, it says 'Program Crashed   Fatal Signal Caught'

If it is purple, it is safe to remove it with the command
   bb 127.0.0.1 "drop HOBBIT.SERVER.HOSTNAME hobbitd_rrd"

It ends up purple because hobbitd_rrd normally does not generate any
status column. The only time it does is when it crashes; it's a kind of
"Mayday" signal to make sure you notice that something bad has happened
and report it to me. Nothing updates that status afterwards, so it goes
stale and turns purple.

You can always check the "ps" listing and see if there are any hobbitd_rrd
processes running - a standard install will have two of them, plus two
hobbitd_channel processes feeding them.

henrik at osiris:~$ ps -u hobbit
    PID TTY          TIME CMD
  10756 ?        00:00:00 hobbitlaunch
  10757 ?        00:02:00 hobbitd
  10762 ?        00:00:07 hobbitd_channel
  10763 ?        00:00:01 hobbitd_filestore
  10764 ?        00:00:00 hobbitd_channel
  10765 ?        00:00:01 hobbitd_channel
  10776 ?        00:00:00 hobbitd_alert
  10778 ?        00:00:00 hobbitd_history
  11581 ?        00:00:07 hobbitd_channel
  11582 ?        00:00:05 hobbitd_rrd
  11583 ?        00:00:01 hobbitd_channel
  11699 ?        00:00:01 hobbitd_channel
  11700 ?        00:00:05 hobbitd_client
  11584 ?        00:00:02 hobbitd_rrd
  21402 ?        00:00:00 sh
  21403 ?        00:00:00 vmstat
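
The same check can be scripted. This is only a rough sketch: the helper
name count_workers is made up for the illustration and is not part of
Hobbit, and the usage line assumes the server runs as user "hobbit" as
in the listing above.

```shell
# Count how many lines on stdin exactly match the given command name.
# count_workers is a hypothetical helper, not a Hobbit tool.
count_workers() {
    # -x matches whole lines only, so "hobbitd" does not match "hobbitd_rrd".
    # grep -c still prints 0 when nothing matches; "|| true" hides its
    # non-zero exit status in that case.
    grep -c -x -- "$1" || true
}

# Demo on sample data shaped like the listing above.
printf '%s\n' hobbitd hobbitd_channel hobbitd_rrd hobbitd_rrd \
    | count_workers hobbitd_rrd     # prints 2

# On the Hobbit server itself (assumption: user "hobbit"):
#   ps -u hobbit -o comm= | count_workers hobbitd_rrd
```

If the count on a standard install is lower than 2, one of the
hobbitd_rrd workers is not running.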

> rrd-data.log at around the time that this happened there is an entry 
> that says, '2005-11-11 01:33:35 Worker process died with exit code 134, 
> terminating'.  I am not sure if that is related.  I do not seem to have 
> any COREFILES in my tmp dir unless they may have been erased when I 
> restarted hobbit....probably not, but I don't know.

They are not erased automatically, so they ought to be there ... could
you run
    find ~hobbit -name "core*"
and see if anything shows up?
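
On the exit code in that log line: by shell convention an exit status
above 128 means the process was killed by signal (status - 128). If that
convention applies to the value logged here (an assumption, it has not
been checked against the hobbitd_channel code), 134 would be signal 6,
which is SIGABRT on Linux. The sketch below only does that arithmetic;
signal_from_exit is a name invented for the example.

```shell
# Translate a shell-style exit status into a signal number.
# signal_from_exit is a hypothetical helper, not a Hobbit tool.
signal_from_exit() {
    code=$1
    if [ "$code" -gt 128 ]; then
        # 128 + N convention: the process was terminated by signal N.
        echo $((code - 128))
    else
        # Ordinary exit status, no signal involved.
        echo 0
    fi
}

signal_from_exit 134    # prints 6

# Ask the shell for the name of that signal (ABRT on Linux).
kill -l "$(signal_from_exit 134)"
```

SIGABRT writes a core file by default, but only if core dumps are
enabled: if "ulimit -c" reports 0 in the environment that starts Hobbit,
no core file will be written.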


Regards,
Henrik



