[hobbit] hobbitd_rrd is not looking good
Henrik Stoerner
henrik at hswn.dk
Fri Nov 11 14:10:04 CET 2005
On Fri, Nov 11, 2005 at 03:58:20AM -0500, Bob Ababurko wrote:
> I have no real idea what I broke, but maybe someone can tell me. I made
> a change in my bb-services file as I was trying to define a different
> smtp service that expected a different return value than default smtp.
> I called it something atypical and things started to turn red and when I
> clicked on the red faces, there were Internal Server Error messages.
Changing bb-services should not break things like that, so I'm pretty
sure this is a coincidence. Or at least - you're not to blame for
hobbitd crashing :-)
> About that same time, the hobbitd and hobbitd_rrd turned red. I
> immediately changed the values back to where they were (removed the
> additional smtp definition as well as the reference in
> bb-hosts) and then restarted hobbit. The hobbitd_rrd does not seem to
> be coming back.
>
> So right now, hobbitd_rrd is purple and when you click to get more
> information, it says 'Program Crashed Fatal Signal Caught'
If it is purple, it is safe to remove it with the command

    bb 127.0.0.1 "drop HOBBIT.SERVER.HOSTNAME hobbitd_rrd"

It ends up purple because hobbitd_rrd normally does not generate a
status column at all. The only time it does is when it crashes; it's a
kind of "Mayday" signal to make sure you notice that something bad has
happened and report it to me.
You can always check the "ps" listing and see if there are any hobbitd_rrd
processes running - a standard install will have two of them, plus two
hobbitd_channel processes feeding them.
henrik@osiris:~$ ps -u hobbit
  PID TTY          TIME CMD
10756 ?        00:00:00 hobbitlaunch
10757 ?        00:02:00 hobbitd
10762 ?        00:00:07 hobbitd_channel
10763 ?        00:00:01 hobbitd_filestore
10764 ?        00:00:00 hobbitd_channel
10765 ?        00:00:01 hobbitd_channel
10776 ?        00:00:00 hobbitd_alert
10778 ?        00:00:00 hobbitd_history
11581 ?        00:00:07 hobbitd_channel
11582 ?        00:00:05 hobbitd_rrd
11583 ?        00:00:01 hobbitd_channel
11699 ?        00:00:01 hobbitd_channel
11700 ?        00:00:05 hobbitd_client
11584 ?        00:00:02 hobbitd_rrd
21402 ?        00:00:00 sh
21403 ?        00:00:00 vmstat
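
If you would rather not count by eye, the check can be scripted. This is
only a rough sketch: count_procs is a made-up helper name, not part of
Hobbit, and it assumes the command name is the last column of the ps
output, as in the listing above.

```shell
# Count how many lines of a `ps` listing (read from stdin) have the
# given command name in their last column.
# count_procs is a hypothetical helper, not something Hobbit ships.
count_procs() {
    awk -v name="$1" '$NF == name { n++ } END { print n + 0 }'
}

# Intended use on the Hobbit server (assumes the user is named "hobbit"):
#   ps -u hobbit | count_procs hobbitd_rrd
```

On a standard install as described above, the hobbitd_rrd count should
come out as two; a count of zero would mean the workers are not running.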
> In rrd-data.log, at around the time that this happened, there is an entry
> that says, '2005-11-11 01:33:35 Worker process died with exit code 134,
> terminating'. I am not sure if that is related. I do not seem to have
> any COREFILES in my tmp dir unless they may have been erased when I
> restarted hobbit....probably not, but I don't know.
They are not erased automatically, so they ought to be there ... could
you run

    find ~hobbit -name "core*"

and see if anything shows up?
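
A slightly more verbose variant of that search, as a sketch only:
list_cores is a hypothetical helper name, and the "core*" pattern is
just the one used above, so it will also match unrelated files whose
names happen to start with "core".

```shell
# List regular files named core* under a directory, with size and
# modification time so they can be compared with the crash time in
# the log.
# list_cores is a hypothetical helper; $1 is the directory to search.
list_cores() {
    find "$1" -type f -name 'core*' -exec ls -l {} +
}

# Intended use on the Hobbit server:
#   list_cores ~hobbit
```

If this prints nothing, one possible explanation (a guess, not something
confirmed for this setup) is that the core file size limit of the
process was zero; "ulimit -c" shows the limit for the current shell.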
Regards,
Henrik