[hobbit] hobbitd_rrd is not looking good
Bob Ababurko
bob at phreakout.net
Fri Nov 11 19:56:14 CET 2005
Henrik Stoerner wrote:
>On Fri, Nov 11, 2005 at 03:58:20AM -0500, Bob Ababurko wrote:
>
>
>>I have no real idea what I broke, but maybe someone can tell me. I made
>>a change in my bb-services file as I was trying to define a different
>>smtp service that expected a different return value than default smtp.
>>I called it something atypical and things started to turn red and when I
>>clicked on the red faces, there were Internal Server Error messages.
>>
>>
>
>Changing bb-services should not break things like that, so I'm pretty
>sure this is a coincidence. Or at least - you're not to blame for
>hobbitd crashing :-)
>
>
>
>>About that same time, the hobbitd and hobbitd_rrd turned red. I
>>immediately changed the values back to where they were (removed the
>>additional smtp definition as well as the reference in
>>bb-hosts) and then restarted hobbit. The hobbitd_rrd does not seem to
>>be coming back.
>>
>>So right now, hobbitd_rrd is purple and when you click to get more
>>information, it says 'Program Crashed Fatal Signal Caught'
>>
>>
>
>If it is purple, it is safe to remove it with the command
> bb 127.0.0.1 "drop HOBBIT.SERVER.HOSTNAME hobbitd_rrd"
>
>The reason it ends up being purple is because normally hobbitd_rrd will
>not generate any status column. The only time it does is when it
>crashes; it's a kind of "Mayday" signal to make sure you notice that
>something bad has happened, and alert me to this.
>
>You can always check the "ps" listing and see if there are any hobbitd_rrd
>processes running - a standard install will have two of them, plus two
>hobbitd_channel processes feeding them.
>
>henrik at osiris:~$ ps -u hobbit
> PID TTY TIME CMD
> 10756 ? 00:00:00 hobbitlaunch
> 10757 ? 00:02:00 hobbitd
> 10762 ? 00:00:07 hobbitd_channel
> 10763 ? 00:00:01 hobbitd_filestore
> 10764 ? 00:00:00 hobbitd_channel
> 10765 ? 00:00:01 hobbitd_channel
> 10776 ? 00:00:00 hobbitd_alert
> 10778 ? 00:00:00 hobbitd_history
> 11581 ? 00:00:07 hobbitd_channel
> 11582 ? 00:00:05 hobbitd_rrd
> 11583 ? 00:00:01 hobbitd_channel
> 11699 ? 00:00:01 hobbitd_channel
> 11700 ? 00:00:05 hobbitd_client
> 11584 ? 00:00:02 hobbitd_rrd
> 21402 ? 00:00:00 sh
> 21403 ? 00:00:00 vmstat
>
>
>
>>In rrd-data.log, at around the time that this happened, there is an entry
>>that says, '2005-11-11 01:33:35 Worker process died with exit code 134,
>>terminating'. I am not sure if that is related. I do not seem to have
>>any COREFILES in my tmp dir unless they may have been erased when I
>>restarted hobbit....probably not, but I don't know.
>>
>>
>
>They are not erased automatically, so they ought to be there ... could
>you run a
> find ~hobbit -name "core*"
>and see if anything shows up ?
>
>
>Regards,
>Henrik
>
>
Ok, maybe I got mixed up about the name of the expected corefile. I DO
have a file called hobbitd_rrd.core, and it looks like it was created
at the time of 'the incident', so I think that is what I am looking
for. I was actually looking for something called COREFILE; I must have
misread. Now I cannot seem to find the web page that showed what to do
to review a corefile in tmp. Does anyone know what I should be doing to
read these files?
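
For reference, a core file is normally read by loading it into a
debugger together with the binary that produced it. The following is a
minimal sketch only, assuming gdb is installed; the function name
core_backtrace and both paths in the usage line are placeholders, not
taken from this thread. Note also that a pattern like "core*" only
matches names that begin with "core", so a file named hobbitd_rrd.core
needs something like "*core*" to be found.

```shell
# Hypothetical helper: print a stack backtrace from a core file.
# $1 = path to the binary that crashed, $2 = path to the core file.
core_backtrace() {
    bin="$1"
    core="$2"
    # Refuse to run unless both files exist, so gdb is not started blindly.
    if [ ! -f "$bin" ] || [ ! -f "$core" ]; then
        echo "usage: core_backtrace BINARY COREFILE (both must exist)" >&2
        return 2
    fi
    # -batch: run the given commands, then exit; -ex "bt": print the backtrace.
    gdb -batch -ex "bt" "$bin" "$core"
}
```

It could be run as, for example,
core_backtrace /path/to/hobbitd_rrd /path/to/tmp/hobbitd_rrd.core
with the real locations substituted. The backtrace it prints is what is
usually worth posting to the list.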
Now, is taking hobbitd_rrd out of the monitoring checks what I want?
Don't I want/need it in there for a complete hobbit (fixed, of
course)? I want my hobbit to be a healthy and fully functional hobbit. I
guess I am curious what went wrong and how to fix it. Maybe this has
something to do with the corefile, which I need to figure out how to
read so I can figure out why it crashed. Am I right here? Sounds logical.
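
One hint is already in the rrd-data.log entry quoted above. Assuming
the "exit code 134" there follows the usual shell convention of
128 + signal number, it corresponds to signal 6, which is SIGABRT (the
process called abort()). A small sketch of that arithmetic; the
function name signal_from_exit_code is made up for illustration:

```shell
# Hypothetical helper: translate a shell-style exit code (128 + N)
# into the name of signal N.
signal_from_exit_code() {
    code="$1"
    if [ "$code" -gt 128 ]; then
        # kill -l with a number prints the matching signal name.
        kill -l "$((code - 128))"
    else
        echo "exit code $code is not a signal-style status" >&2
        return 1
    fi
}

signal_from_exit_code 134
```

This only says how the worker died, not why; the backtrace from the
core file is still needed for that.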
I do have two hobbitd_rrd processes running, but I only checked for two
after dropping the hobbitd_rrd status. I actually do not
remember seeing two last night, but it is distinctly possible.
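
To make that check repeatable, the "ps" listing can be counted rather
than read by eye. A rough sketch, assuming the process name is the last
column of the listing as in Henrik's output above; the helper name
count_named is made up for illustration:

```shell
# Hypothetical helper: count lines of a ps listing (read from stdin)
# whose last field is exactly the given command name.
count_named() {
    awk -v name="$1" '$NF == name { n++ } END { print n + 0 }'
}
```

Run as: ps -u hobbit | count_named hobbitd_rrd
On a standard install, per Henrik's note, the answer should be 2.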
-Bob