[hobbit] hobbitd_alert crashes

Dominique Frise Dominique.Frise at unil.ch
Fri Jun 2 08:19:10 CEST 2006


Dominique Frise wrote:
> Henrik Stoerner wrote:
> 
>> On Fri, Jun 02, 2006 at 07:43:13AM +0200, Dominique Frise wrote:
>>
>>> (gdb) bt
>>> #0  0xff1a05c8 in _libc_kill () from /usr/lib/libc.so.1
>>> #1  0xff136d58 in abort () from /usr/lib/libc.so.1
>>> #2  0x0002134c in sigsegv_handler (signum=0) at sig.c:57
>>> #3  <signal handler called>
>>
>>
>>
>> Hrm, that isn't much to go on. Does it crash right away when you start 
>> Hobbit, or only after some time has passed ?
>>
> It crashed 3 times last night. Hobbit was last restarted yesterday at 
> 05:10 PM
> 
>> If it crashes right away, I'd like a copy of your bb-hosts,
>> hobbitserver.cfg and hobbit-alerts.cfg files. If it crashes
>> after some time, could you add the "--debug" option to
>> the hobbitd_alert command in hobbitlaunch.cfg, and then mail
>> me the ~hobbit/server/logs/page.log file after it has crashed?
>>
> Done.
> I'll mail you the log asap.
> 
> Thank you.
> 
> Dominique
> UNIL - University of Lausanne
> 
Looking at the event log, I noticed that each of the 3 times hobbitd_alert
crashed, it was trying to send to an IGNORE recipient (not always the same one).
Here are our IGNORE rules, which follow the macro definitions at the top of
hobbit-alerts.cfg. Maybe there is something wrong with this configuration?

...
...
#---------------------------------------
# Hosts groups
#
$SAP_HOSTS=quartz,topaze,onyx,its,tulp,zircon
$ADMIN_HOSTS=bilbo,falco,furio

#------------------------------------------------------------------------------
# Rules to exclude alerting during a period of time
#------------------------------------------------------------------------------

HOST=* SERVICE=bckp TIME=*:2000:0700
    IGNORE
HOST=uldns1,uldns2 SERVICE=ldap TIME=*:0500:0530
    IGNORE
HOST=kawa,kawa2 SERVICE=http TIME=*:2210:2235
    IGNORE
HOST=gaia SERVICE=http TIME=*:0012:0015
    IGNORE
HOST=balrog,godzilla,smaug SERVICE=cpu TIME=*:2000:2359
    IGNORE
HOST=acsls,balrog,godzilla,smaug SERVICE=memory TIME=*:0600:0800
    IGNORE
HOST=unimedia,unimediad SERVICE=orcl,http TIME=*:0655:1115
    IGNORE
HOST=virtuavd SERVICE=orcl TIME=*:0001:0400
    IGNORE
HOST=tstvirtua SERVICE=orcl TIME=*:2159:0200
    IGNORE
HOST=ged SERVICE=http TIME=*:0305:0315
    IGNORE
HOST=$SAP_HOSTS SERVICE=conn,cpu,http,ftp TIME=*:1900:0700
    IGNORE
HOST=$SAP_HOSTS SERVICE=orcl,procs TIME=*:1900:2359
    IGNORE
HOST=$ADMIN_HOSTS SERVICE=http,sslcert TIME=*:0030:0630
    IGNORE
HOST=$ADMIN_HOSTS SERVICE=conn,cpu,http,sslcert TIME=*:1800:0700
    IGNORE
HOST=esope SERVICE=http,orcl TIME=*:0355:0600
    IGNORE
HOST=pcsan SERVICE=msgs,svcs,procs TIME=*:1955:2300
    IGNORE
HOST=iris SERVICE=hobbitd TIME=*:0310:0320
    IGNORE
HOST=lanfeust,winup TIME=*:1945:2200
    IGNORE
...
...
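
One way to narrow down which rules to look at first is to list the TIME
windows whose end time is earlier than the start time (for example
2000 to 0700). The following is only a minimal sketch: it assumes each TIME
value has the form days:start:end with HHMM times, the function name
wrapping_time_windows is made up for illustration, and nothing here
establishes that such windows are related to the crash.

```python
import re


def wrapping_time_windows(config_text):
    """Return (rule_line, time_spec) pairs whose end time is earlier than the start time.

    Hypothetical helper: assumes TIME=days:start:end[,days:start:end...] with HHMM times.
    """
    found = []
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        match = re.search(r"\bTIME=(\S+)", line)
        if not match:
            continue
        for spec in match.group(1).split(","):
            parts = spec.split(":")
            if len(parts) != 3:
                continue  # not in the assumed days:start:end form
            _days, start, end = parts
            if start.isdigit() and end.isdigit() and int(start) > int(end):
                found.append((line, spec))
    return found


# Three rules copied from the excerpt above.
sample = """\
HOST=* SERVICE=bckp TIME=*:2000:0700
    IGNORE
HOST=gaia SERVICE=http TIME=*:0012:0015
    IGNORE
HOST=tstvirtua SERVICE=orcl TIME=*:2159:0200
    IGNORE
"""

for rule, spec in wrapping_time_windows(sample):
    print(spec, "<-", rule)
```

Running it against the full hobbit-alerts.cfg (read with open(...).read())
would give the subset of rules that span midnight, which could then be
compared with the recipients seen in the event log at crash time.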


Dominique
UNIL - University of Lausanne


