Re: [hobbit] hobbitd_alert crashes
Dominique Frise wrote:
Henrik Stoerner wrote:
On Fri, Jun 02, 2006 at 07:43:13AM +0200, Dominique Frise wrote:
(gdb) bt
#0 0xff1a05c8 in _libc_kill () from /usr/lib/libc.so.1
#1 0xff136d58 in abort () from /usr/lib/libc.so.1
#2 0x0002134c in sigsegv_handler (signum=0) at sig.c:57
#3 <signal handler called>
Hrm, that isn't much to go on. Does it crash right away when you start
Hobbit, or only after some time has passed?
It crashed 3 times last night. Hobbit was last restarted yesterday at
05:10 PM.
If it crashes right away, I'd like a copy of your bb-hosts,
hobbitserver.cfg and hobbit-alerts.cfg files. If it crashes
after some time, could you add the "--debug" option to
the hobbitd_alert command in hobbitlaunch.cfg, and then mail
me the ~hobbit/server/logs/page.log file after it has crashed?
Done.
I'll mail you the log asap.
Thank you.
Dominique
UNIL - University of Lausanne
Looking at the event log, I noticed that all 3 times hobbitd_alert
crashed, it was trying to send to an IGNORE recipient (not always the same one).
Here are our IGNORE rules, which follow the macro definitions at the top of
hobbit-alerts.cfg. Maybe there is something wrong with this configuration?
...
...
#---------------------------------------
# Hosts groups
#
$SAP_HOSTS=quartz,topaze,onyx,its,tulp,zircon
$ADMIN_HOSTS=bilbo,falco,furio
#------------------------------------------------------------------------------
# Rules to exclude alerting during a period of time
#------------------------------------------------------------------------------
HOST=* SERVICE=bckp TIME=*:2000:0700
IGNORE
HOST=uldns1,uldns2 SERVICE=ldap TIME=*:0500:0530
IGNORE
HOST=kawa,kawa2 SERVICE=http TIME=*:2210:2235
IGNORE
HOST=gaia SERVICE=http TIME=*:0012:0015
IGNORE
HOST=balrog,godzilla,smaug SERVICE=cpu TIME=*:2000:2359
IGNORE
HOST=acsls,balrog,godzilla,smaug SERVICE=memory TIME=*:0600:0800
IGNORE
HOST=unimedia,unimediad SERVICE=orcl,http TIME=*:0655:1115
IGNORE
HOST=virtuavd SERVICE=orcl TIME=*:0001:0400
IGNORE
HOST=tstvirtua SERVICE=orcl TIME=*:2159:0200
IGNORE
HOST=ged SERVICE=http TIME=*:0305:0315
IGNORE
HOST=$SAP_HOSTS SERVICE=conn,cpu,http,ftp TIME=*:1900:0700
IGNORE
HOST=$SAP_HOSTS SERVICE=orcl,procs TIME=*:1900:2359
IGNORE
HOST=$ADMIN_HOSTS SERVICE=http,sslcert TIME=*:0030:0630
IGNORE
HOST=$ADMIN_HOSTS SERVICE=conn,cpu,http,sslcert TIME=*:1800:0700
IGNORE
HOST=esope SERVICE=http,orcl TIME=*:0355:0600
IGNORE
HOST=pcsan SERVICE=msgs,svcs,procs TIME=*:1955:2300
IGNORE
HOST=iris SERVICE=hobbitd TIME=*:0310:0320
IGNORE
HOST=lanfeust,winup TIME=*:1945:2200
IGNORE
...
...
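
To make the rules above easier to review, here is a minimal Python sketch that lists the rules whose TIME window starts later than it ends (for example 2000 to 0700), i.e. windows that would have to run past midnight. This is only a review aid: the function name wrapping_rules and the regular expression are hypothetical helpers, not part of Hobbit, and nothing in this thread confirms whether such windows are related to the crash. It only looks at the first TIME= token on each line.

```python
import re

# Matches TIME=<days>:<HHMM>:<HHMM> as written in the rules above.
# Assumption: only the first TIME= token on a line is inspected.
TIME_RE = re.compile(r"TIME=([^:\s]+):(\d{4}):(\d{4})")


def wrapping_rules(config_text):
    """Return (line_number, line) pairs whose TIME window has start > end."""
    hits = []
    for lineno, line in enumerate(config_text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # skip comment lines
        match = TIME_RE.search(line)
        if match and int(match.group(2)) > int(match.group(3)):
            hits.append((lineno, line.strip()))
    return hits


# A few rules copied from the excerpt above.
sample = """\
HOST=* SERVICE=bckp TIME=*:2000:0700
IGNORE
HOST=uldns1,uldns2 SERVICE=ldap TIME=*:0500:0530
IGNORE
HOST=tstvirtua SERVICE=orcl TIME=*:2159:0200
IGNORE
"""

for lineno, rule in wrapping_rules(sample):
    print(lineno, rule)
```

Running it against the full hobbit-alerts.cfg (read with open(...).read()) would give a short list of rules to check by hand.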
Dominique
UNIL - University of Lausanne