dbcheck for RAC connections refused

martin.flemming at desy.de
Wed Jul 29 21:29:21 CEST 2009


Hi!

I want to use dbcheck.pl from

http://sourceforge.net/projects/hobbit-perl-cl/

... but unfortunately something is going wrong: I only get
"Connection Refused" messages and no data ... :-(

My config is below:

bb-hosts:

0.0.0.0 atlas_tag_db    # noconn dbcheck
0.0.0.0 atlast1         # noconn dbcheck
0.0.0.0 atlast2        # noconn dbcheck
0.0.0.0 atlast3        # noconn dbcheck
0.0.0.0 atlast4        # noconn dbcheck


dbcheck.ini:

oraclehome              = /opt/products/oracle-client/10.2g/
username                = XXXX
password                = XXXXXXXXXXXXXXXXXXXXXXX


[atlas_tag_db]
dbtype                  = Oracle
oraclerac               = yes
hostname                = lcg3d-a-v-4   # hostname of the rac instance n.2
sid                     = atlas_tag_db  # sid of the rac instance n.2
port                    = 1521          # port of listener on rac instance n.2
username                = XXXX
password                = XXXXXXXXXXXX

[atlast1]
dbtype                  = Oracle
hostname                = lcg3d-a-v-1   # hostname of the rac instance n.2
sid                     = atlast1       # sid of the rac instance n.2
port                    = 1521          # port of listener on rac instance n.2
username                = XXXX
password                = XXXXXXXXXXX

[atlast2]
dbtype                  = Oracle
hostname                = lcg3d-a-v-2   # hostname of the rac instance n.2
sid                     = atlast2       # sid of the rac instance n.2
port                    = 1521          # port of listener on rac instance n.2
username                = XXXXX
password                = XXXXXXXXXXXXXXX


[atlast3]
dbtype                  = Oracle
hostname                = lcg3d-a-v-3   # hostname of the rac instance n.2
sid                     = atlast3       # sid of the rac instance n.2
port                    = 1521          # port of listener on rac instance n.2
username                = XXXXXX
password                = XXXXXXXXXXXXXXXX

[atlast4]
dbtype                  = Oracle
hostname                = lcg3d-a-v-4   # hostname of the rac instance n.2
sid                     = atlast4       # sid of the rac instance n.2
port                    = 1521          # port of listener on rac instance n.2
username                = XXXXXXXX
password                = XXXXXXXXXXXXXXXXXXX


It's a RAC cluster with the instances on virtual Xen machines.
The Xen servers for the virtual machines are running Oracle (Red Hat-based) Enterprise Linux:

Enterprise Linux Enterprise Linux Server release 5.2 (Carthage)
Linux lcg3d-a-1.desy.de 2.6.18-92.el5xen #1 SMP Fri May 23 23:49:15 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux

... I don't know whether that is relevant to my problem ...

The dbcheck error output looks like this:

bin/bbcmd ext/dbcheck.pl fast -dd

2009-07-29 21:09:50 Using default environment file /usr/lib/hobbit/server/etc/hobbitserver.cfg

.
.
.
Wed Jul 29 21:09:50 2009:CALL : Hobbit_fd_lib::exec_local_cmd line 429
Wed Jul 29 21:09:50 2009:DEBUG:RESULT= 0, COMMAND= /usr/lib/hobbit/server/bin/bbhostgrep dbcheck ,VALUE=0.0.0.0 atlas_tag_db # dbcheck
  0.0.0.0 atlast1 # dbcheck
Wed Jul 29 21:09:50 2009:CALL : Hobbit_fd_lib::exec_local_cmd line 429
Wed Jul 29 21:09:50 2009:DEBUG:RESULT= 0, COMMAND= /usr/lib/hobbit/server/bin/bbhostgrep dbcheck ,VALUE=0.0.0.0 atlas_tag_db # dbcheck
  0.0.0.0 atlast1 # dbcheck

.
.
.
Wed Jul 29 21:09:50 2009:DEBUG: hostadress: atlast1
Wed Jul 29 21:09:50 2009:DEBUG: mode=1, tocheck=0, testmode=0 
data0=uptime, data1=status+60
Wed Jul 29 21:09:50 2009:CALL : Hobbit_fd_lib::check_test line 1048
Wed Jul 29 21:09:50 2009:DEBUG: mode=1, tocheck=0, testmode=0 
data0=DBCheck, data1=status+60
Wed Jul 29 21:09:50 2009:CALL : Hobbit_fd_lib::load_test_config line 1182
Wed Jul 29 21:09:50 2009:DEBUG: LINE=fulltest, EVENT=DBCheck, 
CHECK=status+60
Wed Jul 29 21:09:50 2009:CALL : Hobbit_fd_lib::check_test line 1048
Wed Jul 29 21:09:50 2009:DEBUG: mode=1, tocheck=0, testmode=0 
data0=ChkConn, data1=status+60
Wed Jul 29 21:09:50 2009:CALL : Hobbit_fd_lib::load_test_config line 1182
Wed Jul 29 21:09:50 2009:DEBUG: LINE=conn, EVENT=ChkConn, CHECK=status+60
Wed Jul 29 21:09:50 2009:CALL : Hobbit_fd_lib::check_test line 1048
Wed Jul 29 21:09:50 2009:DEBUG: mode=1, tocheck=1, testmode=0 data0=Audit, 
data1=notest
Wed Jul 29 21:09:50 2009:CALL : Hobbit_fd_lib::load_test_config line 1182
Wed Jul 29 21:09:50 2009:DEBUG: LINE=audit, EVENT=Audit, CHECK=notest
Wed Jul 29 21:09:50 2009:CALL : Hobbit_fd_lib::check_test line 1048
Wed Jul 29 21:09:50 2009:DEBUG: mode=1, tocheck=1, testmode=0 
data0=TblSpace, data1=notest
Wed Jul 29 21:09:50 2009:CALL : Hobbit_fd_lib::load_test_config line 1182
Wed Jul 29 21:09:50 2009:DEBUG: LINE=tablespace, EVENT=TblSpace, 
CHECK=notest
Wed Jul 29 21:09:50 2009:CALL : Hobbit_fd_lib::check_test line 1048
Wed Jul 29 21:09:50 2009:DEBUG: mode=1, tocheck=1, testmode=0 
data0=Extent, data1=notest
Wed Jul 29 21:09:50 2009:CALL : Hobbit_fd_lib::load_test_config line 1182
Wed Jul 29 21:09:50 2009:DEBUG: LINE=extent, EVENT=Extent, CHECK=notest
Wed Jul 29 21:09:50 2009:CALL : Hobbit_fd_lib::check_test line 1048
Wed Jul 29 21:09:50 2009:CALL : Hobbit_fd_lib::check_test line 276
Wed Jul 29 21:09:50 2009:DEBUG: mode=1, tocheck=0, testmode=0 
data0=uptime, data1=status+60
Wed Jul 29 21:09:50 2009:DEBUG: mode=1, tocheck=1, testmode=0 
data0=HitCache, data1=notest
Wed Jul 29 21:09:50 2009:CALL : Hobbit_fd_lib::load_test_config line 1182
Wed Jul 29 21:09:50 2009:DEBUG: LINE=hitcache, EVENT=HitCache, 
CHECK=notest
Wed Jul 29 21:09:50 2009:CALL : Hobbit_fd_lib::check_test line 1048



Wed Jul 29 21:09:50 2009:DEBUG: LINE=unumber, EVENT=unumber, CHECK=status+60
Wed Jul 29 21:09:50 2009:ERROR on atlast1: Timeout connecting to DBI:Oracle:host=lcg3d-a-v-1   # hostname of the rac instance n.2;port=1521 
# port of listener on rac instance n.2;sid=atlast1       # sid of the rac instance n.2!
Wed Jul 29 21:09:50 2009:CALL : Hobbit_fd_lib::send_bb line 1109
Wed Jul 29 21:09:50 2009:DEBUG: TEST=conn, TYPE=status+60, EVENT=ChkConn, SENDTYPE=2, TESTLIVE=60, SUMREPTIME=0
Wed Jul 29 21:09:50 2009:CALL : Hobbit_fd_lib::send_bb line 1109
Wed Jul 29 21:09:50 2009:DEBUG: BBDISPLAY: 131.169.56.65, hostname: atlast1, type: status+60, event: ChkConn, time: 00:00:00, color=red
Wed Jul 29 21:09:50 2009:CALL : Hobbit_fd_lib::send_bb line 1109
Wed Jul 29 21:09:50 2009:DEBUG: COMPLETE MESSAGE:status+60 atlast1.CkConn red Wed Jul 29 21:09:50 2009 Timeout connecting to 
DBI:Oracle:host=lcg3d-a-v-1   # hostname of the rac instance n.2;port=1521  # port of listener on rac instance n.2;sid=atlast1       # sid of the rac 
instance n.2!

dbcheck.pl version 1.08 - column ChkConn lifetime 60, tested in ~ 00:00:00 (max 00:00:20)
Wed Jul 29 21:09:50 2009:CALL : Hobbit_fd_lib::send_bb line 339
Wed Jul 29 21:09:50 2009:DEBUG: TEST=unumber, TYPE=status+60, EVENT=unumber, SENDTYPE=2, TESTLIVE=60, SUMREPTIME=0
Wed Jul 29 21:09:50 2009:CALL : Hobbit_fd_lib::send_bb line 339
Wed Jul 29 21:09:50 2009:DEBUG: BBDISPLAY: 131.169.56.65, hostname: atlast1, type: status+60, event: unumber, time: 00:00:00, color=clear
Wed Jul 29 21:09:50 2009:CALL : Hobbit_fd_lib::send_bb line 339
Wed Jul 29 21:09:50 2009:DEBUG: COMPLETE MESSAGE:status+60 atlast1.unumber clear Wed Jul 29 21:09:50 2009 Connection Check Failed
dbcheck.pl version 1.08 - column unumber lifetime 60, tested in ~ 00:00:00 (max 00:00:20)
Wed Jul 29 21:09:50 2009:ERROR on atlas_tag_db: Timeout connecting to DBI:Oracle:host=lcg3d-a-v-4   # hostname of the rac instance n.2;port=1521 
# port of listener on rac instance n.2;sid=atlas_tag_db  # sid of the rac instance n.2!
Wed Jul 29 21:09:50 2009:CALL : Hobbit_fd_lib::send_bb line 1109
Wed Jul 29 21:09:50 2009:DEBUG: TEST=conn, TYPE=status+60, EVENT=ChkConn, SENDTYPE=2, TESTLIVE=60, SUMREPTIME=0
Wed Jul 29 21:09:50 2009:CALL : Hobbit_fd_lib::send_bb line 1109
Wed Jul 29 21:09:50 2009:DEBUG: BBDISPLAY: 131.169.56.65, hostname: atlas_tag_db, type: status+60, event: ChkConn, time: 00:00:00, color=red
Wed Jul 29 21:09:50 2009:CALL : Hobbit_fd_lib::send_bb line 1109
Wed Jul 29 21:09:50 2009:DEBUG: COMPLETE MESSAGE:status+60 atlas_tag_db.ChkConn red Wed Jul 29 21:09:50 2009 Timeout connecting to 
DBI:Oracle:host=lcg3d-a-v-4   # hostname of the rac instance n.2;port=1521          # port of listener on rac instance n.2;sid=atlas_tag_db  # sid of the rac 
instance n.2!
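
One thing visible in the output above: the connect string contains the trailing "# ..." comments from dbcheck.ini, so the host, port and sid values may have been read together with their inline comments. Whether dbcheck.pl is meant to strip such comments is not something this output confirms, so the following is only a sketch of the effect. It uses Python's configparser as a stand-in parser (not the parser dbcheck.pl uses), and build_dsn and parse are hypothetical helpers written for this illustration.

```python
import configparser

# One section copied from the dbcheck.ini above (credentials left out).
SAMPLE = """
[atlast1]
dbtype                  = Oracle
hostname                = lcg3d-a-v-1   # hostname of the rac instance n.2
sid                     = atlast1       # sid of the rac instance n.2
port                    = 1521          # port of listener on rac instance n.2
"""


def build_dsn(section):
    """Assemble a string shaped like the DSN printed in the debug log."""
    return "DBI:Oracle:host={hostname};port={port};sid={sid}".format(**section)


def parse(text, strip_inline_comments):
    """Parse the sample, optionally treating ' # ...' as an inline comment."""
    # With inline_comment_prefixes unset (the configparser default),
    # everything after '=' stays in the value, including the '# ...' part.
    prefixes = ("#",) if strip_inline_comments else None
    parser = configparser.ConfigParser(inline_comment_prefixes=prefixes)
    parser.read_string(text)
    return dict(parser["atlast1"])


raw_dsn = build_dsn(parse(SAMPLE, strip_inline_comments=False))
clean_dsn = build_dsn(parse(SAMPLE, strip_inline_comments=True))

print(raw_dsn)    # comment text ends up inside host=, port= and sid=
print(clean_dsn)  # only the bare values remain
```

If the same thing happens inside dbcheck.pl, moving the comments onto their own lines in dbcheck.ini would be a cheap way to test it.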


And the status page looks like this:

Test Time    = 00:00:01
Host Checked = 5
Status Msg   = 60

Colors                   Colors
clear                       55
red                          5

Events                   Events
Audit                        5
ChkConn                      5
Extent                       5
HitCache                     5
InvObj                       5
Locks                        5
MemReq                       5
RollBack                     5
Session                      5
TblSpace                     5
unumber                      5
uptime                       5

Types                     Types
status+60                   60

Hosts Summary
Hosts                     clear        red     Number      Times
atlas_tag_db                 11          1         12   00:00:00
atlast1                      11          1         12   00:00:00
atlast2                      11          1         12   00:00:00
atlast3                      11          1         12   00:00:00
atlast4                      11          1         12   00:00:00

Hosts Summary
Hosts                     Audit    ChkConn     Extent   HitCache     InvObj      Locks     MemReq   RollBack    Session   TblSpace    unumber     uptime
atlas_tag_db                  1          1          1          1          1          1          1          1          1          1          1          1
atlast1                       1          1          1          1          1          1          1          1          1          1          1          1
atlast2                       1          1          1          1          1          1          1          1          1          1          1          1
atlast3                       1          1          1          1          1          1          1          1          1          1          1          1
atlast4                       1          1          1          1          1          1          1          1          1          1          1          1

Errors
No General Errors
atlas_tag_db
Wed Jul 29 18:52:47 2009:ERROR: Timeout connecting to DBI:Oracle:host=lcg3d-a-v-4   # hostname of the rac instance n.2;port=1521 
# port of listener on rac instance n.2;sid=atlas_tag_db  # sid of the rac instance n.2!
atlast1
Wed Jul 29 18:52:47 2009:ERROR: Timeout connecting to DBI:Oracle:host=lcg3d-a-v-1   # hostname of the rac instance n.2;port=1521 
# port of listener on rac instance n.2;sid=atlast1       # sid of the rac instance n.2!
atlast2
Wed Jul 29 18:52:47 2009:ERROR: Timeout connecting to DBI:Oracle:host=lcg3d-a-v-2   # hostname of the rac instance n.2;port=1521 
# port of listener on rac instance n.2;sid=atlast2       # sid of the rac instance n.2!
atlast3
Wed Jul 29 18:52:47 2009:ERROR: Timeout connecting to DBI:Oracle:host=lcg3d-a-v-3   # hostname of the rac instance n.2;port=1521 
# port of listener on rac instance n.2;sid=atlast3       # sid of the rac instance n.2!
atlast4
Wed Jul 29 18:52:47 2009:ERROR: Timeout connecting to DBI:Oracle:host=lcg3d-a-v-4   # hostname of the rac instance n.2;port=1521 
# port of listener on rac instance n.2;sid=atlast4       # sid of the rac instance n.2!

Warnings
No General Warnings
No Hosts Warnings


dbcheck.pl version 1.08 - column dbcheck lifetime 60, tested in ~ 00:00:01 (max 00:02:00)

I have tried both the old and the new version of dbcheck.pl ...

Any idea why my environment could be broken?

thanks & cheers

 	martin


