<br><font size=3><tt>> On Mon, Mar 13, 2006 at 12:45:27PM -0500, James B Horwath wrote:<br>
> > I have been running hobbit for several months now without incident. I am<br>
> > running hobbit 4.1.2p1 on Redhat Enterprise 3 on IBM pseries hardware. I<br>
> > haven't had any issues until this morning. Now it appears that after about one<br>
> > hour of running, the system flat out dies. I am sent a notification for<br>
> > every system connected. Then it appears the network process dies. I was<br>
> > running tcpdump to see what was wrong. I see the completion of a network<br>
> > test about 30 minutes ago to a machine on the same subnet. I am not<br>
> > running iptables/ipchains. I am not experienced at hard-core hobbit<br>
> > debugging. I looked in /var/log/hobbit and don't see anything strange.<br>
> > There are no core files in the hobbit directory.<br>
> > <br>
> > Any advice on where to start? All my network tests are now purple.<br>
> <br>
> Is there a "bbtest-net" and/or "fping" process which hangs? If there<br>
> is, it would be interesting to attach to it with "gdb" and see what<br>
> it is doing. Alternatively, kill it with a "kill -6" which will trigger<br>
> a core dump in ~hobbit/data/tmp/ - you can run the core dump through<br>
> gdb, which might give me an idea what it is doing.<br>
> <br>
> <br>
> You can also try su'ing to the hobbit user and run the command<br>
> <br>
> bbcmd bbtest-net --debug host1 host2<br>
> <br>
> (replace "host1" and "host2" with a couple of the hosts in your<br>
> bb-hosts file).<br>
> <br>
> <br>
> Are DNS lookups working on this box? That is one of the few things that<br>
> can cause the network tests to slow down dramatically. But they ought to<br>
> time out automatically. Same goes for the other commands that run as<br>
> part of the network tests (rpc and ntp queries).<br>
</tt></font>
<br><font size=3><tt>Henrik,</tt></font>
<br>
<br><font size=3><tt>Thanks for the tips. DNS works fine, and the
network tests seem to work fine when I try them. My system is pretty
much idle and I don't see anything nasty in the system logs. I have
included the process table and the lsof output for the bbtest-net process. When I
did the kill -6 on the network process, it worked once and then stopped
again. I ran strings on the core file and may have found a machine with
slow DNS resolution. I am keeping my fingers crossed.</tt></font>
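<br>
<br><font size=3><tt>One way to confirm a suspicion like this is to time the name lookup for each host individually and sort the results, slowest first. The following is only a rough sketch, not part of hobbit: the function name time_lookups and its resolve parameter are made up for illustration, and it uses the system resolver through Python's socket.gethostbyname rather than whatever lookup code bbtest-net uses internally.</tt></font>

```python
import socket
import time


def time_lookups(hostnames, resolve=socket.gethostbyname):
    """Time a name lookup for each host; return results slowest first.

    Each result is a (hostname, seconds, outcome) tuple. `resolve` is a
    hypothetical hook so the lookup function can be swapped for testing;
    by default the system resolver is used.
    """
    results = []
    for name in hostnames:
        start = time.monotonic()
        try:
            outcome = resolve(name)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            outcome = "lookup failed: %s" % exc
        results.append((name, time.monotonic() - start, outcome))
    # Slowest lookups first, so a problem host stands out at the top.
    return sorted(results, key=lambda r: r[1], reverse=True)
```

<font size=3><tt>Calling time_lookups with a list of the host names from bb-hosts and printing the first few entries should show whether a single host accounts for most of the delay.</tt></font>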
<br>
<br><font size=3><tt>Regards,</tt></font>
<br><font size=3><tt>Jim<br>
</tt></font><font size=2 face="sans-serif"><br>
[root@bigbrother etc]# ps -ef | grep hobbit</font>
<br>
<br><font size=2 face="sans-serif">hobbit 18470 1
0 15:10 ? 00:00:00 /usr/local/hobbit/server/bin/hobbitlaunch
--config=/usr/local/hobbit/server/etc/hobbitlaunch.cfg --env=/usr/local/hobbit/server/etc/hobbitserver.cfg
--log=/var/log/hobbit/hobbitlaunch.log --pidfile=/var/log/hobbit/hobbitlaunch.pid</font>
<br><font size=2 face="sans-serif">hobbit 18471 18470 0 15:10
? 00:00:05 hobbitd --pidfile=/var/log/hobbit/hobbitd.pid
--restart=/usr/local/hobbit/server/tmp/hobbitd.chk --checkpoint-file=/usr/local/hobbit/server/tmp/hobbitd.chk
--checkpoint-interval=600 --log=/var/log/hobbit/hobbitd.log --admin-senders=127.0.0.1
10.98.200.46</font>
<br><font size=2 face="sans-serif">hobbit 18473 18470 0 15:10
? 00:00:00 hobbitd_channel --channel=stachg
--log=/var/log/hobbit/history.log hobbitd_history</font>
<br><font size=2 face="sans-serif">hobbit 18474 18473 0 15:10
? 00:00:00 hobbitd_history</font>
<br><font size=2 face="sans-serif">hobbit 18475 18470 0 15:10
? 00:00:01 hobbitd_channel --channel=page --log=/var/log/hobbit/page.log
hobbitd_alert --checkpoint-file=/usr/local/hobbit/server/tmp/alert.chk
--checkpoint-interval=600</font>
<br><font size=2 face="sans-serif">hobbit 18476 18475 0 15:10
? 00:00:00 hobbitd_alert --checkpoint-file=/usr/local/hobbit/server/tmp/alert.chk
--checkpoint-interval=600</font>
<br><font size=2 face="sans-serif">hobbit 18477 18470 0 15:10
? 00:00:19 hobbitd_channel --channel=status
--log=/var/log/hobbit/rrd-status.log hobbitd_rrd --rrddir=/usr/local/hobbit/rrd</font>
<br><font size=2 face="sans-serif">hobbit 18478 18470 0 15:10
? 00:00:00 hobbitd_channel --channel=data --log=/var/log/hobbit/rrd-data.log
hobbitd_rrd --rrddir=/usr/local/hobbit/rrd</font>
<br><font size=2 face="sans-serif">hobbit 18479 18470 0 15:10
? 00:00:00 hobbitd_channel --channel=client
--log=/var/log/hobbit/clientdata.log hobbitd_client</font>
<br><font size=2 face="sans-serif">hobbit 18480 18478 0 15:10
? 00:00:00 hobbitd_rrd --rrddir=/usr/local/hobbit/rrd</font>
<br><font size=2 face="sans-serif">hobbit 18481 18477 0 15:10
? 00:00:14 hobbitd_rrd --rrddir=/usr/local/hobbit/rrd</font>
<br><font size=2 face="sans-serif">hobbit 18482 18479 0 15:10
? 00:00:00 hobbitd_client</font>
<br><font size=2 face="sans-serif">hobbit 18634 18470 0 15:20
? 00:00:00 bbtest-net --report --ping --checkresponse
--timeout=60 --debug</font>
<br><font size=2 face="sans-serif">hobbit 21820 1
0 22:02 ? 00:00:00 sh -c vmstat 300 2
1>/usr/local/hobbit/client/tmp/hobbit_vmstat.21809 2>&1; mv /usr/local/hobbit/client/tmp/hobbit_vmstat.21809
/usr/local/hobbit/client/tmp/hobbit_vmstat</font>
<br><font size=2 face="sans-serif">hobbit 21821 21820 0 22:02
? 00:00:00 vmstat 300 2</font>
<br><font size=2 face="sans-serif">root 21861 21698 0
22:06 pts/0 00:00:00 grep hobbit</font>
<br>
<br><font size=2 face="sans-serif">[root@bigbrother etc]# lsof -p 18634</font>
<br><font size=2 face="sans-serif">COMMAND PID USER
FD TYPE DEVICE SIZE NODE NAME</font>
<br><font size=2 face="sans-serif">bbtest-ne 18634 hobbit cwd
DIR 8,3 4096 376833 /usr/local/hobbit/server</font>
<br><font size=2 face="sans-serif">bbtest-ne 18634 hobbit rtd
DIR 8,11 4096 2 /</font>
<br><font size=2 face="sans-serif">bbtest-ne 18634 hobbit txt
REG 8,3 170076 393236 /usr/local/hobbit/server/bin/bbtest-net</font>
<br><font size=2 face="sans-serif">bbtest-ne 18634 hobbit mem
REG 8,11 61504 80026 /lib/libnss_files-2.3.2.so</font>
<br><font size=2 face="sans-serif">bbtest-ne 18634 hobbit mem
REG 8,11 14592 80056 /lib/liblaus.so.1.0.0</font>
<br><font size=2 face="sans-serif">bbtest-ne 18634 hobbit mem
REG 8,11 39468 82638 /lib/libpam.so.0.75</font>
<br><font size=2 face="sans-serif">bbtest-ne 18634 hobbit mem
REG 8,11 29100 80014 /lib/libcrypt-2.3.2.so</font>
<br><font size=2 face="sans-serif">bbtest-ne 18634 hobbit mem
REG 8,7 28672 144356 /usr/lib/libgdbm.so.2.0.0</font>
<br><font size=2 face="sans-serif">bbtest-ne 18634 hobbit mem
REG 8,7 59608 144390 /usr/lib/libz.so.1.1.4</font>
<br><font size=2 face="sans-serif">bbtest-ne 18634 hobbit mem
REG 8,11 19992 80016 /lib/libdl-2.3.2.so</font>
<br><font size=2 face="sans-serif">bbtest-ne 18634 hobbit mem
REG 8,11 79916 80036 /lib/libresolv-2.3.2.so</font>
<br><font size=2 face="sans-serif">bbtest-ne 18634 hobbit mem
REG 8,7 78360 272188 /usr/kerberos/lib/libk5crypto.so.3.0</font>
<br><font size=2 face="sans-serif">bbtest-ne 18634 hobbit mem
REG 8,7 11072 272178 /usr/kerberos/lib/libcom_err.so.3.0</font>
<br><font size=2 face="sans-serif">bbtest-ne 18634 hobbit mem
REG 8,7 391564 272198 /usr/kerberos/lib/libkrb5.so.3.1</font>
<br><font size=2 face="sans-serif">bbtest-ne 18634 hobbit mem
REG 8,7 77448 272184 /usr/kerberos/lib/libgssapi_krb5.so.2.2</font>
<br><font size=2 face="sans-serif">bbtest-ne 18634 hobbit mem
REG 8,7 57768 144429 /usr/lib/libsasl.so.7.1.11</font>
<br><font size=2 face="sans-serif">bbtest-ne 18634 hobbit mem
REG 8,11 1608896 32013 /lib/tls/libc-2.3.2.so</font>
<br><font size=2 face="sans-serif">bbtest-ne 18634 hobbit mem
REG 8,11 1104580 80070 /lib/libcrypto.so.0.9.7a</font>
<br><font size=2 face="sans-serif">bbtest-ne 18634 hobbit mem
REG 8,11 220772 80071 /lib/libssl.so.0.9.7a</font>
<br><font size=2 face="sans-serif">bbtest-ne 18634 hobbit mem
REG 8,7 49304 144433 /usr/lib/liblber.so.2.0.17</font>
<br><font size=2 face="sans-serif">bbtest-ne 18634 hobbit mem
REG 8,7 186348 144435 /usr/lib/libldap.so.2.0.17</font>
<br><font size=2 face="sans-serif">bbtest-ne 18634 hobbit mem
REG 8,11 115228 80005 /lib/ld-2.3.2.so</font>
<br><font size=2 face="sans-serif">bbtest-ne 18634 hobbit 0r
CHR 1,3 65675 /dev/null</font>
<br><font size=2 face="sans-serif">bbtest-ne 18634 hobbit 1w
REG 8,6 5775484 432036 /var/log/hobbit/bb-network.log</font>
<br><font size=2 face="sans-serif">bbtest-ne 18634 hobbit 2w
REG 8,6 5775484 432036 /var/log/hobbit/bb-network.log</font>
<br><font size=2 face="sans-serif">bbtest-ne 18634 hobbit 3u
IPv4 219456 UDP bigbrother:35123->n9000sd1.nro.glic.com:domain</font>
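<br>
<br><font size=3><tt>In the listing above, the only network descriptor held by bbtest-net is descriptor 3, a UDP socket to the "domain" port of the name server, which would be consistent with the process waiting on a DNS reply. A small sketch for pulling such rows out of saved lsof output follows. The function name network_descriptors is hypothetical, and the column positions are an assumption based on the layout shown here; other lsof versions may print different columns.</tt></font>

```python
def network_descriptors(lsof_text):
    """Yield (fd, protocol, name) for the IPv4/IPv6 rows of `lsof -p PID` output.

    Assumes the column layout shown in the listing above, where socket rows
    have no SIZE value and therefore split into eight fields:
    COMMAND PID USER FD TYPE DEVICE NODE NAME
    """
    for line in lsof_text.splitlines():
        fields = line.split()
        if len(fields) >= 8 and fields[4] in ("IPv4", "IPv6"):
            # For sockets, the NODE column carries the protocol (UDP/TCP).
            yield fields[3], fields[6], fields[7]
```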