[hobbit] bbgen frequent yellow alerts - hobbitd problem?
Henrik Stoerner
henrik at hswn.dk
Mon Nov 6 17:29:59 CET 2006
On Mon, Nov 06, 2006 at 07:35:27AM -0800, Mr-Pope wrote:
> We are running a new installation of Hobbit 4.2 on Solaris 10 running
> in a non-global zone. Server is a v240 but I don't think that matters
> here.
>
> The problem here is that our bbgen status turns yellow with fairly
> high frequency, sometimes multiple times an hour, at (what seem like)
> random intervals. In the yellow alert bbgen reports:
> "hobbitd status-board not available"
The reports I've had of this only have one thing in common: They all
happen on Solaris 10. So I'm beginning to suspect that maybe Solaris
doesn't work quite the way other systems do.
Or perhaps there is a bug, and something special in Solaris triggers it.
> Below are the output from some commands/logs. These logs don't really
> seem to help, so let me know if there is anything else that I can send
> along to debug this issue.
> $BB --debug $BBDISP "hobbitdboard"
> (with no --debug on a 'failure' I get no output. I'm assuming this is
> the same cause of the bbgen yellow alert)
Yes.
> bbgen --debug --report (this one turned bbgen yellow/unavailable.
> Note the quick disconnect.)
> 2006-11-03 09:51:03 load_state()
> 2006-11-03 09:51:03 Transport setup is:
> 2006-11-03 09:51:03 bbdportnumber = 1984
> 2006-11-03 09:51:03 bbdispproxyhost = NONE
> 2006-11-03 09:51:03 bbdispproxyport = 0
> 2006-11-03 09:51:03 Recipient listed as '10.xxx.xxx.xxx'
> 2006-11-03 09:51:03 Standard BB protocol on port 1984
> 2006-11-03 09:51:03 Will connect to address 10.xxx.xxx.xxx port 1984
> 2006-11-03 09:51:03 Connect status is 0
> 2006-11-03 09:51:03 Sent 126 bytes
> 2006-11-03 09:51:03 Closing connection
Interesting.
Since it seems that this bites you more than most others, I'd like you
to do a couple of things for me to figure out what is going on. I need
you to add a couple of debugging lines to Hobbit.
First, in the bbdisplay/loaddata.c file, around line 436 you'll find the
code that prints out the "hobbitd status-board not available" message.
It looks like this:
    errprintf("hobbitd status-board not available\n");
I want you to change that to
    errprintf("hobbitd status-board not available, code %d\n", hobbitdresult);
Next, the code that receives data from Hobbit is in the lib/sendmsg.c
file, around line 340. You'll find these lines:
    n = recv(sockfd, recvbuf, sizeof(recvbuf)-1, 0);
    if (n > 0) {
I'd like you to add 8 lines between these two:
    n = recv(sockfd, recvbuf, sizeof(recvbuf)-1, 0);
    if (n < 0) {
        dbgprintf("recv() returned error: %s\n", strerror(errno));
        if (errno == EAGAIN) continue;
    }
    if (n == 0) {
        dbgprintf("recv() gave us 0 bytes\n");
        continue;
    }
    if (n > 0) {
(it isn't the prettiest of programming, but it does the job for now).
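To illustrate what the extra lines tell apart, here is a minimal
standalone sketch; it is not Hobbit code. It assumes a local socketpair()
standing in for the connection to hobbitd, and the names recv_outcome,
classify_recv and demo_recv_outcomes are made up for this example.
Treating EINTR as a retry is also an assumption that goes beyond the
patch above. The sketch sends a short message, closes the sending end,
and classifies the first two recv() results on the other end: data
first, then a 0-byte read because the peer has gone away.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical names for illustration only; none of this is Hobbit code. */
enum recv_outcome { RECV_DATA, RECV_CLOSED, RECV_RETRY, RECV_ERROR };

/* Classify a recv() return value; err is errno saved right after the call. */
enum recv_outcome classify_recv(ssize_t n, int err)
{
    if (n > 0)  return RECV_DATA;    /* n bytes were read */
    if (n == 0) return RECV_CLOSED;  /* peer closed the connection */
    if (err == EAGAIN || err == EWOULDBLOCK || err == EINTR)
        return RECV_RETRY;           /* transient condition, try again */
    return RECV_ERROR;               /* any other failure */
}

/* Send a short message over a local socket pair, close the sender, and
 * record how the first two recv() calls on the other end are classified.
 * Returns 0 on success, -1 if the socket pair could not be set up. */
int demo_recv_outcomes(enum recv_outcome out[2])
{
    int sv[2];
    char buf[64];
    const char msg[] = "hobbitdboard";
    int i;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return -1;
    if (write(sv[0], msg, strlen(msg)) != (ssize_t)strlen(msg)) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    close(sv[0]);  /* the peer disconnects right after sending */

    for (i = 0; i < 2; i++) {
        ssize_t n = recv(sv[1], buf, sizeof(buf) - 1, 0);
        int err = errno;  /* save errno before another call can change it */

        assert(n < (ssize_t)sizeof(buf));
        out[i] = classify_recv(n, err);
        if (out[i] == RECV_ERROR)
            fprintf(stderr, "recv() returned error: %s\n", strerror(err));
    }
    close(sv[1]);
    return 0;
}
```

Calling demo_recv_outcomes() from a small main() and printing the two
outcomes is enough to try it. Saving errno in a local variable before
logging matters because a logging call may itself change errno.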
After making these two changes, run "make clean; make" and copy the
bbdisplay/bbgen binary into your ~hobbit/server/bin/ directory. Let
Hobbit run as normal (with --debug on the bbgen command) and when it
fails I am very interested to see what's in the logfile.
Regards,
Henrik