[hobbit] bbgen frequent yellow alerts - hobbitd problem?
Mr-Pope
pope8086 at gmail.com
Mon Nov 6 19:27:24 CET 2006
Thanks, Henrik.
I made the changes that you suggested and copied bbgen to the
appropriate directory. When we get yellow alerts the following is the
new "Error output":
hobbitd status-board not available, code 0
In the log I get the following messages during the connection process:
*cut*
2006-11-06 09:59:57 load_state()
2006-11-06 09:59:57 Transport setup is:
2006-11-06 09:59:57 bbdportnumber = 1984
2006-11-06 09:59:57 bbdispproxyhost = NONE
2006-11-06 09:59:57 bbdispproxyport = 0
2006-11-06 09:59:57 Recipient listed as '10.xxx.xxx.xxx'
2006-11-06 09:59:57 Standard BB protocol on port 1984
2006-11-06 09:59:57 Will connect to address 10.xxx.xxx.xxx port 1984
2006-11-06 09:59:57 Connect status is 0
2006-11-06 09:59:57 Sent 126 bytes
2006-11-06 09:59:57 recv() gave us 0 bytes
2006-11-06 09:59:57 Closing connection
*cut*
2006-11-06 09:59:58 Recipient listed as '10.xxx.xxx.xxx'
2006-11-06 09:59:58 Standard BB protocol on port 1984
2006-11-06 09:59:58 Will connect to address 10.xxx.xxx.xxx port 1984
2006-11-06 09:59:58 Connect status is 0
2006-11-06 09:59:58 Sent 1384 bytes
2006-11-06 09:59:58 Closing connection
2006-11-06 09:59:58 1 status messages merged into 2 transmissions
*end*
I did not see the impact of the changes to sendmsg.c anywhere in the
debug output.
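
For reference, the three recv() outcomes that the suggested debug lines distinguish can be sketched in isolation. This is only a minimal illustration, not the code in lib/sendmsg.c: the enum, its labels and the classify_recv() helper are hypothetical names made up for the example. It relies on the standard socket rule that a return value of 0 on a TCP socket means the peer closed the connection in an orderly way.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical labels for illustration only; not part of Hobbit. */
enum recv_outcome {
    RECV_DATA,    /* n > 0: n bytes were received */
    RECV_CLOSED,  /* n == 0: peer closed the connection (orderly shutdown) */
    RECV_RETRY,   /* n < 0 with a transient errno: worth trying again */
    RECV_ERROR    /* n < 0 with any other errno: hard failure */
};

/*
 * Classify the result of a recv() call.
 * n   is the value recv() returned,
 * err is the value of errno captured right after the call.
 */
enum recv_outcome classify_recv(ssize_t n, int err)
{
    if (n > 0)
        return RECV_DATA;
    if (n == 0)
        return RECV_CLOSED;
    if (err == EAGAIN || err == EWOULDBLOCK || err == EINTR)
        return RECV_RETRY;
    return RECV_ERROR;
}
```

Under these assumptions, the "recv() gave us 0 bytes" line in the first log excerpt above would fall into the RECV_CLOSED case: the connection was closed before any reply data arrived.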
-Jon
On 11/6/06, Henrik Stoerner <henrik at hswn.dk> wrote:
> On Mon, Nov 06, 2006 at 07:35:27AM -0800, Mr-Pope wrote:
>
> > We are running a new installation of Hobbit 4.2 on Solaris 10 running
> > in a non-global zone. Server is a v240 but I don't think that matters
> > here.
> >
> > The problem here is that our bbgen status turns yellow with fairly
> > high frequency, sometimes multiple times an hour, at (what seem like)
> > random intervals. In the yellow alert bbgen reports:
> > "hobbitd status-board not available"
>
> The reports I've had of this only have one thing in common: They all
> happen on Solaris 10. So I'm beginning to suspect that maybe Solaris
> doesn't work quite the way other systems do.
>
> Or perhaps there is a bug, and something special in Solaris triggers it.
>
> > Below are the output from some commands/logs. These logs don't really
> > seem to help, so let me know if there is anything else that I can send
> > along to debug this issue.
>
> > $BB --debug $BBDISP "hobbitdboard"
> > (with no --debug on a 'failure' I get no output. I'm assuming this is
> > the same cause of the bbgen yellow alert)
>
> Yes.
>
> > bbgen --debug --report (this one turned bbgen yellow/unavailable.
> > Note the quick disconnect.)
> > 2006-11-03 09:51:03 load_state()
> > 2006-11-03 09:51:03 Transport setup is:
> > 2006-11-03 09:51:03 bbdportnumber = 1984
> > 2006-11-03 09:51:03 bbdispproxyhost = NONE
> > 2006-11-03 09:51:03 bbdispproxyport = 0
> > 2006-11-03 09:51:03 Recipient listed as '10.xxx.xxx.xxx'
> > 2006-11-03 09:51:03 Standard BB protocol on port 1984
> > 2006-11-03 09:51:03 Will connect to address 10.xxx.xxx.xxx port 1984
> > 2006-11-03 09:51:03 Connect status is 0
> > 2006-11-03 09:51:03 Sent 126 bytes
> > 2006-11-03 09:51:03 Closing connection
>
> Interesting.
>
> Since it seems that this bites you more than most others, I'd like you
> to do a couple of things for me to figure out what is going on. I need
> you to add a couple of debugging lines to Hobbit.
>
> First, in the bbdisplay/loaddata.c file, around line 436 you'll find the
> code that prints out the "hobbitd status-board not available" message.
> It looks like this:
> errprintf("hobbitd status-board not available\n");
> I want you to change that to
> errprintf("hobbitd status-board not available, code %d\n", hobbitdresult);
>
>
> Next, in the lib/sendmsg.c file, the code that receives data from
> Hobbit is around line 340. You'll find these lines:
>
> n = recv(sockfd, recvbuf, sizeof(recvbuf)-1, 0);
> if (n > 0) {
>
> I'd like you to add 8 lines between these two:
>
> n = recv(sockfd, recvbuf, sizeof(recvbuf)-1, 0);
> if (n < 0) {
>     dbgprintf("recv() returned error: %s\n", strerror(errno));
>     if (errno == EAGAIN) continue;
> }
> if (n == 0) {
>     dbgprintf("recv() gave us 0 bytes\n");
>     continue;
> }
> if (n > 0) {
>
> (it isn't the prettiest of programming, but it does the job for now).
>
>
> After making these two changes, run "make clean; make" and copy the
> bbdisplay/bbgen binary into your ~hobbit/server/bin/ directory. Let
> Hobbit run as normal (with --debug on the bbgen command) and when it
> fails I am very interested to see what's in the logfile.
>
>
> Regards,
> Henrik
>
>