<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

  <meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">

</head>

<body bgcolor="#ffffff" text="#000000">

Daniel J McDonald wrote:<br>

<blockquote cite="mid1136475632.29766.6.camel@localhost.localdomain"

 type="cite">

  <pre wrap="">But having one or two servers that poll all of the others does scale

well, because you don't have to install (and upgrade) hobbit clients on

a hundred machines - just set up an rsa key and you are done.  If the

primary hobbit display/alarm/parse work is too much with the polling

added, just use a second hobbit server for polling/parsing and feed the

results to the display server...

  </pre>

</blockquote>

I disagree. The distributed system scales much better, as the remote

servers are sending in their results in parallel.<br>

While fping is able to test remote hosts in parallel, the other tests

are run serially (bb-fetch, etc.).<br>

<br>

Let's say you have 1000 hosts. Let's then, just for fun, pretend that it

will only take 1 second to log into the remote hosts, run several

tests, and receive the result (it would actually take a bit longer than

that).<br>

<br>

1000 hosts x 1 second each = 1000 seconds, and 1000 / 60 = 16.67 minutes

to poll those hosts!<br>

<br>

So then you can say, oh well, just have 2-3 hobbit servers doing the

polling then.  Now you have 3 hobbit servers to deal with: monitoring

them, upgrading them, etc.<br>

<br>

Now let's look at a <b>real world</b> example of how long it takes to

ssh in and execute a command:<br>

<tt><br>

[hobbit@hobbit ~]$ time ssh myhost.net df -h<br>

Filesystem             size   used  avail capacity  Mounted on<br>

/dev/dsk/c5t0d0s0       30G    10G    20G    35%    /<br>

/devices                 0K     0K     0K     0%    /devices<br>

ctfs                     0K     0K     0K     0%    /system/contract<br>

proc                     0K     0K     0K     0%    /proc<br>

mnttab                   0K     0K     0K     0%    /etc/mnttab<br>

swap                   6.6G  1000K   6.6G     1%    /etc/svc/volatile<br>

objfs                    0K     0K     0K     0%    /system/object<br>

fd                       0K     0K     0K     0%    /dev/fd<br>

swap                   6.6G    16K   6.6G     1%    /tmp<br>

swap                   6.6G    32K   6.6G     1%    /var/run<br>

/dev/md/dsk/d0         639G   116G   523G    19%    /raid<br>

/dev/md/dsk/d1         807G   504G   304G    63%    /raid2<br>

<br>

<b>real    0m1.912s</b><br>

user    0m0.022s<br>

sys     0m0.008s</tt><br>

<br>

Almost 2 seconds there, and that is just for one command. So even 2

hobbit servers polling simultaneously will still take over 15 minutes

just to poll 1000 servers. Having hobbit do the ssh's in parallel

wouldn't work either: I have tried something similar on far fewer

hosts, and even with the -c blowfish option the server CPU still hit 100%

from all the overhead.<br>

<br>
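To make the arithmetic easy to replay with your own numbers, here is a rough sketch. The function name poll_minutes is made up for illustration, and it assumes the hosts are split evenly across pollers that each work strictly one host at a time:<br>
<pre wrap="">
```python
def poll_minutes(hosts, seconds_per_host, pollers=1):
    """Minutes for one full polling pass when each poller works through
    its share of the hosts one after another."""
    # Assumes an even split of hosts and strictly serial polling per poller.
    total_seconds = hosts * seconds_per_host / pollers
    return total_seconds / 60


# Idealised 1-second case from earlier in this mail, one poller.
print(round(poll_minutes(1000, 1.0), 2))
# Measured 1.912 s for a single ssh command, split across two pollers.
print(round(poll_minutes(1000, 1.912, pollers=2), 2))
```
</pre>
That prints 16.67 and 15.93, which is where the "over 15 minutes" figure comes from.<br>
<br>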

The way that I get around this is to have bbproxy running on a DMZ

host, and have the hobbit/bb clients configured to use the bbproxy IP

as their BBDISPLAY, which then forwards the traffic out of the DMZ to

my hobbit server.  Not 100% secure, but using bb-fetch isn't either (an

attacker could compromise one of the remote servers, and modify one of

the commands that the hobbit user executes, thus giving them the

ability to communicate with the hobbit server, injecting something to

break the parsing engine, buffer overflows, etc). I will stop talking

about that now as I am getting off subject :)<br>

<br>

I agree that having similar functionality to bb-fetch could be useful

for a *few* remote/DMZ hosts, but it certainly doesn't scale well. Once

you reach a number of hosts whose polling time exceeds the hobbit

refresh interval, you are done.  I know it would be "nice" if we didn't

have to upgrade remote clients and maintain them, but your solution

involves ssh keys, so just use those same keys and a script to roll out

the updates :)<br>

<br>
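For a feel of where that limit sits, here is a small sketch. The name max_hosts is hypothetical, and the 300 second refresh interval is my assumption, so plug in whatever your server is actually set to:<br>
<pre wrap="">
```python
def max_hosts(interval_seconds, seconds_per_host, pollers=1):
    """Largest host count that serial pollers can cover in one interval."""
    # int() truncates, so a host that would not finish in time is not counted.
    return int(interval_seconds * pollers / seconds_per_host)


# 300 s is an assumed refresh interval; 1.912 s is the ssh timing measured above.
print(max_hosts(300, 1.912))
print(max_hosts(300, 1.912, pollers=3))
```
</pre>
With those inputs it prints 156 and 470, so under these assumptions even three polling servers fall well short of 1000 hosts.<br>
<br>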

-Charles<br>

</body>

</html>