[hobbit] Client interval question

Thu Dec 15 09:16:22 CET 2005

First off, I know I can come off as terse in e-mail, but my replies are
not personal attacks.

> It can be a bad idea sometimes, others not (for example, the reply from
> the person catching intermittent problems with BB running every minute)
>

Who ended up stating that the anomaly *was* detected at 5m intervals, but
only once every 13h instead of every hour.  But I still don't
understand how it will help *you*.

>
> A smaller sampling period can show things in a more granular
> aspect. For example, a process kicks off and 5 minutes later you
> see 100 errors (I'm keeping things generic for illustrative
> purposes). Were those 100 errors in the first minute? The last?
> Constantly throughout the 5 minutes?

The 5m averages over a week would be quite low compared to a single
5m plot with a spike in it.  From that, one could infer that in the last
5m things have not been 'normal'.
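
A minimal sketch of that comparison, assuming a week of 5m samples is
already in hand.  The name is_abnormal and the 3-sigma cutoff are
illustrative assumptions, not something hobbit or larrd provides:

```python
import statistics

def is_abnormal(history, latest, k=3.0):
    """Return True if `latest` sits more than k standard deviations
    above the mean of the historical 5m samples (k is an assumed cutoff)."""
    mean = statistics.fmean(history)
    spread = statistics.pstdev(history)
    return latest > mean + k * spread

# One week of 5m samples is 7 * 24 * 12 = 2016 points (made-up values).
week = [1, 2, 3, 2, 1, 2, 3, 2] * 252
print(is_abnormal(week, 100))  # a 100-error sample stands out
print(is_abnormal(week, 3))    # an ordinary sample does not
```

The point is only that the weekly 5m baseline already gives you
something to judge the latest sample against.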

>
> I'm not saying you're wrong, simply pointing out that it's not as
> black and white as you're making it.
>

And I am disagreeing with you ;)  I've been watching the data in  
these graphs for many many years now, and I have yet to come across a  
situation where having a 1m sampling/graphing period would have  
helped me fix/improve something . . .

It's like a story problem with too much information: it makes coming
up with the real answer harder in the end.  Most people don't have
the time/energy/brains to be able to sift all the data correctly.  And if
they do, the 5m samples are good enough.

Most people (including really smart people that are forgetful) can't  
deal with an auto-scaling y-axis.

> Something being just interesting initially can sometimes uncover  
> problems that
> you didn't see before.
>

Like I said, if you have a job where interesting is worthwhile,
wonderful.  In my experience, most folks that are running the
BB/hobbit tools are involved in the operational aspects of
infrastructure, not R&D.

>
> > With the stock larrd/hobbit RRD definitions you are correct.  He'll
> > only use one of the five, and whine about the timestamp of the other
> > four.
>
> Firstly, can you explain your comment in more detail?

RRD interpolates time series data to put a value at a fixed
interval.  That is why you hardly ever see integers in the data.  If
your sample comes in at 299s, RRD interpolates that value to what it
would have been at 300s.  How this is done can be tuned.  The default
settings with the RRAs expect data to arrive every 300s.  RRD will
only store one value per interval.
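
To make the 299s example concrete, here is a simplified sketch of the
idea for a gauge-style value: each reading is spread over the time since
the previous reading, and each 300s slot gets the time-weighted average.
This is a model of the concept under my own assumptions, not rrdtool's
actual code; normalize_gauge is a hypothetical name, and heartbeats,
unknown values and counter types are ignored.

```python
from collections import defaultdict

def normalize_gauge(samples, step=300):
    """samples: list of (timestamp, value) with strictly increasing
    timestamps. Returns {slot_end: value} for fully covered slots only."""
    weighted = defaultdict(float)  # slot_end -> sum of value * seconds
    covered = defaultdict(int)     # slot_end -> seconds accounted for
    prev_t = samples[0][0]
    for t, value in samples[1:]:
        if t <= prev_t:
            raise ValueError("timestamps must strictly increase")
        cur = prev_t
        while cur < t:
            slot_end = (cur // step + 1) * step
            seg_end = min(slot_end, t)
            # The reading at time t is taken to cover (prev_t, t].
            weighted[slot_end] += value * (seg_end - cur)
            covered[slot_end] += seg_end - cur
            cur = seg_end
        prev_t = t
    return {end: weighted[end] / step
            for end in sorted(weighted) if covered[end] == step}

# Readings arrive one second early: 10 at 299s, then 20 at 599s.
result = normalize_gauge([(0, 0), (299, 10), (599, 20)])
print(result)  # the slot ending at 300s holds a non-integer value
```

The slot ending at 300s gets 299 seconds of the first reading and one
second of the next, so integer inputs come out as a fractional value.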

> Secondly,
> I'm confused as to why you would state that I would "whine" about anything
> when you have no basis for a conclusion to that effect. It seems to be a
> rather pointed comment in a discussion that hasn't involved the use of
> language that would dictate a response like that.
>

"He'll whine" meant rrdtool, not you:

ERROR: illegal attempt to update using time 1042731000 when last update time is 1043099100 (minimum one second step)

That's whining in my book.  Sorry you thought I was speaking about you.

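
The rule behind that message is simply that each update must carry a
timestamp later than the previous one.  A sketch of the check, using the
numbers from the error above; check_update is a hypothetical helper for
illustration, not part of rrdtool:

```python
def check_update(last_update, new_time):
    """Raise ValueError unless new_time is strictly later than last_update."""
    if new_time <= last_update:
        raise ValueError(
            f"illegal attempt to update using time {new_time} "
            f"when last update time is {last_update} "
            "(minimum one second step)"
        )
    return new_time

last = check_update(1043099000, 1043099100)  # newer timestamp: accepted
try:
    check_update(last, 1042731000)           # older timestamp: rejected
except ValueError as err:
    print("ERROR:", err)
```
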
> That is a very good point you make. There is a difference between
> real-time analysis and capacity planning/trending. I don't however think
> that it is that far outside of hobbit's scope to try and leverage it for
> a more pointed analysis.

From a software development standpoint there is a lot to be said
for "Do one thing and do it well".  If architecting the RRD
framework for RTA breaks trending, it is a bad idea.

> My goal isn't to take every machine in my environment
> and make them into 1 minute sampling period machines. To have the
> ability to do so on a machine-by-machine basis could be useful
>

Which is why I proposed another client collector for this activity.

>
> > That's my design you inherited and because of the complexity of the
> > parts, I think it is a very solid design.
>
> I don't think anyone is really questioning that.
>

You are questioning that.  And that is fine.  I don't take it
personally that you think there may be a better way.  I know my way may
not be the best, but I sure know exactly *why* I chose it.

> Honestly, I don't claim to know anything about the way larrd and hobbit
> are coded in the slightest. There are difficulties to be sure, but part
> of having a community such as this is to foster ideas and innovation.
> Just because you don't think it's useful or that it's hard doesn't mean
> the same is true for everyone out there.

Ahhhhh, to the heart of the matter.   Don't suggest ideas in a public  
forum if you are not prepared to defend them.  Fostering ideas comes  
from intelligent discussions.  I merely wanted to understand why you  
felt you needed a higher sampling rate from a business perspective.

scott