[Xymon] Gaps in graphs

Jeremy Laidman jeremy at laidman.org
Sat Mar 6 08:31:10 CET 2021


On Sat, 6 Mar 2021 at 02:04, Scot Kreienkamp <Scot.Kreienkamp at la-z-boy.com>
wrote:

> 2021-03-03 01:24:19.002264 xxx/netstat.rrd: Bug - duplicate RRD data with
> same timestamp 1614731059, different data
>
>
>
> That usually happens because graphs by default store data once every 5
> minutes.  However, if it receives data more often, say every minute, then
> it can’t store that in the RRD because it can only store one data point
> every 5 minutes.  Since it’s receiving data more often than once in the 5
> minute window the RRD backend triggers that message.
>

Yes, what Scot said. This could mean you have two different sources of
client data messages and status messages, for example two copies of the
Xymon client running on the host being monitored.

However, duplicate messages would not, in themselves, lead to missing data.
They would only cause the extra data to be dropped: the first data point in
a 5-minute window would still be accepted, so there would be no gaps in the
data. The only way I can see this being a symptom of a problem that also
causes gaps in graphs is if the clock on the host is jittering wildly (such
as if it were a VM on a heavily loaded host server), causing some sequential
messages to arrive at the Xymon server too close together. This is quite
unlikely, so I'm not sure this is related to the gaps. Instead, you might
just have two problems to solve: duplicate data sources, and as-yet
unexplained gaps in your graphs.

Are you receiving these "duplicate RRD data" messages every 5 minutes, or
only occasionally (such as when you're seeing gaps in your graphs)?

It might be helpful to see one of your graphs with gaps in it.

Also, can you provide maybe 10 sequential log messages with the "duplicate
RRD data" in them? I'd like to get a sense of their regularity and
frequency.
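
If you'd rather measure the regularity yourself, here is a rough Python
sketch. It assumes every log line looks like the one quoted above (wall-clock
time, RRD file name, then the "duplicate RRD data" text); the regular
expression and the function name duplicate_intervals are my own and may need
adjusting for your Xymon version.

```python
import re
from datetime import datetime

# Pattern derived from the single log line quoted in this thread;
# this is an assumption, other versions may format the message differently.
LOG_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+ (\S+): "
    r"Bug - duplicate RRD data with same timestamp (\d+)"
)

def duplicate_intervals(lines):
    """Return {rrd_file: [seconds between successive duplicate messages]}."""
    last_seen = {}
    intervals = {}
    for line in lines:
        m = LOG_RE.match(line.strip())
        if not m:
            continue  # not a "duplicate RRD data" message
        logged_at = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        rrd = m.group(2)
        gaps = intervals.setdefault(rrd, [])
        if rrd in last_seen:
            gaps.append((logged_at - last_seen[rrd]).total_seconds())
        last_seen[rrd] = logged_at
    return intervals
```

Pass it the lines of whichever log file the messages appear in. Intervals
that sit steadily around 300 seconds would suggest a second data source
reporting every cycle; scattered intervals would suggest something more
occasional.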

One last thing to look at. Are the gaps actual missing data points, or are
they values of zero? The way to tell is to dump the RRD file's contents
using something like "rrdtool fetch netstat.rrd AVERAGE | tail -100" (or
"less" rather than "tail -100") and look for either zero or low numbers, or
NaN (not a number) entries. [Note that the last few are usually NaN because
they're still waiting for updates, so you can ignore those.]
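
If eyeballing the dump gets tedious, something like the following Python
sketch could sort the rows for you. It assumes the usual "rrdtool fetch"
layout: a header line of data source names, then "timestamp: value value ..."
rows. The function name classify_rows is just illustrative.

```python
import math

def classify_rows(fetch_output):
    """Split `rrdtool fetch` rows into NaN timestamps and all-zero timestamps."""
    nan_rows, zero_rows = [], []
    for line in fetch_output.splitlines():
        ts, sep, rest = line.partition(":")
        if not sep or not ts.strip().isdigit():
            continue  # header line with data source names, or a blank line
        values = [float(v) for v in rest.split()]  # float() accepts "nan" and "-nan"
        if not values:
            continue
        if any(math.isnan(v) for v in values):
            nan_rows.append(int(ts))   # missing data point
        elif all(v == 0.0 for v in values):
            zero_rows.append(int(ts))  # data was stored, but as zero
    return nan_rows, zero_rows
```

You could feed it the output of subprocess.run(["rrdtool", "fetch",
"netstat.rrd", "AVERAGE"], capture_output=True, text=True).stdout. As noted
above, NaN rows at the very end are expected and can be ignored; NaN rows in
the middle are the real gaps.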

Cheers
Jeremy