[Xymon] Gaps in graphs

Jeremy Laidman jeremy at laidman.org
Mon Mar 8 23:29:19 CET 2021


On Mon, 8 Mar 2021 at 19:21, Carl Melgaard <Carl.Melgaard at stab.rm.dk> wrote:

> >Are you receiving these "duplicate RRD data" messages every 5 minutes,
> or only occasionally (such as when you're seeing gaps in your graphs)?
>
> >It might be helpful to see one of your graphs with gaps in it.
>
> >Also, can you provide maybe 10 sequential log messages with the
> "duplicate RRD data" in them? I'd like to get a sense of their regularity
> and frequency.
>
>
>
> 2021-03-03 01:24:19.002264 x/netstat.rrd: Bug - duplicate RRD data with
> same timestamp 1614731059, different data
>
> 2021-03-03 02:55:15.002852 x/netstat.rrd: Bug - duplicate RRD data with
> same timestamp 1614736515, different data
>
> 2021-03-04 10:01:17.004140 x/netstat.rrd: Bug - duplicate RRD data with
> same timestamp 1614848477, different data
>
> 2021-03-05 14:15:25.007389 x/netstat.rrd: Bug - duplicate RRD data with
> same timestamp 1614950125, different data
>
> 2021-03-05 14:15:25.007523 x/ifstat.eno16780032.rrd: Bug - duplicate RRD
> data with same timestamp 1614950125, different data
>
> 2021-03-05 22:56:18.014486 x/netstat.rrd: Bug - duplicate RRD data with
> same timestamp 1614981378, different data
>
> 2021-03-05 22:56:18.015006 x/ifstat.eno16780032.rrd: Bug - duplicate RRD
> data with same timestamp 1614981378, different data
>
> 2021-03-06 12:30:28.002023 x/netstat.rrd: Bug - duplicate RRD data with
> same timestamp 1615030228, different data
>
> 2021-03-06 12:30:28.002952 x/ifstat.eno16780032.rrd: Bug - duplicate RRD
> data with same timestamp 1615030228, different data
>

Interesting. It seems to be a rare occurrence - no more than two duplicate
data points in a day - almost too few to notice. Are your gaps more than 5
minutes (one sample) long? It might be helpful for you to include an
example gappy graph for us to see.

Some of these errors relate to netstat and others to ifstat processing.
Both parsers receive data from the same client data message. Interestingly,
only some of the errors for netstat.rrd coincide with ones for ifstat.rrd.
Where both appear, the timestamps match exactly, so this is unlikely to be
a coincidence, but I'm not sure what to make of it TBH.
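
If it helps to see the pattern at a glance, here is a rough sketch that groups
the "duplicate RRD data" messages by their epoch timestamp. It assumes the
wrapped log lines have been re-joined into single lines, as in the sample
below (copied from the messages quoted above). The names LINE_RE,
group_duplicates and SAMPLE are made up for this illustration and are not
part of Xymon.

```python
import re
from collections import defaultdict

# Matches the log lines quoted in this thread once wrapped lines are re-joined.
LINE_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+ "
    r"(?P<rrd>\S+): Bug - duplicate RRD data with same timestamp (?P<epoch>\d+)"
)

def group_duplicates(lines):
    """Map each epoch timestamp to the RRD files reported for it, in log order."""
    groups = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            groups[int(m.group("epoch"))].append(m.group("rrd"))
    return dict(groups)

# A few of the messages from this thread, re-joined onto one line each.
SAMPLE = [
    "2021-03-03 01:24:19.002264 x/netstat.rrd: Bug - duplicate RRD data with same timestamp 1614731059, different data",
    "2021-03-04 10:01:17.004140 x/netstat.rrd: Bug - duplicate RRD data with same timestamp 1614848477, different data",
    "2021-03-05 14:15:25.007389 x/netstat.rrd: Bug - duplicate RRD data with same timestamp 1614950125, different data",
    "2021-03-05 14:15:25.007523 x/ifstat.eno16780032.rrd: Bug - duplicate RRD data with same timestamp 1614950125, different data",
]

groups = group_duplicates(SAMPLE)
for epoch, rrds in sorted(groups.items()):
    print(epoch, ", ".join(rrds))
```

Timestamps that list both netstat.rrd and an ifstat RRD are the ones where
both parsers complained about the same client data message.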

>One last thing to look at. Are the gaps actual missing data points, or are
> they values of zero? The way to tell this is to dump the RRD file's
> contents using something like "rrdtool fetch netstat.rrd AVERAGE | tail
> -100" (or "less" rather than "tail -100") and look for either zero or low
> numbers, or NaN (not a number) entries. [Note that the last few are
> usually NaN because they're still waiting for updates, so you can ignore
> those.]
>
>
>
> Currently I can't actually find a graph with a gap in it. I just noticed
> because it happened on the Xymon server itself. On my old setup, it never
> happened.
>

OK. I think your best bet to diagnose is going to be correlating log
messages or other events to the gaps.
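
One possible way to do that correlation, as a sketch only: save the output of
"rrdtool fetch netstat.rrd AVERAGE" to a file, pick out the rows that are NaN
(real gaps) versus all-zero, and see whether any of the epoch timestamps from
the "duplicate RRD data" messages fall within one 5-minute sample of a gap.
The function names and the sample rows below are invented for illustration;
the sample is not real data from your server. It assumes the usual fetch
layout of "timestamp: value value ...", with NaN printed as "nan" or "-nan".

```python
import math

def classify_rows(fetch_output):
    """Return (gaps, zeros): timestamps whose values are all NaN, or all zero."""
    gaps, zeros = [], []
    for line in fetch_output.splitlines():
        ts, sep, rest = line.partition(":")
        if not sep or not ts.strip().isdigit():
            continue  # header line with the DS names, or a blank line
        values = [float(v) for v in rest.split()]
        if not values:
            continue
        if all(math.isnan(v) for v in values):
            gaps.append(int(ts))
        elif all(v == 0.0 for v in values):
            zeros.append(int(ts))
    return gaps, zeros

def near_gaps(epochs, gaps, step=300):
    """Epochs lying within one sample interval (step seconds) of a gap row."""
    return [e for e in epochs if any(abs(e - g) <= step for g in gaps)]

# Invented sample in the shape of "rrdtool fetch" output (not real data).
SAMPLE = """\
                      ds0                 ds1

1614949800: 1.2000000000e+01 3.0000000000e+00
1614950100: nan nan
1614950400: 0.0000000000e+00 0.0000000000e+00
1614950700: -nan -nan
"""

gaps, zeros = classify_rows(SAMPLE)
matches = near_gaps([1614950125, 1614731059], gaps)
print("gaps:", gaps, "zeros:", zeros, "duplicates near a gap:", matches)
```

Remember that the last few rows of a real fetch are usually NaN anyway, so
those should be left out before drawing conclusions.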

You mentioned an "old setup". Can you describe what has changed from old to
new setup? Have you upgraded hardware/OS/Xymon server/Xymon client(s)?

You said that you noticed on the Xymon server itself. Has it only happened
to graphs for the Xymon server? I'm wondering if you have the Xymon client
AND the Xymon server both running on the same host?


> Also in xymonclient.log I get these quite alot, dunno if its related:
>
>
>
> mv: cannot stat '/dev/shm/logfetch.x.cfg.tmp': No such file or directory
>
> cat: /dev/shm/xymon_vmstat.x: No such file or directory
>
> cat: /dev/shm/xymon_vmstat.x: No such file or directory
>

Can you explain "quite alot"? Can you give an indication of how often these
occur?

This might very well be related. The logfetch and vmstat files are created
during the construction of the client data message. When these errors show
up, it's likely that some, if not all, of the client data message is
missing.

I'd be trying to correlate these log messages with the times that you get
gaps in your graphs. If they match, then it looks to be a problem with the
Xymon client.