[Xymon] Spurious purple messages

Colin Coe colin.coe at gmail.com
Thu Sep 17 00:28:19 CEST 2015


Glauber, I can confirm there are no cron jobs or similar that alter the time.

Phil, I can confirm that it is a false positive.

I figure there must be some stale data somewhere but I've not found
it.   What process sends the notifications?  Where does this process
get its data?
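
For anyone following along, the log window Vernon suggested checking can be worked out from the 13:45 alert time. This is only a rough sketch in Python: the function name and the example date are my own placeholders, and the 30 to 35 minute lead comes from Vernon's earlier note, not from anything Xymon itself reports.

```python
from datetime import datetime, timedelta


def purple_log_window(alert_time,
                      min_lead=timedelta(minutes=30),
                      max_lead=timedelta(minutes=35)):
    """Return (start, end) of the period to inspect before a purple alert.

    Assumption: a purple needs roughly 30 minutes without a report, so the
    last good report should fall 30-35 minutes before the alert email.
    """
    return alert_time - max_lead, alert_time - min_lead


# Hypothetical example date; the alerts arrive at 13:45 every day.
alert = datetime(2015, 9, 16, 13, 45)
start, end = purple_log_window(alert)
print(start.strftime("%H:%M"), "to", end.strftime("%H:%M"))  # 13:10 to 13:15
```

So the logs from about 13:10 to 13:15 are the ones worth reading closely.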

Thanks all

On Wed, Sep 16, 2015 at 10:01 PM, Ribeiro, Glauber
<glauber.ribeiro at experian.com> wrote:
> Sorry, I wasn't clear. I was wondering if there could be some process set up in cron to adjust the time, which could be causing this (bumping the server time once a day). Just hypothetical, unlikely.
>
> g
>
> -----Original Message-----
> From: Colin Coe [mailto:colin.coe at gmail.com]
> Sent: Wednesday, September 16, 2015 01:26
> To: Ribeiro, Glauber
> Cc: Vernon Everett; xymon at xymon.com
> Subject: Re: [Xymon] Spurious purple messages
>
> Hi all
>
> The date/time is set correctly:
> ---
> timedatectl
>       Local time: Wed 2015-09-16 14:23:45 AWST
>   Universal time: Wed 2015-09-16 06:23:45 UTC
>         RTC time: Wed 2015-09-16 06:23:42
>         Timezone: Australia/Perth (AWST, +0800)
>      NTP enabled: yes
> NTP synchronized: yes
>  RTC in local TZ: no
>       DST active: n/a
> ---
>
> fping reports "host is alive", and ping returns its normal successful
> output.
>
>
> Does anyone else have any ideas on this? I really don't want to have to
> blow this server away and start again...
>
> Thanks
>
> On Tue, Sep 15, 2015 at 11:44 PM, Ribeiro, Glauber
> <glauber.ribeiro at experian.com> wrote:
>> Could it be something with the clock on the xymon server? Maybe some cron process to synchronize to a time server?
>>
>> -----Original Message-----
>> From: Xymon [mailto:xymon-bounces at xymon.com] On Behalf Of Colin Coe
>> Sent: Monday, September 14, 2015 22:29
>> To: Vernon Everett
>> Cc: xymon at xymon.com
>> Subject: Re: [Xymon] Spurious purple messages
>>
>> Hi Vernon,
>>
>> Yep, very interesting.  The purple messages come through every day at
>> about the same time, give or take a minute or so.
>>
>> Yep, pings work and the normal "main view" and "all non-green view" work fine.
>>
>> The logs look fine.  I'd really like to get to the bottom of this...
>>
>> Thanks
>>
>> CC
>>
>> On Tue, Sep 15, 2015 at 10:06 AM, Vernon Everett
>> <everett.vernon at gmail.com> wrote:
>>> That's interesting.
>>> No idea what it means, or where to go from here, but it's certainly
>>> interesting.
>>>
>>> Does it happen at the exact same time every day?
>>> Have you tried a ping from the Xymon host to the client at or around the
>>> time of the issue, to see if there are any oddities?
>>>
>>> Is there anything in the logs?
>>>
>>>
>>> On 14 September 2015 at 15:17, Colin Coe <colin.coe at gmail.com> wrote:
>>>>
>>>> OK, looking at this again.  The main view looks fine, but the 'conn'
>>>> test on every host is a yellow circle with a question mark (unknown)
>>>> in the snapshot report view since September 4, 2015 at 13:32:42.
>>>>
>>>> September 4, 2015 at 13:32:41 and earlier look fine.
>>>>
>>>> Thanks
>>>>
>>>> On Sat, Sep 12, 2015 at 5:48 PM, Vernon Everett
>>>> <everett.vernon at gmail.com> wrote:
>>>> > Good to know it's not just me that fights with SELinux. :-)
>>>> >
>>>> > Now that it works, what does the snapshot report reveal at the time the
>>>> > purple alerts go out?
>>>> >
>>>> > Purples require a "no report" for 30 minutes to trigger.
>>>> > You might want to check all your logs at around 30-35 minutes before the
>>>> > emails.
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > On 11 September 2015 at 18:13, Colin Coe <colin.coe at gmail.com> wrote:
>>>> >>
>>>> >> Almost...
>>>> >>
>>>> >> Turned out to be SELinux, my old nemesis.  :)
>>>> >>
>>>> >>
>>>> >>
>>>> >> On Tue, Sep 8, 2015 at 5:37 PM, Vernon Everett
>>>> >> <everett.vernon at gmail.com>
>>>> >> wrote:
>>>> >> > That might be a permissions thing.
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >> > On 8 September 2015 at 19:15, Colin Coe <colin.coe at gmail.com> wrote:
>>>> >> >>
>>>> >> >> Hi Vernon
>>>> >> >>
>>>> >> >> Thanks for the really good info.  The message serial numbers are
>>>> >> >> different every day but the messages are sent at the same time
>>>> >> >> (13:45) daily for all tests on all hosts.
>>>> >> >>
>>>> >> >> The network is not congested nor is the SAN under any kind of
>>>> >> >> pressure.
>>>> >> >>
>>>> >> >> Interestingly, trying to do the snapshot report gave me "Cannot
>>>> >> >> create output directory".
>>>> >> >>
>>>> >> >> Thanks again
>>>> >> >>
>>>> >> >> CC
>>>> >> >>
>>>> >> >> On Tue, Sep 8, 2015 at 3:56 PM, Vernon Everett
>>>> >> >> <everett.vernon at gmail.com>
>>>> >> >> wrote:
>>>> >> >> > Hi Colin
>>>> >> >> >
>>>> >> >> > What do the client hosts share in common?
>>>> >> >> > In the past I have seen a client overload their storage system,
>>>> >> >> > overflowing buffers and exceeding the storage array's ability to
>>>> >> >> > process IO requests. Of course this caused general disk latency,
>>>> >> >> > which slowed things down to the point of a purple flood.
>>>> >> >> > There was no simple solution to that one, except to buy more
>>>> >> >> > storage, which they did.
>>>> >> >> >
>>>> >> >> > Also, check the "serial numbers" on the messages. Is this a repeat
>>>> >> >> > of an older message, in which case Xymon might have something fishy
>>>> >> >> > going on, or are they new messages every day, as in it really
>>>> >> >> > thinks there is a problem?
>>>> >> >> >
>>>> >> >> > Xymon only updates pages every 2 and 5 minutes, depending on the
>>>> >> >> > page you are looking at, meaning you could wait up to 7 minutes for
>>>> >> >> > the real status to appear.
>>>> >> >> > A purple takes 30 minutes to trigger.
>>>> >> >> > With some unfortunate and highly improbable timing on whatever is
>>>> >> >> > triggering these events, it's possible you might not see the
>>>> >> >> > purple.
>>>> >> >> > Have you pulled up a "snapshot report" for the exact time of the
>>>> >> >> > messages?
>>>> >> >> >
>>>> >> >> > Something else unlikely, but possible, is the network.
>>>> >> >> > The conn test uses ping, which is ICMP.
>>>> >> >> > The Xymon agent sends using TCP.
>>>> >> >> > Is there anything interesting happening on the network at the time?
>>>> >> >> >
>>>> >> >> > Regards
>>>> >> >> > Vernon
>>>> >> >> >
>>>> >> >> >
>>>> >> >> >
>>>> >> >> > On 8 September 2015 at 11:39, Colin Coe <colin.coe at gmail.com>
>>>> >> >> > wrote:
>>>> >> >> >>
>>>> >> >> >> Hi all
>>>> >> >> >>
>>>> >> >> >> Since Friday September 4, I've started receiving "stopped
>>>> >> >> >> reporting (PURPLE)" messages for all tests on all hosts from one
>>>> >> >> >> of our Xymon servers.
>>>> >> >> >>
>>>> >> >> >> The host status, as shown in the Main View, is green for all hosts
>>>> >> >> >> and tests.  No purple at all.
>>>> >> >> >>
>>>> >> >> >> The "stopped reporting (PURPLE)" messages are being sent at the
>>>> >> >> >> same time every day, 1:45PM.
>>>> >> >> >>
>>>> >> >> >> Any advice on how I should track this down?
>>>> >> >> >>
>>>> >> >> >> Thanks
>>>> >> >> >> _______________________________________________
>>>> >> >> >> Xymon mailing list
>>>> >> >> >> Xymon at xymon.com
>>>> >> >> >> http://lists.xymon.com/mailman/listinfo/xymon
>>>> >> >> >
>>>> >> >> >
>>>> >> >> >
>>>> >> >> >
>>>> >> >> > --
>>>> >> >> > "Accept the challenges so that you can feel the exhilaration of
>>>> >> >> > victory"
>>>> >> >> > - General George Patton
>>>> >> >
>>>> >
>>>


