[Xymon] Spurious purple messages

Colin Coe colin.coe at gmail.com
Fri Sep 11 10:13:29 CEST 2015


Almost...

Turned out to be SELinux, my old nemesis.  :)
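
For anyone chasing a similar symptom, here is a minimal sketch of pulling SELinux denials out of audit log lines so they can be lined up against the time the messages go out. The thread does not say which denial was involved, so the function name `parse_avc_denials` and the fields it keeps are illustrative assumptions, not what was actually run here. It assumes the usual `type=AVC msg=audit(...)` record layout written by auditd (by default to /var/log/audit/audit.log on RHEL-like systems).

```python
import re
from datetime import datetime, timezone

# Header of an audit AVC denial: type=AVC msg=audit(EPOCH.MILLIS:SERIAL): avc: denied { perms }
AVC_RE = re.compile(
    r"^type=AVC msg=audit\((\d+)\.\d+:\d+\): avc:\s+denied\s+\{([^}]*)\}(.*)$"
)
# key=value pairs in the rest of the record; values may be double-quoted
FIELD_RE = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_avc_denials(lines):
    """Return one dict per AVC denial found in the given audit log lines.

    Hypothetical helper for illustration; lines that are not AVC denials
    are skipped silently.
    """
    denials = []
    for line in lines:
        match = AVC_RE.match(line.strip())
        if not match:
            continue
        fields = {key: value.strip('"') for key, value in FIELD_RE.findall(match.group(3))}
        denials.append({
            # Audit timestamps are seconds since the epoch; kept in UTC here
            "time": datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc),
            "perms": match.group(2).split(),
            "comm": fields.get("comm"),
            "scontext": fields.get("scontext"),
            "tcontext": fields.get("tcontext"),
            "tclass": fields.get("tclass"),
        })
    return denials
```

Feeding it the lines of the audit log and keeping only the entries whose "time" falls near the daily 13:45 run (after converting from UTC to local time) should show whether a denial coincides with the messages. On a live system, `ausearch -m avc` reports the same records without any scripting.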



On Tue, Sep 8, 2015 at 5:37 PM, Vernon Everett <everett.vernon at gmail.com> wrote:
> That might be a permissions thing.
>
>
>
> On 8 September 2015 at 19:15, Colin Coe <colin.coe at gmail.com> wrote:
>>
>> Hi Vernon
>>
>> Thanks for the really good info.  The message serial numbers are
>> different every day but the messages are sent at the same time (13:45)
>> daily for all tests on all hosts.
>>
>> The network is not congested nor is the SAN under any kind of pressure.
>>
>> Interestingly, trying to do the snapshot report gave me "Cannot create
>> output directory".
>>
>> Thanks again
>>
>> CC
>>
>> On Tue, Sep 8, 2015 at 3:56 PM, Vernon Everett <everett.vernon at gmail.com> wrote:
>> > Hi Colin
>> >
>> > What do the client hosts share in common?
>> > I have seen a case in the past where a client was overloading their
>> > storage system, overflowing buffers and exceeding the storage array's
>> > ability to process IO requests. Of course this caused general disk
>> > latency, which slowed things down to the point of a purple flood.
>> > There was no simple solution to that one, except to buy more storage,
>> > which they did.
>> >
>> > Also, check the "serial numbers" on the messages. Is this a repeat of
>> > an older message, in which case Xymon might have something fishy going
>> > on, or are they new messages every day, as in it really thinks there is
>> > a problem?
>> >
>> > Xymon only updates pages every 2 and 5 minutes, depending on the page
>> > you are looking at, meaning you could wait up to 7 minutes for the real
>> > status to appear.
>> > A purple takes 30 minutes to trigger.
>> > With some unfortunate and highly improbable timing on whatever is
>> > triggering these events, it's possible you might not see the purple.
>> > Have you pulled up a "snapshot report" for the exact time of the
>> > messages?
>> >
>> > Something else unlikely, but possible, is the network.
>> > The conn test uses ping, which is ICMP.
>> > The Xymon agent sends using TCP.
>> > Is there anything interesting happening on the network at the time?
>> >
>> > Regards
>> > Vernon
>> >
>> >
>> >
>> > On 8 September 2015 at 11:39, Colin Coe <colin.coe at gmail.com> wrote:
>> >>
>> >> Hi all
>> >>
>> >> Since Friday September 4, I've started receiving "stopped reporting
>> >> (PURPLE)" messages for all tests on all hosts from one of our Xymon
>> >> servers.
>> >>
>> >> The host status, as shown in the Main View, is green for all hosts and
>> >> tests.  No purple at all.
>> >>
>> >> The "stopped reporting (PURPLE)" messages are being sent at the same
>> >> time every day, 1:45PM.
>> >>
>> >> Any advice on how I should track this down?
>> >>
>> >> Thanks
>> >> _______________________________________________
>> >> Xymon mailing list
>> >> Xymon at xymon.com
>> >> http://lists.xymon.com/mailman/listinfo/xymon
>> >
>> >
>> >
>> >
>> > --
>> > "Accept the challenges so that you can feel the exhilaration of victory"
>> > - General George Patton
>


