[Xymon] Critical System Page -- HTTP 500 Error

EDSchminke at Hormel.com
Mon Aug 28 23:00:23 CEST 2017


I have another update on this issue.  I think I can pinpoint the problem a
little more specifically.  In my previous message, I blamed the "uptime"
test for causing the Critical Systems Page to crash.  After this weekend, I
discovered that there's a little more to it than that.

This weekend, the Critical Systems Page crashed due to a "disk" test being
non-green.  After picking random hosts/tests with varying degrees of
success, I finally noticed that it isn't necessarily "disk" or "uptime" that
causes it; rather, it is whichever test is the LAST test defined for a host
(or cloned host).  It just so happens that "uptime" usually ends up being
the LAST test defined for most of my hosts, since the Critical Systems Page
Editor sorts them as the configuration gets written.

To test, I made "disk" the last (only) test defined for a host.  I then
modified the thresholds for memory, disk and procs to put those tests into a
non-green state, and set the "monitoring time" window to 11:58PM to 11:59PM.
First, disk crashed the page.  I then duplicated the "disk" entry as
"memory", making that the last test for that host.  Disk no longer crashed
the page, but when I put memory into a non-green state, it would crash.  I
then made "procs" the last test for the host.  Memory no longer crashed the
page, but procs would after putting that test into a non-green state.

So in short, the conditions that trigger the crash are:
- Current time is OUTSIDE the monitoring time window.
- Target test is in a non-green state.
- Target test is the last (or only) test defined for a given host.
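
To make those three conditions concrete, here is a minimal Python sketch of
the check I was effectively doing by hand.  It is not Xymon code and does not
use any Xymon API; the function names, the list-of-test-names layout and the
color strings are all assumptions made purely for illustration.

```python
from datetime import time


def in_window(now, start, end):
    """True if `now` falls inside the monitoring window [start, end]."""
    if start <= end:
        return start <= now <= end
    # Window wraps past midnight, e.g. 23:00 to 01:00.
    return now >= start or now <= end


def matches_crash_conditions(host_tests, target, color, now, start, end):
    """Check the three observed conditions (hypothetical helper).

    host_tests: test names in the order they are defined for the host.
    target:     the test being put into a non-green state.
    color:      current status color of the target test.
    """
    outside_window = not in_window(now, start, end)
    non_green = color != "green"
    is_last = bool(host_tests) and host_tests[-1] == target
    return outside_window and non_green and is_last


# Monitoring window 11:58PM to 11:59PM, checked at noon.
window = (time(23, 58), time(23, 59))
noon = time(12, 0)
print(matches_crash_conditions(["disk", "memory"], "memory", "red", noon, *window))  # True
print(matches_crash_conditions(["disk", "memory"], "disk", "red", noon, *window))    # False
```

In the second call "disk" is non-green and outside its window, but it is no
longer the last test defined, which matches what I saw once "memory" was
added after it.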

I imagine this must be caused by something running off the end of the loop.


Erik D. Schminke | Associate Systems Programmer
Hormel Foods Corporation | One Hormel Place | Austin, MN 55912
Phone: (507) 434-6817
edschminke at hormel.com | www.hormelfoods.com




