[Xymon] FW: Troubleshooting Purple CONN and HTTP Tests in Xymon 4.3.10
Larry Barber
lebarber at gmail.com
Wed Nov 7 00:17:54 CET 2012
Did your purples clear up? It can take a couple of minutes sometimes, depending
on how often you regen your web pages.
A quick hack to keep them from coming back would be to add a MAXTIME to the
xymonnet stanza in tasks.cfg.
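For reference, a minimal sketch of what that might look like. The 10m value is
only an assumption (pick something comfortably longer than a normal test run),
and the rest of the stanza should stay exactly as it is in your install:

```
[xymonnet]
# ... keep your existing ENVFILE, NEEDS, CMD, LOGFILE and INTERVAL lines ...
# Assumed value: xymonlaunch kills the task if it runs longer than this.
	MAXTIME 10m
```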
I'm not really sure what else to tell you. If the process hangs again you
might try to "kill -6" it and send the resulting core dump to Henrik.
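A rough sketch of how you could spot a hung process before sending that signal.
The stale_pids helper name and the 300-second default are my own assumptions
(based on a 5-minute test interval), not anything that ships with Xymon:

```shell
#!/bin/sh
# Print the PIDs of processes whose elapsed run time exceeds a limit.
# Reads "PID ELAPSED_SECONDS" pairs on stdin, one pair per line.
# The 300-second default is an assumption, not a Xymon setting.
stale_pids() {
    limit="${1:-300}"
    awk -v limit="$limit" '$2 + 0 > limit + 0 { print $1 }'
}

# Usage sketch (not run here): list xymonnet processes older than 5 minutes.
# "etimes" needs a reasonably recent procps; older systems only have "etime".
#   ps -C xymonnet -o pid=,etimes= | stale_pids 300
#
# Signal 6 is SIGABRT on Linux; a core file is only written if the core
# size limit allows it (check with "ulimit -c").
#   kill -6 <pid>
```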
Thanks,
Larry Barber
On Tue, Nov 6, 2012 at 9:15 AM, Don Kuhlman <Don.Kuhlman at schawk.com> wrote:
> Hi Larry/all. Sorry I didn't post the last reply to the list.
>
> Update – Larry suggested looking for a hung xymonnet process – found
> one. Killed that.
> Changed tasks.cfg to add --debug and am now getting log updates in
> xymonnet.log
> Looked for another xymonnet process and don't see any.
> The web pages are still showing purple on the CONN, HTTP, and the XYMONNET
> status is also purple.
>
> Thanks for your help Larry.
>
> Any further suggestions as to what to look for in the log or elsewhere
> that may indicate the problem?
>
> Don K
> From: Larry Barber <lebarber at gmail.com>
> Date: Tue, 6 Nov 2012 08:47:55 -0600
>
> To: Don Kuhlman <don.kuhlman at schawk.com>
> Cc: Xymon Email List <xymon at xymon.com>
> Subject: Re: [Xymon] FW: Troubleshooting Purple CONN and HTTP Tests in
> Xymon 4.3.10
>
> Did you check to see if a xymonnet process is/was still running? If a
> process gets hung for some reason xymonlaunch won't start a new process. I
> had this happen to me once, but only once. There is also a --debug flag for
> xymonnet. It produces a _lot_ of output, but it might give you some
> idea what is going on.
>
> Thanks,
> Larry Barber
>
> On Tue, Nov 6, 2012 at 8:02 AM, Don Kuhlman <Don.Kuhlman at schawk.com> wrote:
>
>> Thanks Larry. Looks like everything went purple again at 6:45 this
>> morning. The logs still show 0 bytes.
>> Any other suggestions for trying to figure this out?
>>
>> Regards,
>>
>> Don
>>
>> From: Larry Barber <lebarber at gmail.com>
>> Date: Mon, 5 Nov 2012 17:19:53 -0600
>> To: Don Kuhlman <don.kuhlman at schawk.com>
>>
>> Subject: Re: [Xymon] FW: Troubleshooting Purple CONN and HTTP Tests in
>> Xymon 4.3.10
>>
>> Xymonnet tends to be pretty quiet unless something goes wrong. You won't
>> be able to tell for sure until you get one of your purple storms.
>>
>> Alerts are handled by a different module. Look in tasks.cfg to find it.
>>
>> Thanks,
>> Larry Barber
>>
>> On Mon, Nov 5, 2012 at 3:53 PM, Don Kuhlman <Don.Kuhlman at schawk.com> wrote:
>>
>>> Hi Larry/all. I've noticed that the xymonnet.log and
>>> xymonnet-again.log files are staying at 0 bytes. Does that seem to be
>>> indicating a problem?
>>> (and Xymon hasn't gone purple all day, but I'm still not sending any
>>> email alerts to anyone).
>>>
>>> -rw-rw-rw- 1 xymon xymon 0 Nov 5 15:05 /var/log/xymon/xymonnet-again.log
>>> -rw-rw-rw- 1 xymon xymon 0 Nov 5 15:07 /var/log/xymon/xymonnet.log
>>>
>>> Thanks
>>>
>>> Don K
>>>
>>>
>>>
>>> From: Larry Barber <lebarber at gmail.com>
>>> Date: Mon, 5 Nov 2012 11:19:32 -0600
>>> To: Don Kuhlman <don.kuhlman at schawk.com>
>>> Cc: Xymon Email List <xymon at xymon.com>
>>> Subject: Re: [Xymon] FW: Troubleshooting Purple CONN and HTTP Tests in
>>> Xymon 4.3.10
>>>
>>> All the server side Xymon logs are in /var/log/xymon by default. Since
>>> you say that you are getting purple storms for conn and http tests, this
>>> suggests that the problem is likely with your xymonnet process. Check the
>>> xymonnet log, and when you see the purples check to see if there is a
>>> xymonnet instance running. If this instance has been running for more than
>>> a few minutes, kill it. If the xymonnet process is hanging, you might want
>>> to set the MAXTIME parameter on the xymonnet process in tasks.cfg. Doesn't
>>> really fix the problem, but it will at least stop things from going
>>> purple.
>>>
>>> Thanks,
>>> Larry Barber
>>>
>>> On Mon, Nov 5, 2012 at 10:01 AM, Don Kuhlman <Don.Kuhlman at schawk.com> wrote:
>>>
>>>> Update to this. While googling further, I saw a thread titled
>>>> "[hobbit] stale alerts". This mentioned that there could be an external
>>>> script that I created which may cause issues for xymon when it runs. I do
>>>> have a diskstat.sh script that may be causing problems. For now, I'm
>>>> setting it to DISABLED in the tasks.cfg file.
>>>>
>>>> Is there a way to see log information in xymon to try and verify
>>>> something like this?
>>>>
>>>> Thanks
>>>>
>>>> Don K
>>>>
>>>> From: Don Kuhlman <don.kuhlman at schawk.com>
>>>> Date: Mon, 5 Nov 2012 08:34:29 -0600
>>>> To: Xymon Email List <xymon at xymon.com>
>>>> Subject: Troubleshooting Purple CONN and HTTP Tests in Xymon 4.3.10
>>>>
>>>> Hi folks. We've been running xymon for about 10 months now. It's
>>>> been fine all this time.
>>>>
>>>> However last week around Wednesday we started getting purple storms
>>>> on the CONN and HTTP tests for all our hosts.
>>>> I stop Xymon and restart it, or reboot the server (Linux 5.x) and then
>>>> it comes back ok.
>>>> This also happened Thursday, and then again Saturday around 2 PM CST.
>>>>
>>>> Anyone have a link or source for which logs to look in on the server
>>>> or xymon to see what may be causing the CONN and HTTP tests to randomly
>>>> start failing like this or where to start troubleshooting?
>>>>
>>>> Can I use xymonlaunch --debug like this to see what is happening?
>>>> /usr/lib64/xymon/server/bin/xymonlaunch --debug
>>>> --config=/usr/lib64/xymon/server/etc/tasks.cfg
>>>> --env=/usr/lib64/etc/xymonserver.cfg
>>>>
>>>>
>>>>
>>>> While searching the xymon forum and message boards, I saw some things
>>>> that say it may be disk space or inodes, but it seems like we are ok there -
>>>> df -i
>>>> Filesystem      Inodes  IUsed   IFree IUse% Mounted on
>>>> /dev/sda2      3899392 204731 3694661    6% /
>>>> tmpfs           490139      6  490133    1% /dev/shm
>>>> /dev/sda1        32768     51   32717    1% /boot
>>>>
>>>> df
>>>> Filesystem     1K-blocks    Used Available Use% Mounted on
>>>> /dev/sda2       61312028 5748784  52448700  10% /
>>>> tmpfs            1960556     188   1960368   1% /dev/shm
>>>> /dev/sda1         516040   87716    402112  18% /boot
>>>>
>>>> DNS also seems fine.
>>>>
>>>> Thanks
>>>>
>>>> Don K
>>>>
>>>> _______________________________________________
>>>> Xymon mailing list
>>>> Xymon at xymon.com
>>>> http://lists.xymon.com/mailman/listinfo/xymon
>>>>
>>>>
>>>
>>
>