[Xymon] FW: Troubleshooting Purple CONN and HTTP Tests in Xymon 4.3.10

Larry Barber lebarber at gmail.com
Wed Nov 7 00:17:54 CET 2012


Did your purples clear up? It can take a couple of minutes sometimes, depending
on how often you regen your web pages.

A quick hack to keep them from coming back would be to add a MAXTIME to the
xymonnet stanza in tasks.cfg.
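
As a rough sketch of what that could look like: the stanza below is illustrative rather than copied from anyone's tasks.cfg, and the 10m limit is an assumed value (it should be comfortably longer than a normal xymonnet run). As I understand it, xymonlaunch terminates a task that runs past its MAXTIME. The helper names sample_stanza and stanza_has are made up for this example; stanza_has only checks whether a given stanza already contains a keyword such as MAXTIME.

```shell
#!/bin/sh
# Illustrative xymonnet stanza with MAXTIME added. The CMD options and the
# 10m value are assumptions; keep whatever your own tasks.cfg already has.
sample_stanza() {
cat <<'EOF'
[xymonnet]
    NEEDS xymond
    CMD xymonnet --report --ping --checkresponse
    LOGFILE $XYMONSERVERLOGS/xymonnet.log
    INTERVAL 5m
    MAXTIME 10m
EOF
}

# stanza_has FILE STANZA KEYWORD
# Exit 0 if the [STANZA] section of FILE has a line whose first word is KEYWORD.
# Commented-out lines (e.g. "#MAXTIME 10m") do not count.
stanza_has() {
    awk -v want="[$2]" -v key="$3" '
        /^\[/ { in_stanza = ($1 == want); next }
        in_stanza && $1 == key { found = 1 }
        END { exit !found }
    ' "$1"
}
```

To check a live config, something like: stanza_has /path/to/tasks.cfg xymonnet MAXTIME && echo "MAXTIME is set"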

I'm not really sure what else to tell you. If the process hangs again you
might try to "kill -6" it and send the resulting core dump to Henrik.
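
The check-and-abort step could be scripted along these lines. This is a sketch, not a tested tool: the function names are made up, and it parses the ps "etime" column ([[dd-]hh:]mm:ss) on the assumption that an older ps may not offer elapsed time in plain seconds.

```shell
#!/bin/sh
# etime_to_seconds ETIME
# Convert ps "etime" output ([[dd-]hh:]mm:ss) to a number of seconds.
etime_to_seconds() {
    echo "$1" | awk '{
        days = 0
        t = $1
        if (split(t, d, "-") == 2) { days = d[1]; t = d[2] }
        n = split(t, p, ":")
        if (n == 3) secs = p[1] * 3600 + p[2] * 60 + p[3]
        else        secs = p[1] * 60 + p[2]
        print days * 86400 + secs
    }'
}

# old_xymonnet_pids LIMIT_SECONDS
# Print the PID of every process named xymonnet that has been running
# longer than LIMIT_SECONDS.
old_xymonnet_pids() {
    ps -eo pid=,etime=,comm= | while read -r pid etime comm; do
        [ "$comm" = "xymonnet" ] || continue
        if [ "$(etime_to_seconds "$etime")" -gt "$1" ]; then
            echo "$pid"
        fi
    done
}
```

With a 600 second threshold (an arbitrary choice), the abort would be: for pid in $(old_xymonnet_pids 600); do kill -6 "$pid"; done. Whether a core file actually gets written depends on the core size limit (ulimit -c) in effect for the process.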

Thanks,
Larry Barber

On Tue, Nov 6, 2012 at 9:15 AM, Don Kuhlman <Don.Kuhlman at schawk.com> wrote:

>  Hi Larry/all. Sorry I didn't post the last reply to the list.
>
>  Update – Larry suggested looking for a hung xymonnet process – found
> one. Killed that.
> Changed tasks.cfg to add --debug and am now getting log updates in
> xymonnet.log
> Looked for another xymonnet process and don't see any.
> The web pages are still showing purple on the CONN, HTTP, and the XYMONNET
> status is also purple.
>
>  Thanks for your help Larry.
>
>  Any further suggestions as to what to look for in the log or elsewhere
> that may indicate the problem?
>
>  Don K
>  From: Larry Barber <lebarber at gmail.com>
> Date: Tue, 6 Nov 2012 08:47:55 -0600
>
> To: Don Kuhlman <don.kuhlman at schawk.com>
> Cc: Xymon Email List <xymon at xymon.com>
> Subject: Re: [Xymon] FW: Troubleshooting Purple CONN and HTTP Tests in
> Xymon 4.3.10
>
>  Did you check to see if a xymonnet process is/was still running? If a
> process gets hung for some reason xymonlaunch won't start a new process. I
> had this happen to me once, but only once. There is also a --debug flag for
> xymonnet; it produces a _lot_ of output, but it might give you some
> idea what is going on.
>
>  Thanks,
> Larry Barber
>
> On Tue, Nov 6, 2012 at 8:02 AM, Don Kuhlman <Don.Kuhlman at schawk.com> wrote:
>
>>  Thanks Larry. Looks like everything went purple again at 6:45 this
>> morning.  The logs still show 0 bytes.
>> Any other suggestions for trying to figure this out?
>>
>>  Regards,
>>
>>  Don
>>
>>   From: Larry Barber <lebarber at gmail.com>
>> Date: Mon, 5 Nov 2012 17:19:53 -0600
>> To: Don Kuhlman <don.kuhlman at schawk.com>
>>
>> Subject: Re: [Xymon] FW: Troubleshooting Purple CONN and HTTP Tests in
>> Xymon 4.3.10
>>
>>  Xymonnet tends to be pretty quiet unless something goes wrong. You won't
>> be able to tell for sure until you get one of your purple storms.
>>
>>  Alerts are handled by a different module. Look in tasks.cfg to find it.
>>
>>  Thanks,
>> Larry Barber
>>
>> On Mon, Nov 5, 2012 at 3:53 PM, Don Kuhlman <Don.Kuhlman at schawk.com> wrote:
>>
>>>  Hi Larry/all.  I've noticed that the xymonnet.log and
>>> xymonnet-again.log files are staying at 0 bytes.  Does that seem to be
>>> indicating a problem?
>>> (and Xymon hasn't gone purple all day, but I'm still not sending any
>>> email alerts to anyone).
>>>
>>>  -rw-rw-rw- 1 xymon xymon        0 Nov  5 15:05
>>> /var/log/xymon/xymonnet-again.log
>>> -rw-rw-rw- 1 xymon xymon        0 Nov  5 15:07
>>> /var/log/xymon/xymonnet.log
>>>
>>>  Thanks
>>>
>>>  Don K
>>>
>>>
>>>
>>>   From: Larry Barber <lebarber at gmail.com>
>>> Date: Mon, 5 Nov 2012 11:19:32 -0600
>>> To: Don Kuhlman <don.kuhlman at schawk.com>
>>>  Cc: Xymon Email List <xymon at xymon.com>
>>> Subject: Re: [Xymon] FW: Troubleshooting Purple CONN and HTTP Tests in
>>> Xymon 4.3.10
>>>
>>>  All the server side Xymon logs are in /var/log/xymon by default. Since
>>> you say that you are getting purple storms for conn and http tests, this
>>> suggests that the problem is likely with your xymonnet process. Check the
>>> xymonnet log, and when you see the purples check to see if there is a
>>> xymonnet instance running. If this instance has been running for more than
>>> a few minutes, kill it. If the xymonnet process is hanging, you might want
>>> to set the MAXTIME parameter on the xymonnet process in tasks.cfg. Doesn't
>>> really fix the problem, but it will at least stop things from going
>>> purple.
>>>
>>>  Thanks,
>>> Larry Barber
>>>
>>> On Mon, Nov 5, 2012 at 10:01 AM, Don Kuhlman <Don.Kuhlman at schawk.com> wrote:
>>>
>>>>  Update to this. While googling further, I saw a thread titled
>>>> "[hobbit] stale alerts".  This mentioned that there could be an external
>>>> script that I created which may cause issues for xymon when it runs.  I do
>>>> have a diskstat.sh script that may be causing problems. For now, I'm
>>>> setting it to DISABLED in the tasks.cfg file.
>>>>
>>>>  Is there a way to see log information in xymon to try and verify
>>>> something like this?
>>>>
>>>>  Thanks
>>>>
>>>>  Don K
>>>>
>>>>   From: Don Kuhlman <don.kuhlman at schawk.com>
>>>> Date: Mon, 5 Nov 2012 08:34:29 -0600
>>>> To: Xymon Email List <xymon at xymon.com>
>>>> Subject: Troubleshooting Purple CONN and HTTP Tests in Xymon 4.3.10
>>>>
>>>>   Hi folks.  We've been running xymon for about 10 months now. It's
>>>> been fine all this time.
>>>>
>>>>  However last week around Wednesday we started getting purple storms
>>>> on the CONN and HTTP tests for all our hosts.
>>>> I stop Xymon and restart it, or reboot the server (Linux 5.x) and then
>>>> it comes back ok.
>>>> This also happened Thursday, and then again Saturday around 2PM CST.
>>>>
>>>>  Anyone have a link or source for which logs to look in on the server
>>>> or xymon to see what may be causing the CONN and HTTP tests to randomly
>>>> start failing like this or where to start troubleshooting?
>>>>
>>>>  Can I use xymonlaunch --debug like this to see what is happening?
>>>>          /usr/lib64/xymon/server/bin/xymonlaunch --debug
>>>> --config=/usr/lib64/xymon/server/etc/tasks.cfg
>>>> --env=/usr/lib64/etc/xymonserver.cfg
>>>>
>>>>
>>>>
>>>>  While searching the xymon forum and message boards, I saw some things
>>>> that say it may be disk space or inodes, but it seems like we are ok there -
>>>>  df -i
>>>> Filesystem            Inodes   IUsed   IFree IUse% Mounted on
>>>> /dev/sda2            3899392  204731 3694661    6% /
>>>> tmpfs                 490139       6  490133    1% /dev/shm
>>>> /dev/sda1              32768      51   32717    1% /boot
>>>>
>>>>  df
>>>>  Filesystem           1K-blocks      Used Available Use% Mounted on
>>>> /dev/sda2             61312028   5748784  52448700  10% /
>>>> tmpfs                  1960556       188   1960368   1% /dev/shm
>>>> /dev/sda1               516040     87716    402112  18% /boot
>>>>
>>>>  DNS also seems fine.
>>>>
>>>>  Thanks
>>>>
>>>>  Don K
>>>>
>>>> _______________________________________________
>>>> Xymon mailing list
>>>> Xymon at xymon.com
>>>> http://lists.xymon.com/mailman/listinfo/xymon
>>>>
>>>>
>>>
>>
>