[Xymon] xymond crashing! -- Please help!

Matt Vander Werf matt1299 at gmail.com
Sat Jan 30 18:33:18 CET 2016


See below for a snippet from the xymonlaunch.log file. It covers startup,
the first crash, the automatic restart attempts, and the repeated crashes
that follow.

Looking through the xymonlaunch log, it looks like this kind of pattern
shows up quite a few other times, dating back to November last year. But it
always seemed to resolve itself after a good while.

Not sure how long it will take to resolve itself this time (if it resolves
itself at all).

Any guidance is appreciated and let me know if there's anything I can do or
provide to help figure out this issue!

Thanks!!


2016-01-30 12:07:40 xymonlaunch starting
2016-01-30 12:07:40.975237 Loading tasklist configuration from /etc/xymon/tasks.cfg
2016-01-30 12:07:40 xymonlaunch: starting task [xymond]
2016-01-30 12:07:47 xymonlaunch: starting task [history]
2016-01-30 12:07:47 xymonlaunch: starting task [alert]
2016-01-30 12:07:47 xymonlaunch: starting task [clientdata]
2016-01-30 12:07:47 xymonlaunch: starting task [rrdstatus]
2016-01-30 12:07:47 xymonlaunch: starting task [rrddata]
2016-01-30 12:07:47 xymonlaunch: starting task [hostdata]
2016-01-30 12:07:47 xymonlaunch: starting task [storestatus]
2016-01-30 12:08:05.802277 Task xymond terminated by signal 6
2016-01-30 12:08:05 xymonlaunch: starting task [xymond]
2016-01-30 12:08:05.803642 Task xymonnet terminated by signal 15
2016-01-30 12:08:06.348868 Task xymond terminated, status 1
2016-01-30 12:08:11 xymonlaunch: starting task [xymond]
2016-01-30 12:08:11.896124 Task xymond terminated, status 1
2016-01-30 12:08:16 xymonlaunch: starting task [xymond]
2016-01-30 12:08:17.438144 Task xymond terminated, status 1
2016-01-30 12:08:22 xymonlaunch: starting task [xymond]
2016-01-30 12:08:22.981957 Task xymond terminated, status 1
2016-01-30 12:08:27 xymonlaunch: starting task [xymond]
2016-01-30 12:08:28.530953 Task xymond terminated, status 1
2016-01-30 12:08:28.531006 Postponing restart of [xymond] for 600 seconds from last start due to multiple failures
2016-01-30 12:18:31.888403 Releasing [xymond] from failure hold
2016-01-30 12:18:31 xymonlaunch: starting task [xymond]
2016-01-30 12:18:36 xymonlaunch: starting task [history]
2016-01-30 12:18:36 xymonlaunch: starting task [alert]
2016-01-30 12:18:36 xymonlaunch: starting task [clientdata]
2016-01-30 12:18:36 xymonlaunch: starting task [rrdstatus]
2016-01-30 12:18:36 xymonlaunch: starting task [rrddata]
2016-01-30 12:18:36 xymonlaunch: starting task [hostdata]
2016-01-30 12:18:36.888969 Releasing [xymonnet] from failure hold
2016-01-30 12:18:36 xymonlaunch: starting task [storestatus]
2016-01-30 12:19:57.503293 Task xymond terminated by signal 6
2016-01-30 12:19:57 xymonlaunch: starting task [xymond]
2016-01-30 12:19:58.062318 Task xymond terminated, status 1
2016-01-30 12:20:03 xymonlaunch: starting task [xymond]
2016-01-30 12:20:03.607835 Task xymond terminated, status 1
2016-01-30 12:20:08 xymonlaunch: starting task [xymond]
2016-01-30 12:20:09.154910 Task xymond terminated, status 1
2016-01-30 12:20:14 xymonlaunch: starting task [xymond]
2016-01-30 12:20:14.702550 Task xymond terminated, status 1
2016-01-30 12:20:19 xymonlaunch: starting task [xymond]
2016-01-30 12:20:20.247913 Task xymond terminated, status 1
2016-01-30 12:20:20.247966 Postponing restart of [xymond] for 600 seconds from last start due to multiple failures
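
The pattern in the log above (one "terminated by signal 6" followed by a run of "terminated, status 1" exits and then a 600-second hold) can be tallied per day to see how far back it goes. Below is a minimal Python sketch, assuming the log lines keep the format shown in the snippet. The function name `summarize_terminations` is hypothetical and not part of Xymon. On Linux, signal 6 is SIGABRT and signal 15 is SIGTERM.

```python
import re
from collections import Counter

# Matches xymonlaunch lines of the two forms seen in the snippet:
#   2016-01-30 12:08:05.802277 Task xymond terminated by signal 6
#   2016-01-30 12:08:06.348868 Task xymond terminated, status 1
TERMINATED = re.compile(
    r"^(?P<day>\d{4}-\d{2}-\d{2}) \S+ Task (?P<task>\S+) terminated"
    r"(?: by signal (?P<signal>\d+)|, status (?P<status>\d+))"
)


def summarize_terminations(lines):
    """Count task terminations, keyed by (day, task, reason)."""
    counts = Counter()
    for line in lines:
        match = TERMINATED.match(line)
        if match is None:
            continue  # not a termination line (e.g. "starting task [...]")
        if match.group("signal") is not None:
            reason = "signal " + match.group("signal")
        else:
            reason = "status " + match.group("status")
        counts[(match.group("day"), match.group("task"), reason)] += 1
    return counts
```

Passing it the lines of xymonlaunch.log (an open file object works, with the path depending on the installation) returns one count per date, task, and exit reason, which makes it easier to compare the November occurrences with today's.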

--
Matt Vander Werf

On Sat, Jan 30, 2016 at 11:28 AM, Matt Vander Werf <matt1299 at gmail.com>
wrote:

> As a followup, xymond seems to try and start itself up again after a while
> (probably because xymonlaunch is still running) and goes for a short while
> working just fine and then just crashes again with the same messages and
> results.
>
> --
> Matt Vander Werf
>
> On Sat, Jan 30, 2016 at 11:21 AM, Matt Vander Werf <matt1299 at gmail.com>
> wrote:
>
>> Hello,
>>
>> I'm having a major issue with xymond crashing shortly after the service
>> starts.
>>
>> I'm using the latest Terabithia RPM for RHEL 7
>> (4.3.24-3.el7.terabithia).
>>
>> When I check the status of the xymon service, it shows it as up but with
>> only the xymonlaunch parent process and vmstat processes. Upon restarting
>> the service, I see it start normally (all the normal channel processes,
>> etc.) and then after a while they all go away, leaving the following
>> process behind:
>>
>>            ├─2760 xymon-signal 0.0.0.0 status+1d/group:signal <server hostname>.xymond red (Check time of report) - xymond program crashed Fatal signal caught!
>>
>> along with the xymonlaunch process and some vmstat processes. After a
>> while that process goes away. Sometimes a single xymond_rrd will show up
>> alongside the xymonlaunch and vmstat processes as well after a little while.
>>
>> I'm already running xymond in --debug mode.
>>
>> This is what I see in the xymond log around the time of the crash:
>>
>> 2773 2016-01-30 11:02:32.515505 Status: Host=<host>, test=ntp
>> 2773 2016-01-30 11:02:32.515507  -- create_hostlist_t for <host> (<client IP address>)
>> 2773 2016-01-30 11:02:32.515513 Status: Host=<host>, test=conn
>> 2773 2016-01-30 11:02:32.515520 Status: Host=<host>, test=raid
>> 2773 2016-01-30 11:02:32.515529 Status: Host=<host>, test=memory
>> 2773 2016-01-30 11:02:32.515534 Status: Host=<host>, test=files
>> 2773 2016-01-30 11:02:32.515670 Status: Host=<host>, test=procs
>> 2773 2016-01-30 11:02:32.515879 Status: Host=<host>, test=inode
>> 2773 2016-01-30 11:02:32.515891 Status: Host=<host>, test=disk
>> 2773 2016-01-30 11:02:32.516004 Status: Host=<host>, test=cpu
>> 2773 2016-01-30 11:02:32.516605 Loaded 14419 status logs
>> 2016-01-30 11:02:32 Setting up network listener on 0.0.0.0:1984
>> 2016-01-30 11:02:32.516677 Cannot bind to listen socket (Address already in use)
>> 2016-01-30 11:02:59.538906 Whoops ! Failed to send message (Timeout)
>> 2016-01-30 11:02:59.539020 ->
>> 2016-01-30 11:02:59.539023 ->  Recipient '<server IP address>', timeout 50
>> 2016-01-30 11:02:59.539024 ->  1st line: 'status+1d/group:signal <server hostname>.xymond red (Check time of report) - xymond program crashed'
>>
>> It seems to get finished with loading all the hosts and then it crashes
>> (the last host before it crashes is the last client I have alphabetically).
>>
>> I've tried stopping the service, killing off any remaining xymon-owned
>> processes, and starting the service again, with the same results. I've
>> also tried rebooting the xymon server machine itself, and the same crash
>> happens the first time the service starts.
>>
>> This just started happening out of the blue a couple of hours ago...
>>
>> Looking in netstat, there are no active connections using port 1984 on
>> the local side, just a bunch of clients trying to connect to the server
>> with 1984 in the foreign address.
>>
>> ANY help would be much appreciated as currently our Xymon server is not
>> working!!
>>
>> Thanks!!
>>
>> --
>> Matt Vander Werf
>>
>
>
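
The quoted xymond log reports "Cannot bind to listen socket (Address already in use)" for 0.0.0.0:1984, while netstat shows nothing obvious holding the port. One way to double-check from the server itself is to attempt the same bind while xymond is down. This is a rough Python sketch, not how xymond does it: the helper name `can_bind` is hypothetical, and whether xymond sets SO_REUSEADDR on its listener is an assumption, so the option is left as a parameter.

```python
import errno
import socket


def can_bind(host="0.0.0.0", port=1984, reuse_addr=True):
    """Try to bind a TCP socket; return (True, None) or (False, errno name)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if reuse_addr:
            # Assumption: mirrors a typical daemon listener setup.
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))
        return True, None
    except OSError as exc:
        # Translate the numeric errno (e.g. EADDRINUSE) into its symbolic name.
        return False, errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        sock.close()
```

Calling `can_bind()` as the xymon user right after a failed start, once with `reuse_addr=True` and once with `reuse_addr=False`, would show whether the address is still taken at that moment. A failure only with `reuse_addr=False` would point toward lingering connections rather than a second listener.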