[Xymon] xymonnet - fatal signal caught

Jeremy Laidman jlaidman at rebel-it.com.au
Wed Oct 15 04:00:17 CEST 2014


On 15 October 2014 10:27, Mark Felder <feld at feld.me> wrote:

> I see that this has come up several times before. From a 2007 thread
> Henrik mentioned that this can happen randomly and was unsure why. Do we
> have any better ideas how to debug this problem? Would be nice to be
> able to come up with a permanent solution for everyone so this doesn't
> happen and catch people off guard.


Being a random problem makes it difficult to track down the fault, but if
we have a willing participant that can reproduce the fault, we might be
able to make progress.  From reading the code, it looks like the ARES
library is resolving with success, but when xymonnet is copying the
resolved address into its own data structure, it fails to copy.  The
troublesome line 120 is:

memcpy(&dnsc->addr, *(hent->h_addr_list), sizeof(dnsc->addr));

(From my poor knowledge of C) some problems that can arise here are:
a) dnsc->addr or dnsc is null
b) hent->h_addr_list or hent is null
c) dnsc->addr is larger than the address that hent->h_addr_list points to
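
To make cases (a) to (c) concrete, here is a rough sketch of a guarded copy that tests each condition before calling memcpy. It is not the xymonnet code: struct dns_entry and copy_first_addr() are names I made up for illustration, and I'm assuming the cache entry holds a struct in_addr.

```c
#include <netdb.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for xymonnet's DNS cache entry (assumed layout). */
struct dns_entry {
    const char *name;
    struct in_addr addr;
};

/*
 * Copy the first resolved address into the cache entry.
 * Returns 0 on success, or a negative code naming the check that failed.
 */
static int copy_first_addr(struct dns_entry *dnsc, const struct hostent *hent)
{
    if (dnsc == NULL)
        return -1;                      /* case (a) */
    if (hent == NULL || hent->h_addr_list == NULL)
        return -2;                      /* case (b) */
    if (hent->h_addr_list[0] == NULL)
        return -3;                      /* list exists but holds no address */
    if (hent->h_length < 0 || (size_t)hent->h_length < sizeof(dnsc->addr))
        return -4;                      /* case (c): source shorter than destination */

    memcpy(&dnsc->addr, hent->h_addr_list[0], sizeof(dnsc->addr));
    return 0;
}
```

A non-zero return code would tell us which of the conditions was hit, instead of the process dying on the memcpy.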

Perhaps we need to see the values of these.  Wallace, can you recompile
after inserting these lines immediately before line 120:

dbgprintf("ARES host=%s\n", hent->h_name);
dbgprintf("ARES status=%d name=%s\n", status, dnsc->name);
dbgprintf("ARES addr size=%d\n", (int)sizeof(dnsc->addr));
dbgprintf("ARES addr hex=%#lx\n", (unsigned long)dnsc->addr.s_addr);
dbgprintf("ARES addr ascii=%s\n", inet_ntoa(dnsc->addr));
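
It may also help to look at what the resolver handed back, not just our own structure. The following is only a sketch under my own assumptions: describe_hostent() is a hypothetical helper, not something that exists in Xymon, and it formats the main hostent fields into a buffer while tolerating null pointers, so the debug output itself cannot crash on a null hent.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical helper: write a one-line summary of a hostent into buf. */
static int describe_hostent(const struct hostent *hent, char *buf, size_t buflen)
{
    char ip[INET6_ADDRSTRLEN] = "(none)";

    if (hent == NULL)
        return snprintf(buf, buflen, "hent=NULL");

    /* Only format the first address if the list and its first slot exist. */
    if (hent->h_addr_list != NULL && hent->h_addr_list[0] != NULL) {
        if (inet_ntop(hent->h_addrtype, hent->h_addr_list[0], ip, sizeof(ip)) == NULL)
            snprintf(ip, sizeof(ip), "(unprintable)");
    }

    return snprintf(buf, buflen, "name=%s type=%d length=%d first=%s",
                    hent->h_name ? hent->h_name : "(null)",
                    hent->h_addrtype, hent->h_length, ip);
}
```

The resulting string could then be passed to dbgprintf with a plain "%s" format.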

Assuming this compiles correctly for you (it did for me), back up the old
xymonnet, and copy the newly compiled one into place.  Then wait for a core
dump, and see what's in the logs.

Warning: This might break your monitoring, so you might not want to use
this on a production system, depending on your stability requirements.

Alternatively, you might see if you can reproduce the problem by running
the xymonnet binary manually, something like this:

xymonnet --debug --no-update name.of.server

If this dumps core, then you should be able to manually run the new binary
in the same way, and check the log output for our debug statements.

J