<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body dir="auto"><div>Possibly worth chiming in here that I use the holidays list, in case anyone is thinking of "simplifying" the code. :-)<br><br>--<br><span style="background-color: rgba(255, 255, 255, 0);">____ *Note: UMDNJ is now Rutgers-Biomedical and Health Sciences*<br>|| \\UTGERS      |---------------------*O*---------------------<br>||_// Biomedical | Ryan Novosielski - Senior Technologist<br>|| \\ and Health | <a href="mailto:novosirj@rutgers.edu" x-apple-data-detectors="true" x-apple-data-detectors-type="link" x-apple-data-detectors-result="3">novosirj@rutgers.edu</a>- 973/972.0922 (2x0922)<br>||  \\  Sciences | OIRT/High Perf &amp; Res Comp - MSB C630, Newark<br>    `'</span></div><div><br>On Feb 2, 2016, at 09:42, J.C. Cleaver &lt;<a href="mailto:cleaver@terabithia.org">cleaver@terabithia.org</a>&gt; wrote:<br><br></div><blockquote type="cite"><div><span>On Mon, February 1, 2016 4:59 pm, John Thurston wrote:</span><br><blockquote type="cite"><span>On 2/1/2016 2:41 PM, J.C. 
Cleaver wrote:</span><br></blockquote><blockquote type="cite"><blockquote type="cite"><span>Hi,</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span></span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>Actually, I think I must have missed your final response on this at</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span><a href="http://lists.xymon.com/pipermail/xymon/2015-December/042787.html">http://lists.xymon.com/pipermail/xymon/2015-December/042787.html</a> ; my</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>apologies.</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span></span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>On what's happening, I think this might be a side-effect of</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span><a href="https://sourceforge.net/p/xymon/code/7651/">https://sourceforge.net/p/xymon/code/7651/</a> , which added a dummy record</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>for the purposes of command-line --test functionality when the host</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>doesn't exist. For an incoming unknown host (from xymond_alert's</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>perspective), the same path is being executed.</span><br></blockquote></blockquote><blockquote type="cite"><span></span><br></blockquote><blockquote type="cite"><span>I've applied the patch to my non-production server and performed my</span><br></blockquote><blockquote type="cite"><span>failure-reproduction steps. The behavior is certainly better. 
The alert</span><br></blockquote><blockquote type="cite"><span>process is no longer tanking for every message received :)</span><br></blockquote><blockquote type="cite"><span></span><br></blockquote><blockquote type="cite"><span>What I do get, for a newly added host, is "Checking criteria for host</span><br></blockquote><blockquote type="cite"><span>'<a href="http://foo.bar.com">foo.bar.com</a>', which is not defined. Will not alert until hostlist</span><br></blockquote><blockquote type="cite"><span>reload."  This happens following all subsequent runs of xymonnet.</span><br></blockquote><blockquote type="cite"><span></span><br></blockquote><blockquote type="cite"><span>Is there anything which will trigger a hostlist reload?</span><br></blockquote><blockquote type="cite"><span></span><br></blockquote><blockquote type="cite"><span>Is there a tidy way to manually reload the list?</span><br></blockquote><blockquote type="cite"><span></span><br></blockquote><blockquote type="cite"><span>It doesn't seem to happen until I kill the "xymond_channel</span><br></blockquote><blockquote type="cite"><span>--channel=page" process. This seems like a hamfisted thing to do after</span><br></blockquote><blockquote type="cite"><span>every edit of hosts.cfg :(</span><br></blockquote><blockquote type="cite"><span></span><br></blockquote><blockquote type="cite"><span>Related question:</span><br></blockquote><blockquote type="cite"><span></span><br></blockquote><blockquote type="cite"><span>If this is in main code, and not some odd-ball null/EOF/posix problem</span><br></blockquote><blockquote type="cite"><span>(as has often tripped up my Solaris systems in the recent past), why am</span><br></blockquote><blockquote type="cite"><span>I the only one seeing this failure? 
Why aren't the folks running Linux</span><br></blockquote><blockquote type="cite"><span>having their alerts fail?</span><br></blockquote><blockquote type="cite"><span></span><br></blockquote><span></span><br><span></span><br><span></span><br><span>This one took me quite a while to figure out, mainly because I was looking</span><br><span>at the wrong code base for a while.</span><br><span></span><br><span>It turns out the host info record here is *only* used for display groups</span><br><span>and holiday lookups (probably rarely used), within the context of</span><br><span>alerting. In all other cases, it not being in the hostlist doesn't impact</span><br><span>the application of alert rules, since all the needed info is coming in via</span><br><span>the '@@page' message itself. The patch should be updated to let those come</span><br><span>straight through instead of exiting out if it doesn't see it.</span><br><span></span><br><span></span><br><span>My confusion came from a different issue: xymond_alert actually never</span><br><span>reloads the hosts config at all! I found/fixed this back in Sept '14 in</span><br><span>the RPMs but it wasn't applied into 4.3 back then.</span><br><span></span><br><span>I'd been living with that code for so long I forgot that that reload</span><br><span>wasn't needed here -- and, obviously, alerts have been working *in</span><br><span>general*... (We only noticed the lack of reload because we were dependent</span><br><span>on a dynamic value in the hosts.cfg line coming through to the alert</span><br><span>script via XMH_RAW in updated form.)</span><br><span></span><br><span>xymond_alert reloading was put into 4.4 at</span><br><span><a href="https://sourceforge.net/p/xymon/code/7776/">https://sourceforge.net/p/xymon/code/7776/</a> among the patch bursts, but the</span><br><span>live host add issue has probably been in since this release. There are a</span><br><span>few takeaways from this... 
but this needs to be fixed in 4.3 (among</span><br><span>several other incoming issues that are pending confirmation).</span><br><span></span><br><span></span><br><span>Can you please check the two included patches? One is an update for the</span><br><span>previous one, which passes the alert check through (only adding the dummy</span><br><span>record in --test mode to begin with); the other adds hosts.cfg reloading</span><br><span>on intervals or on demand. It's based on the 4.4 version, but with only a</span><br><span>small change. I'd like to add both, as I can't see any drawback to</span><br><span>reloading hosts.cfg from xymond_alert's perspective, but the first may be</span><br><span>sufficient to get back to the status quo.</span><br><span></span><br><span></span><br><span>Regards,</span><br><span></span><br><span>-jc</span></div></blockquote><blockquote type="cite"><div>&lt;localalertmode-2.patch&gt;</div></blockquote><blockquote type="cite"><div>&lt;reloadalert.patch&gt;</div></blockquote><blockquote type="cite"><div><span>_______________________________________________</span><br><span>Xymon mailing list</span><br><span><a href="mailto:Xymon@xymon.com">Xymon@xymon.com</a></span><br><span><a href="http://lists.xymon.com/mailman/listinfo/xymon">http://lists.xymon.com/mailman/listinfo/xymon</a></span><br></div></blockquote></body></html>