[Xymon] HPE BL460c Gen9 SNMP tests going clear

Jeremy Laidman jeremy at laidman.org
Wed Mar 13 11:38:07 CET 2019


Colin

Are all your different servers losing numbers at around the same time? The
fact that both Windows and Linux systems (with probably different snmpd
code) are having the same issues suggests that it's a devmon problem.

It's been a while since I did any devmon stuff (mine has been running for
years but a low node/OID count seems to keep it humming along), but I seem
to recall there was some issue with 32-bit vs 64-bit SNMP request/response
IDs. What can happen is that a negative 32-bit request ID gets turned into
a positive and large 64-bit response ID that is actually the same, but
parsed as different by the SNMP library used by devmon. This might be a
problem only where there is a 32-bit SNMP library installed on a 64-bit OS,
or that might just be a red herring.
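To illustrate the arithmetic only (this is not devmon or SNMP_Session code, and the helper names are made up for the example), here is a minimal Python sketch of how the same 32 bits can read as a negative number on one side and a large positive number on the other, so that a naive comparison fails:

```python
def as_unsigned32(value: int) -> int:
    """Reinterpret an integer's low 32 bits as an unsigned value."""
    return value & 0xFFFFFFFF


def as_signed32(value: int) -> int:
    """Reinterpret an integer's low 32 bits as a signed two's-complement value."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


# Hypothetical request ID chosen by a sender that uses signed 32-bit IDs.
request_id = -1234567

# The same bit pattern, as a receiver might decode it without sign extension.
response_id = as_unsigned32(request_id)

print("request ID: ", request_id)
print("response ID:", response_id)
print("naive match:", request_id == response_id)
print("match after normalising:", as_signed32(response_id) == request_id)
```

A positive request ID survives the round trip unchanged, which fits with only the negative ones going unanswered.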

If this is a problem, a "tcpdump" with "-v" will show request/response IDs,
and you can confirm if you only get responses for positive request IDs.

There's an SNMP_Session API variable you can set in the devmon script to
avoid negative request IDs. See
https://lists.oetiker.ch/pipermail/mrtg-developers/2002-September/000103.html
and
https://xymon.xymon.narkive.com/6ltXE2nB/devmon-tests-clear-but-snmpwalk-works#post8
.

Another issue might be the use of SNMPBulkGet, which results in lots of
packets in the response, all fragments of a large datagram. One way to
disable Bulk requests is to set SNMP version (snmpver) to 1 in your
devmon/templates/<devtype>/specs file. Most templates have "snmpver : 2".
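To see which templates would need changing, something like the following Python sketch could list the snmpver value in each specs file. It assumes the layout mentioned above (one specs file per <devtype> directory) and a "snmpver : N" line format; the function name is made up, and anything that doesn't match is reported as None rather than guessed at.

```python
import re
from pathlib import Path

# Assumed line format, based on the "snmpver : 2" example above.
SNMPVER_RE = re.compile(r"^\s*snmpver\s*:\s*(\d+)\s*$")


def find_snmpver(templates_dir):
    """Return {devtype: snmpver or None} for each <templates_dir>/<devtype>/specs."""
    result = {}
    for specs in sorted(Path(templates_dir).glob("*/specs")):
        version = None
        for line in specs.read_text(errors="replace").splitlines():
            match = SNMPVER_RE.match(line)
            if match:
                version = int(match.group(1))
                break
        result[specs.parent.name] = version
    return result
```

Calling find_snmpver("devmon/templates") (adjust the path to wherever your templates live) returns a dictionary keyed by template directory name, so you can pick out the ones still set to 2.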

The last time I tried to add a bunch of hosts to devmon, it failed, not
only for the hosts I added, but also for the hosts that were there before
the addition. I ended up reverse-engineering the as-yet-undocumented Xymon
SNMP features, and it seems to work well enough. I do get no response from
some hosts from time to time, but it's better than nothing.

In case you wanted to go down this path, it's just a matter of adding
entries into /etc/xymon/snmphosts.cfg like this:

[hostname.example.net]
  version=2
  community=secret
  ip=192.0.2.19
  systemmib
  ifmib=(*)
  icmpmib
  hrsystem
  hrstorage=(*)
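If you have many hosts to add, a stanza like the one above could be generated rather than typed. This is a minimal Python sketch that only reproduces the layout of that example; since the Xymon SNMP features are undocumented, the function name, its parameters and the default MIB list are assumptions taken from the example, not from any Xymon reference.

```python
# Default entries copied from the example stanza above.
DEFAULT_MIBS = ("systemmib", "ifmib=(*)", "icmpmib", "hrsystem", "hrstorage=(*)")


def snmphosts_stanza(hostname, ip, community, version=2, mibs=DEFAULT_MIBS):
    """Build one snmphosts.cfg stanza in the same layout as the example."""
    lines = [
        f"[{hostname}]",
        f"  version={version}",
        f"  community={community}",
        f"  ip={ip}",
    ]
    lines.extend(f"  {mib}" for mib in mibs)
    return "\n".join(lines) + "\n"


print(snmphosts_stanza("hostname.example.net", "192.0.2.19", "secret"))
```

The output can be appended to /etc/xymon/snmphosts.cfg after checking it by eye.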

The entries after the "ip=" line are references to stanzas in the
/etc/xymon/snmpmibs.cfg file. There's special black magic in the
snmpmibs.cfg file that I haven't worked out, and it probably needs code
inspection to understand, so I haven't dared to touch this file, with the
exception of one apparent typo; the diff to fix it is here:

 [hrstorage]
 # storage has data for both memory- and disk-storage
        keyidx (HOST-RESOURCES-MIB::hrStorageDescr)
-       keyidx [(HOST-RESOURCES-MIB::hrStorageType]
+       keyidx [HOST-RESOURCES-MIB::hrStorageType]
        Type = HOST-RESOURCES-MIB::hrStorageType
        Description = HOST-RESOURCES-MIB::hrStorageDescr
        Units = HOST-RESOURCES-MIB::hrStorageAllocationUnits /rrd:GAUGE

Hmm, perhaps I should try setting "version=1" and see if that helps my
occasional missing data...

J


On Wed, 13 Mar 2019 at 17:12, Colin Coe <colin.coe at gmail.com> wrote:

> Hi Bruce and thanks for the quick response.
>
> These Gen9's are a mix of RHEL6 and Windows 2016, and interestingly, both
> behave the same.  The RHEL boxen are running 'net-snmp' and for Windows
> 2016 it's the stock Windows SNMP service.  We don't do SNMP traps.  These
> machines also have the HP Service Pack for Proliant installed.  The RHEL
> config can be summarized as:
> ---
> cat /etc/snmp/snmpd.conf
> dlmod cmaX /usr/lib64/libcmaX64.so
> rocommunity secret
> ---
>
> Thanks again
>
> On Wed, Mar 13, 2019 at 2:02 PM Bruce Ferrell <bferrell at baywinds.org>
> wrote:
>
>>
>> On 3/12/19 9:58 PM, Colin Coe wrote:
>> > Hi all
>> >
>> > All of our Gen9 servers are flicking between green and clear for Devmon
>> SNMP tests.  (Posting here as Devmon seems a dead project.)
>> >
>> > The attached screen region grab shows what I'm trying to say.
>> >
>> > I've updated the firmware (including iLO) on one server and it made no
>> difference.
>> >
>> > Anyone else seen this?
>> >
>> > Thanks
>> >
>> > CC
>> >
>> > _______________________________________________
>> > Xymon mailing list
>> > Xymon at xymon.com
>> > http://lists.xymon.com/mailman/listinfo/xymon
>> I see a lot of odd things with devmon. I have seen an issue something
>> like this on my Dell R710 with OMSA.  It patches itself into snmpd via a
>> loaded library to report on Dell storage
>> and sometimes the response from that gets really slow and blocks
>> response to devmon.  An update a while back to OMSA cleared that up.
>>
>> I also have to run a cron job that does a kill -9 on it every two hours
>> and then restart it... It likes to gobble memory like it's going out of
>> style.
>>
>> People DO respond on the devmon list, just a little slowly.
>>
>> I run my devmon in multi node mode with MySQL backing.  When I do see
>> gross errors like this I turn on debug and watch the devmon log to see what
>> errors are thrown.
>>
>> What OS and SNMPD is running on the HP?  Especially the SNMPD... It
>> matters.  OS X and pfSense don't report some OIDs correctly due to the
>> snmpd they run.  It's one of those
>> "snmpd/ya just gotta know about it" things.
>>
>>