[Xymon] SPLITNCV with wildcard because of multiple graphs from one test

Oliver R. r.oliver at web.de
Thu Oct 1 13:56:54 CEST 2020


On 01.10.20 at 12:33, damien at makelofine.org wrote:
> On 2020-10-01 11:33, Oliver R. wrote:
>> I've tested even more combinations trying to escape the asterisk with no
>> success:
>>
>> SPLITNCV_nvme="temperature:GAUGE"
>> SPLITNCV_nvme="*temperature:GAUGE"
>> SPLITNCV_nvme="*temperature*:GAUGE"
>> SPLITNCV_nvme="\*temperature:GAUGE"
>> SPLITNCV_nvme="@RRDIDX@_temperature:GAUGE"
>> SPLITNCV_nvme="p@RRDIDX@_temperature:GAUGE"
>> SPLITNCV_nvme=".*temperature:GAUGE"
>> SPLITNCV_nvme="%.*temperature:GAUGE"
>> SPLITNCV_nvme="(.*)temperature:GAUGE"
>> SPLITNCV_nvme="\.\*temperature:GAUGE"
>> SPLITNCV_nvme="\\*temperature:GAUGE"
>> SPLITNCV_nvme="\\\*temperature:GAUGE"
>> SPLITNCV_nvme="\\\\*temperature:GAUGE"
>> SPLITNCV_nvme="\\\\\*temperature:GAUGE"
>> SPLITNCV_nvme="\\\\\\*temperature:GAUGE"
>> SPLITNCV_nvme="\\\\\\\\*temperature:GAUGE"
>> SPLITNCV_nvme="\\.\\*temperature:GAUGE"
>>
>> On the Xymon side, I think this setting is processed in "do_ncv.c",
>> but I still could not find where the string gets parsed.
>>
>> Any help is appreciated.
>>
>> Regards
>>
>> Oliver
>>
>>
>> On 29.09.20 at 16:24, Oliver R. wrote:
>>> Dear All,
>>>
>>> I have a custom test that reports data from nvme disks from the command:
>>>
>>> nvme smart-log /dev/nvme2n1 -o json
>>>
>>> The output is processed so that the client reports the following to the
>>> server:
>>>
>>> S3ESNX0J951626N-critical_warning : 0
>>> S3ESNX0J951626N-temperature : 318
>>> S3ESNX0J951626N-avail_spare : 100
>>> S3ESNX0J951626N-spare_thresh : 10
>>> S3ESNX0J951626N-percent_used : 92
>>> S3ESNX0J951626N-data_units_read : 27832088
>>> S3ESNX0J951626N-data_units_written : 93877408
>>> S3ESNX0J951626N-host_read_commands : 180442558
>>> S3ESNX0J951626N-host_write_commands : 916278700
>>> S3ESNX0J951626N-controller_busy_time : 4028
>>> S3ESNX0J951626N-power_cycles : 218
>>> S3ESNX0J951626N-power_on_hours : 2995
>>> S3ESNX0J951626N-unsafe_shutdowns : 86
>>> S3ESNX0J951626N-media_errors : 0
>>> S3ESNX0J951626N-num_err_log_entries : 0
>>> S3ESNX0J951626N-warning_temp_time : 0
>>> S3ESNX0J951626N-critical_comp_time : 0
>>> S3ESNX0J951626N-temperature_sensor_1 : 318
>>> S3ESNX0J951626N-temperature_sensor_2 : 323
>>> S3ESNX0J951626N-thm_temp1_trans_count : 0
>>> S3ESNX0J951626N-thm_temp2_trans_count : 0
>>> S3ESNX0J951626N-thm_temp1_total_time : 0
>>> S3ESNX0J951626N-thm_temp2_total_time : 0
>>> 2J4520102682-critical_warning : 0
>>> 2J4520102682-temperature : 314
>>> 2J4520102682-avail_spare : 100
>>> 2J4520102682-spare_thresh : 32
>>> 2J4520102682-percent_used : 2
>>> 2J4520102682-data_units_read : 9450966
>>> 2J4520102682-data_units_written : 24105094
>>> 2J4520102682-host_read_commands : 137338588
>>> 2J4520102682-host_write_commands : 284702582
>>> 2J4520102682-controller_busy_time : 0
>>> 2J4520102682-power_cycles : 46
>>> 2J4520102682-power_on_hours : 1107
>>> 2J4520102682-unsafe_shutdowns : 10
>>> 2J4520102682-media_errors : 0
>>> 2J4520102682-num_err_log_entries : 0
>>> 2J4520102682-warning_temp_time : 0
>>> 2J4520102682-critical_comp_time : 0
>>> 2J4520102682-thm_temp1_trans_count : 0
>>> 2J4520102682-thm_temp2_trans_count : 0
>>> 2J4520102682-thm_temp1_total_time : 0
>>> 2J4520102682-thm_temp2_total_time : 0
>>> S3ESNX0J951635M-critical_warning : 0
>>> S3ESNX0J951635M-temperature : 310
>>> S3ESNX0J951635M-avail_spare : 100
>>> S3ESNX0J951635M-spare_thresh : 10
>>> S3ESNX0J951635M-percent_used : 92
>>> S3ESNX0J951635M-data_units_read : 32693378
>>> S3ESNX0J951635M-data_units_written : 95742837
>>> S3ESNX0J951635M-host_read_commands : 213266959
>>> S3ESNX0J951635M-host_write_commands : 918085461
>>> S3ESNX0J951635M-controller_busy_time : 4280
>>> S3ESNX0J951635M-power_cycles : 218
>>> S3ESNX0J951635M-power_on_hours : 3072
>>> S3ESNX0J951635M-unsafe_shutdowns : 86
>>> S3ESNX0J951635M-media_errors : 0
>>> S3ESNX0J951635M-num_err_log_entries : 1
>>> S3ESNX0J951635M-warning_temp_time : 0
>>> S3ESNX0J951635M-critical_comp_time : 0
>>> S3ESNX0J951635M-temperature_sensor_1 : 310
>>> S3ESNX0J951635M-temperature_sensor_2 : 320
>>> S3ESNX0J951635M-thm_temp1_trans_count : 0
>>> S3ESNX0J951635M-thm_temp2_trans_count : 0
>>> S3ESNX0J951635M-thm_temp1_total_time : 0
>>> S3ESNX0J951635M-thm_temp2_total_time : 0
>>>
>>> As you can see, there are three disks that report the same metrics
>>> with different values. I started with a xymonserver.d/nvme.cfg that
>>> looks like this:
>>>
>>> TEST2RRD="$TEST2RRD,nvme=ncv"
>>> SPLITNCV_nvme="*:GAUGE"
>>> GRAPHS_nvme="nvmecriticalwarning,nvmetemperature,nvmeavailspare,nvmesparethresh,nvmepercentused,nvmedataunitsread,nvmedataunitswritten,nvmehostreadcommands,nvmehostwritecommands,nvmecontrollerbusytime,nvmepowercycles,nvmepoweronhours,nvmeunsafeshutdowns,nvmemediaerrors,nvmenumerrlogentries,nvmewarningtemptime,nvmecriticalcomptime,nvmetemperaturesensor1,nvmetemperaturesensor2,nvmethmtemp1transcount,nvmethmtemp2transcount,nvmethmtemp1totaltime,nvmethmtemp2totaltime"
>>>
>>>
>>>
>>> This causes all RRD files to be created correctly, like this:
>>>
>>> $ ls -1 /var/lib/xymon/rrd/wsrbreb/nvme*temperature.rrd
>>> /var/lib/xymon/rrd/wsrbreb/nvme,2J4520102682_temperature.rrd
>>> /var/lib/xymon/rrd/wsrbreb/nvme,S3ESNX0J951626N_temperature.rrd
>>> /var/lib/xymon/rrd/wsrbreb/nvme,S3ESNX0J951635M_temperature.rrd
>>>
>>> Unfortunately, all datasets are now saved with the datatype "GAUGE",
>>> whereas most of them need "DERIVE", and I cannot figure out how to
>>> configure that. Here is what I've tried:
>>>
>>> SPLITNCV_nvme="temperture:GAUGE"
>>> SPLITNCV_nvme="*temperture:GAUGE"
>>> SPLITNCV_nvme=".*temperture:GAUGE"
>>> SPLITNCV_nvme="%.*temperture:GAUGE"
>>>
>>>
>>> One thing that does work is the following:
>>>
>>> SPLITNCV_nvme="S3ESNX0J951626N_temperature:GAUGE"
>>>
>>> But as you can see, it requires the name of the SSD, which defeats
>>> the purpose of a dynamic configuration.
>>>
>>> How can I work with wildcards in SPLITNCV scenarios?
>>>
>>> Thank you for your help!
>>>
>>> Regards
>
>
> Hello Oliver,
>
> On my side, I'm using the following entry:
> SPLITNCV_postfix="*:GAUGE"
>
>
> This config creates the following files:
> postfix,Corrupt_Mails.rrd
> postfix,Mails_active.rrd
> postfix,Mails_in_deferred_State.rrd
> postfix,Incoming_Mails.rrd
> postfix,Mails_bouncing.rrd
>
>
> In general, SPLITNCV creates files based on the test name (1st param)
> and the item name (2nd param), with the following name structure:
> $test$,$item$.rrd
>
> Then you can work dynamically from these values.

Thank you for the reply! The core problem is that the item name has a
dynamic part at the beginning (the serial number of the NVMe disk) and a
static part at the end, such as temperature or data_units_read. As
mentioned earlier, my files look like this:

nvme,2J4520102682_avail_spare.rrd
nvme,2J4520102682_temperature.rrd
nvme,2J4520102682_percent_used.rrd
nvme,S3ESNX0J951626N_avail_spare.rrd
nvme,S3ESNX0J951626N_temperature.rrd
nvme,S3ESNX0J951626N_percent_used.rrd
nvme,S3ESNX0J951635M_temperature.rrd
nvme,S3ESNX0J951635M_avail_spare.rrd
nvme,S3ESNX0J951635M_percent_used.rrd
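
To make the naming concrete, here is a minimal Python sketch of how a
reported item name appears to map to its RRD file. The function name
rrd_filename is hypothetical, and the rule that every character other than
letters, digits and underscores becomes an underscore is only an assumption
that is consistent with the file names above; I have not confirmed it in
the Xymon source.

```python
import re


def rrd_filename(test: str, item: str) -> str:
    """Guess the RRD file name for one reported item (hypothetical helper)."""
    # Assumption: anything that is not a letter, digit or underscore
    # is replaced by an underscore, as seen with "-" in the files above.
    safe_item = re.sub(r"[^A-Za-z0-9_]", "_", item.strip())
    return f"{test},{safe_item}.rrd"


if __name__ == "__main__":
    # Item name exactly as the client reports it (dash between serial and metric).
    print(rrd_filename("nvme", "S3ESNX0J951626N-temperature"))
```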

Your suggestion of using "*:GAUGE" makes all metrics GAUGE, but that
does not work for metrics like "data_units_read", whose value only ever
increases and which therefore need DERIVE.

So the question is how to use wildcards in SPLITNCV. The asterisk on its
own has the special meaning "use GAUGE/NONE/... for everything that does
not match another entry", but how can I use a regex or some other kind of
dynamic naming in the SPLITNCV_ variable?
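
Until there is a real wildcard, one possible workaround is to generate the
explicit entries, since a full name such as
"S3ESNX0J951626N_temperature:GAUGE" does work. The Python sketch below is
only an illustration under assumptions: build_splitncv and resolve_type are
hypothetical helpers, the comma-separated "name:TYPE" layout and the rule
that an exact name wins over "*" are my reading of the behaviour described
above, and the choice of which metrics get DERIVE is mine.

```python
def build_splitncv(serials, derive_metrics, default_type="GAUGE"):
    """Build an explicit SPLITNCV value: one DERIVE entry per disk and metric."""
    entries = [
        f"{serial}_{metric}:DERIVE"
        for serial in serials
        for metric in derive_metrics
    ]
    # The lone asterisk is kept as the catch-all for every other dataset.
    entries.append(f"*:{default_type}")
    return ",".join(entries)


def resolve_type(setting, dataset):
    """Assumed lookup: an exact name first, then the "*" default."""
    table = dict(entry.rsplit(":", 1) for entry in setting.split(","))
    return table.get(dataset, table.get("*"))


if __name__ == "__main__":
    serials = ["S3ESNX0J951626N", "2J4520102682", "S3ESNX0J951635M"]
    # Assumption: these two counters are examples of ever-increasing metrics.
    derive_metrics = ["data_units_read", "data_units_written"]
    setting = build_splitncv(serials, derive_metrics)
    print(f'SPLITNCV_nvme="{setting}"')
    print(resolve_type(setting, "2J4520102682_data_units_read"))
    print(resolve_type(setting, "2J4520102682_temperature"))
```

The drawback is obvious: the generated line has to be refreshed whenever a
disk is added or replaced, so this only moves the dynamic part into a
script.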

Regards

Oliver

