[Xymon] DRBD monitoring
Adam Goryachev
adam at websitemanagers.com.au
Fri Feb 6 03:42:11 CET 2015
I've spent some time using DRBD over the past few years, and have been
gradually improving how Xymon monitors it. I was hoping to share my work
on xymonton, but although I've created an account, I'm not familiar with
DokuWiki or with how to create pages there.
In any case, I'm not finished yet, so I'll post what I have so far, and
perhaps someone else can upload it for me and/or discuss specific
improvements. One known issue is that I am still running "Hobbit"
rather than Xymon, so a number of variable names, filenames, etc. should
probably be updated.
The first tool I use is a perl script, "check-drbd". I think I borrowed
it from a nagios plugin or similar, but I have made significant
modifications to it since then (see attached).
The second tool is a shell script called drbd. It simply calls
check-drbd and then reports the data back to xymon, both as a status
message (to display on screen) and as a data message (to save into
the rrd files).
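
For anyone who does not want to unpack the attachment first, here is a
rough sketch of what the two messages look like on the wire. It is not
the attached shell script; it only assumes the usual Xymon/Hobbit message
layout ("status HOST.TEST COLOUR text" and "data HOST.NAME", with dots in
the host name replaced by commas). The function names and the example
host name are made up for illustration.

```python
def build_status_message(host, test, colour, summary, detail_lines):
    """Build a 'status' message: a header line followed by the detail text."""
    machine = host.replace(".", ",")  # Xymon writes host name dots as commas
    header = f"status {machine}.{test} {colour} {summary}"
    return "\n".join([header, *detail_lines])


def build_data_message(host, name, detail_lines):
    """Build a 'data' message carrying the raw counters for the RRD side."""
    machine = host.replace(".", ",")
    return "\n".join([f"data {machine}.{name}", *detail_lines])


if __name__ == "__main__":
    lines = ["drbd0 ns:59995764 nr:587 dw:59996747 oos:0"]
    print(build_status_message("db1.example.com", "drbd", "green", "DRBD OK", lines))
    print(build_data_message("db1.example.com", "drbd", lines))
```

The resulting text would then be handed to the client's normal send
command; that step is left out here because it depends on the local
install.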
The client also needs a file to actually run the script:
clientlaunch_drbd.cfg
Finally, on the server side, there is the graph definition for the
hobbitgraph.d directory, in the file hobbitgraph_drbd.cfg. This lets you
see graphs of the data over time (very useful).
Some things I would like to change/fix are:
1) Limit each graph to only one drbd device. (There aren't enough
colours, and the graphs are too messy to read anyway.)
2) Be able to graph all drbd devices on one graph, but only one of the
values, e.g. see all dw values on a single graph.
3) Be able to graph the sum of all the values, i.e. similar to 2 (so only
all the dw values at once) but stacked, so I can see each individual
colour as well as the total.
I'm not sure of the best way to make the above happen. I suppose I could
create another dozen graph definitions, but I'm hoping there might be a
better way.
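
One possible way to avoid hand-writing a dozen definitions is to generate
them. The sketch below builds a single stacked section for one counter
across several devices, using the section/TITLE/YAXIS/DEF layout of the
graph definition files and rrdtool's AREA ... :STACK syntax. Everything
specific in it is an assumption: the RRD file names, the data-source
names, the section name and the colour palette are all hypothetical and
would have to match what the server really writes.

```python
# Hypothetical palette; any hex RGB values that rrdtool accepts would do.
PALETTE = ["0000FF", "00CC00", "FF0000", "FF9900", "9900CC", "00CCCC"]


def stacked_graph_section(section, counter, rrd_files, title):
    """Build one graph section that stacks `counter` across devices.

    rrd_files maps a device name (e.g. "drbd0") to its RRD file name.
    Both the file names and the data-source name are caller-supplied guesses.
    """
    lines = [f"[{section}]", f"\tTITLE {title}", f"\tYAXIS {counter}"]
    for index, (device, rrd) in enumerate(rrd_files.items()):
        colour = PALETTE[index % len(PALETTE)]
        lines.append(f"\tDEF:{device}={rrd}:{counter}:AVERAGE")
        # The first series is a plain AREA; later ones stack on top of it.
        stack = "" if index == 0 else ":STACK"
        lines.append(f"\tAREA:{device}#{colour}:{device}{stack}")
    return "\n".join(lines)


if __name__ == "__main__":
    files = {"drbd0": "drbd0.rrd", "drbd1": "drbd1.rrd"}  # made-up names
    print(stacked_graph_section("drbd-dw", "dw", files, "DRBD disk writes"))
```

Calling it once per counter (dw, dr, ns, ...) would cover items 2 and 3;
dropping the ":STACK" suffix and using a line instead of an area gives
the unstacked variant.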
Finally, now that the data is being added to the rrd, when I eventually
upgrade to a current release, I'll be able to do alerts based on the
values (e.g. oos is too high, or basically non-zero, or dw is growing
too fast and users will therefore be seeing slow performance, etc).
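
To make the alert idea concrete, here is a minimal sketch of the decision
itself, kept separate from any Xymon configuration. It assumes two
samples of the counters taken a known number of seconds apart; the
colour mapping and the max_dw_rate limit are hypothetical placeholders,
not values from my setup.

```python
def drbd_colour(previous, current, interval_seconds, max_dw_rate=50000):
    """Return 'red', 'yellow' or 'green' for one device.

    previous/current are dicts of counters sampled interval_seconds apart.
    max_dw_rate is a hypothetical limit on how fast dw may grow per second.
    """
    if current["oos"] > 0:
        return "red"  # any out-of-sync data is treated as an alert
    dw_rate = (current["dw"] - previous["dw"]) / interval_seconds
    if dw_rate > max_dw_rate:
        return "yellow"  # heavy write load, users may notice slowness
    return "green"


if __name__ == "__main__":
    before = {"dw": 59996000, "oos": 0}
    after = {"dw": 59996747, "oos": 0}
    print(drbd_colour(before, after, 300))
```

A counter reset (for example after a reboot) gives a negative rate and
therefore falls through to green, which seems acceptable for a sketch.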
Comments, suggestions, etc, greatly appreciated. If nothing else, I hope
this will be useful to someone else.
Effectively, this will display the data on the page like this (I have 18
drbd devices configured, 0 to 18 skipping 12):
Fri Feb 6 13:27:03 AEDT 2015 DRBD OK 0 [OK], 1 [OK], 2 [OK], 3
[OK], 4 [OK], 5 [OK], 6 [OK], 7 [OK], 8 [OK], 9 [OK], 10 [OK], 11
[OK], 13 [OK], 14 [OK], 15 [OK], 16 [OK], 17 [OK], 18 [OK]
drbd0 ns:59995764 nr:587 dw:59996747 dr:13440670 al:4315 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 oos:0
drbd1 ns:51177927 nr:1147 dw:51179332 dr:50184219 al:10914 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 oos:0
drbd2 ns:10390318 nr:0 dw:10390318 dr:17007325 al:2637 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 oos:0
drbd3 ns:473465956 nr:17032 dw:473475648 dr:1713963052 al:1850433 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 oos:0
drbd4 ns:5958452 nr:124 dw:5958580 dr:5663626 al:212 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 oos:0
drbd5 ns:413987326 nr:1531 dw:413989393 dr:245985408 al:158935 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 oos:0
drbd6 ns:17089136 nr:251116 dw:17342180 dr:116599072 al:1211 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 oos:0
drbd7 ns:0 nr:0 dw:0 dr:5662844 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 oos:0
drbd8 ns:63714682 nr:563 dw:63715479 dr:23877551 al:6582 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 oos:0
drbd9 ns:186293485 nr:362 dw:186293990 dr:480836789 al:59641 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 oos:0
drbd10 ns:59817651 nr:366 dw:59818057 dr:27584750 al:12039 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 oos:0
drbd11 ns:0 nr:0 dw:0 dr:664 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 oos:0
drbd13 ns:726120198 nr:474 dw:726126470 dr:1029163787 al:3647825 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 oos:0
drbd14 ns:783916639 nr:676 dw:783924754 dr:1103495545 al:4256267 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 oos:0
drbd15 ns:166529805 nr:0 dw:166529921 dr:83674337 al:139032 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 oos:0
drbd16 ns:0 nr:0 dw:0 dr:5661172 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 oos:0
drbd17 ns:1625540 nr:0 dw:1625540 dr:6601790 al:82 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 oos:0
drbd18 ns:56091131 nr:705 dw:56091944 dr:37456802 al:77287 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 oos:0
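
If anyone wants to post-process the lines above, each one is just a
device name followed by key:value pairs, so a small parser is enough.
This is only an illustration of the format shown here, not code taken
from the attached scripts; the function name is made up.

```python
def parse_drbd_line(line):
    """Split 'drbdN key:value ...' into (device, {key: int(value)})."""
    device, *fields = line.split()
    counters = {}
    for field in fields:
        key, _, value = field.partition(":")
        counters[key] = int(value)
    return device, counters


if __name__ == "__main__":
    sample = ("drbd6 ns:17089136 nr:251116 dw:17342180 dr:116599072 "
              "al:1211 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 oos:0")
    device, counters = parse_drbd_line(sample)
    print(device, counters["dw"], counters["oos"])
```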
I was going to include the image, but it is 241k... take a look at:
graph image of drbd stats
Note, all attachments are tar + bzip2, reducing the original 44k down to
4k...
Regards,
Adam
--
Adam Goryachev
Website Managers
Ph: +61 2 8304 0000 adam at websitemanagers.com.au
Fax: +61 2 8304 0001 www.websitemanagers.com.au
-------------- next part --------------
A non-text attachment was scrubbed...
Name: xymon_drbd.tar.bz2
Type: application/x-bzip
Size: 4411 bytes
Desc: not available
URL: <http://lists.xymon.com/pipermail/xymon/attachments/20150206/adfdd725/attachment.bin>