Hobbit client to monitor Solaris RAID devices

Ward, Martin Martin.Ward at colt.net
Fri Jun 20 13:15:36 CEST 2008


All,

I have been using a script called bb-raid.sh for a few years to monitor
Solaris metadevices. I then came across a Solaris server using raidctl
for hardware RAID, so I modified bb-raid.sh to monitor that as well.

Here is the code. Feel free to use it as you wish:

|\/|
====
#!/bin/sh
#
# meta: Sun Enterprise Server check - BB external script test
#
#####  Purpose is to report back to a central server, all Solaris
#####     DiskSuite RAID software faults.
#####
#
# version 1.0
# version 2.0 - properly uses $THIS_HOST instead of $MACHINE due to fqdn
#                 using comma in name
#               removed all direct program calls in favor of env vars in
#                 bbsys.*
#               changed /tmp to $BBTMP
#                       meta to $TEST
#                       $THIS_HOST to $MACHINE
#               removed code that checked for the existence of "meta"
#                 after the server name in bb-hosts
#               moved comment explaining purpose of script to start of
#                 code
#               caused output of metadb, metastat, and metahs to be
#                 displayed on the web page
# version 2.1 - Warns if default values are not used
#               Applied changes by Todd Jimenez
#               - created get_header and get_footer functions
#               - set summary value
#               - added &red to red alerts
#               - added &yellow to yellow alerts
#               restored functionality to check for the existence of
#                 "meta" after the server name in bb-hosts, but now
#                 optional
# version 2.2 - Change status to &yellow when disk is resyncing.
# version 2.3 - Added code to check for raidctl command and perform raid
#               checks using that if it's there.
#               Using this version REQUIRES sudo !
#               raidctl(1m) on Solaris can only be run as root. Sudo is
#               a nice workaround.
#
# BIG BROTHER / XXXXXXXXXXXXXXXX status
#
# Written by Galen Johnson
#  on October 25, 2000
#
# Inspired by a module of the perl clone of BB written
# by Charles Hall in August 1998
#
# Based on code found in the DiskSuite manual
#
# 2.0 Updates by Mike Arnold <mike at razorsedge dot org>
#  on September 27, 2001
#
# 2.1 Updates by Mike Arnold <mike at razorsedge dot org>
#  on November 23, 2001
#
# 2.2 Updates by Martin Ward <martin dot ward at colt dot net>
#  on Monday 19th May 2008
#
# 2.3 Updates by Martin Ward <martin dot ward at colt dot net>
#  on Monday 19th June 2008

########################################
# NOTE
# This has been tested with BB 1.8c and Hobbit 4.2.0
#
# Tested on :
#   Sun 220R, 420R, E250/450, E4500
########################################

########################################
# INSTALLATION
#  step 1  - update bb-bbexttab to include this meta
#            (older BB versions update EXT section of the bbdef.sh
#            script)
#
#  step 2 - copy lines mentioned to bbsys.local (without the #'s)
#
#  step 3 - if you are using an older version of BB without bb-bbexttab
#             and you don't want this run on every client uncomment
#             CHECK_BB_HOSTS="Y" and add the name of this $TEST to
#             bb-hosts for this client. eg.
#             myserver1.domain.com   # meta
#
#  step 4 - if you have sudo installed and need to use raidctl,
#           add the following line to your /etc/sudoers file:
# hobbit          ALL=(root) NOPASSWD:/usr/sbin/raidctl -l
#
#  step 5 - restart Big Brother
#
# NOTE - the TEST variable in the configuration section is the name used
#        as the column header.
########################################

##################################
# CONFIGURE IT HERE
##################################
TEST="raid"
SCRIPT_VER="bb-raid.sh v2.3"

BBPROG="$0"; export BBPROG
DEBUG=0
#
# Start of lines to put in bbsys.local
# NOTE: MDBIN can be either /usr/sbin or /usr/opt/SUNWmd/sbin
#

MDBIN=/usr/sbin
METADB=${MDBIN}/metadb
METAHS=${MDBIN}/metahs
METASTAT=${MDBIN}/metastat
SUDO=/opt/sfw/bin/sudo
RAIDCTL=/usr/sbin/raidctl
export MDBIN METADB METAHS METASTAT SUDO RAIDCTL

#
# End of lines to put in bbsys.local
#

# define colours for graphics
# Comment these out if using older BB versions
RED_PIC="&red"
YELLOW_PIC="&yellow"
GREEN_PIC="&green"

# don't scan through bb-hosts every time
# this is here for older BB versions without bb-bbexttab
# uncomment to activate
#CHECK_BB_HOSTS=Y

##################################
# Start of script
##################################
#BBHOME="/home/bb/bb"; export BBHOME

if test ! "$BBHOME"
then
        echo "bb-raid.sh: BBHOME is not set"
        exit 1
fi

if test ! -d "$BBHOME"
then
        echo "bb-raid.sh: BBHOME is invalid"
        exit 1
fi

if test ! "$BBTMP"                      # GET DEFINITIONS IF NEEDED
then
         # echo "*** LOADING BBDEF ***"
        . $BBHOME/etc/bbdef.sh          # INCLUDE STANDARD DEFINITIONS
fi

get_header()
{
  echo ""
  echo "<FONT SIZE=+2><b>$1</b></FONT> ($2)<BR>"
  # If you do not want the header in a bigger font use line below instead
  #echo "<b>$1</b> ($2)"
  # If you want the "Paul Luzzi" look uncomment this section and comment
  # out the above sections:
  #echo "<P><DIV ALIGN=\"CENTER\"><HR>"
  #echo "<B>============== $1 ==============</B>"
  #echo "<B>--- ($2) ---</B>"
  #echo "<HR></DIV>"
  #echo "<BLOCKQUOTE>"
}

get_footer()
{
  echo ""
  # If you want the "Paul Luzzi" look uncomment this section and comment
  # out the above sections:
  #echo "</BLOCKQUOTE>"
}

#####
#####  Get MD Status proc - used if the server uses a metadb
#####
get_md_status()
{
  #####
  #####  Setup some variables for use later
  #####
  COLOR="green"

  # Check defaults have been set
  if [ "$MDBIN" = "" ]; then
    MDBIN=/usr/sbin
    echo ""
    echo "$YELLOW_PIC MDBIN command is not defined in etc/bbsys.local - using default: $MDBIN"
  fi

  if [ "$METADB" = "" ]; then
    METADB=${MDBIN}/metadb
    echo ""
    echo "$YELLOW_PIC METADB command is not defined in etc/bbsys.local - using default: $METADB"
  fi

  if [ "$METAHS" = "" ]; then
    METAHS=${MDBIN}/metahs
    echo ""
    echo "$YELLOW_PIC METAHS command is not defined in etc/bbsys.local - using default: $METAHS"
  fi

  if [ "$METASTAT" = "" ]; then
    METASTAT=${MDBIN}/metastat
    echo ""
    echo "$YELLOW_PIC METASTAT command is not defined in etc/bbsys.local - using default: $METASTAT"
  fi

  ###
  ### Check replicas for problems, capital letters in the flags indicate
  ### an error.
  ###
  get_header "MetaDatabases" "$METADB -i"
  dbtrouble=`${METADB} 2>/dev/null | ${TAIL} +2 | ${AWK} '{ fl = substr($0,1,20); if (fl ~ /[A-Z]/) print $0 }'`
  if [ "${dbtrouble}" ]; then
    COLOR="red"
    echo ""
    echo "$RED_PIC <B><I>Database replicas are not active: </I></B>"
    echo ""
    ${METADB} -i 2>/dev/null
  else
    ${METADB} -i 2>/dev/null
  fi
  get_footer

  ###
  ### Check the metadevice state, if the state is not Okay, something is
  ### up.
  ###
  get_header "Metadevices" "$METASTAT"
  mdtrouble=`${METASTAT} | ${GREP} "Resyncing"`
  if [ "${mdtrouble}" ]; then
    COLOR="yellow"
    CLR_PIC="$YELLOW_PIC"
  fi
  mdtrouble=`${METASTAT} | ${GREP} -v "Resyncing" | ${AWK} '/State:/ { if ( $2 != "Okay" ) print $0 }'`
  if [ "${mdtrouble}" ]; then
    COLOR="red"
    CLR_PIC="$RED_PIC"
  fi

  if [ "$COLOR" != "green" ]; then
    echo ""
    echo "$CLR_PIC <B><I>Metadevices are not Okay: </I></B>"
    echo ""
    ${METASTAT}
  else
    ${METASTAT}
  fi
  get_footer

  ###
  ### Check the hotspares to see if any have been used.
  ###
### Hotspare test disabled because writing to BBOUT when no hotspare
### pools

# get_header "Hot Spares" "$METAHS -i"
# hstrouble=`${METAHS} -i |  ${AWK} ' /blocks/ { if ( $2 != "Available" ) print $0 }'`
# if [ "${hstrouble}" ]; then
#   if [ "$COLOR" != "red" ]; then
#     COLOR="yellow"
#   fi
#   echo ""
#   echo "$YELLOW_PIC <B><I>Hot spares in use: </I></B>"
#   echo ""
#   ${METAHS} -i 2>&1
# else
#   ${METAHS} -i 2>&1
# fi
# get_footer

  #####
  #####  Make sure to export COLOR so that it gets back to "central"
  #####
  export COLOR

#####
#####  End of get_md_status proc
#####
}

#####
#####  Get LSI Status proc - used if the server has LSI1030 or
#####  LSI1064 RAID-enabled controllers.
#####
get_lsi_status()
{
  #####
  #####  Setup some variables for use later
  #####
  COLOR="green"

  # Check defaults have been set
  if [ "$RAIDCTL" = "" ]; then
    RAIDCTL=/usr/sbin/raidctl
    echo ""
    echo "$YELLOW_PIC RAIDCTL command is not defined in etc/bbsys.local - using default: $RAIDCTL"
  fi

  ###
  ### Check RAID volumes for problems. Strip off the header then print any
  ### entries that do not have the word OK in them, or do have the words
  ### "DEGRADED" or "FAILED".
  ###
  dberrors=`${SUDO} ${RAIDCTL} -l | ${AWK} 'BEGIN { D=0; }
/------/ { D=1; next; }
/DEGRADED|FAILED/ {     if (D==1) {print $0; next; } }
/       OK/ { if (D==1) { next; } }
{ if (D==1) {print "1 ",$0; } }' `

  if [ "${dberrors}" ]; then
    COLOR="red"
    echo ""
    echo "$RED_PIC <B><I>RAID errors exist: </I></B>"
    echo "<pre>"
    ${SUDO} ${RAIDCTL} -l
    echo "</pre>"
  else
    echo "<pre>"
    ${SUDO} ${RAIDCTL} -l
    echo "</pre>"
  fi
  get_footer

  #####
  #####  Make sure to export COLOR so that it gets back to "central"
  #####
  export COLOR

#####
#####  End of get_lsi_status proc
#####
}

#####
#####  Get Status proc - used to get all responses
#####
get_status()
{
  #####
  #####  Decide which RAID we have, then call the right subroutine.
  #####

  COLOR=WHITE
  export COLOR

  if [ "${RAIDCTL}" != "" ]; then
    TST=`${SUDO} ${RAIDCTL} -l | grep "No RAID volumes found" | wc -l`
    if [ $TST -lt 1 ]
    then
        get_lsi_status
    fi
  fi

  if [ "${METADB}" != "" ]; then
    TST=`${METADB} 2>&1 | grep "there are no existing databases" | wc -l`
    if [ $TST -lt 1 ]
    then
        get_md_status
    fi
  fi

#####
#####  End of get_status proc
#####
}

#####
#####  Main body
#####
if [ "$CHECK_BB_HOSTS" = "Y" ]; then
  # convert "," to "." in the hostname
  MACHINE_WITH_DOTS=`echo $MACHINE | $SED 's/,/\./g'`

  $GREP $MACHINE_WITH_DOTS $BBHOSTS | $GREP "$TEST" |
  while read line
  do
    if [ ! -z "$line" ]; then
      get_status > $BBTMP/$MACHINE.$TEST

      ### Tack the script version on to the end
      echo "<table><tr><td align=right><font size=-1>${SCRIPT_VER}</font></td></tr></table>" >> $BBTMP/$MACHINE.$TEST

      if [ ${DEBUG} = "1" ]; then
        echo "$BB $BBDISP \"status $MACHINE.$TEST $COLOR" `$DATE` `$CAT $BBTMP/$MACHINE.$TEST` "\"" >> $BBTMP/raid.output
      else
        # NOW USE THE BB COMMAND TO SEND THE DATA ACROSS
        $BB $BBDISP "status $MACHINE.$TEST $COLOR `$DATE` `$CAT $BBTMP/$MACHINE.$TEST` "
      fi
    fi
  done
else
  get_status > $BBTMP/$MACHINE.$TEST

  ### Tack the script version on to the end
  echo "<table><tr><td align=right><font size=-1>${SCRIPT_VER}</font></td></tr></table>" >> $BBTMP/$MACHINE.$TEST

  if [ ${DEBUG} = "1" ]; then
    echo "$BB $BBDISP \"status $MACHINE.$TEST $COLOR" `$DATE` `$CAT $BBTMP/$MACHINE.$TEST` "\"" >> $BBTMP/raid.output
  else
    # NOW USE THE BB COMMAND TO SEND THE DATA ACROSS
    $BB $BBDISP "status $MACHINE.$TEST $COLOR `$DATE` `$CAT $BBTMP/$MACHINE.$TEST` "
  fi
fi

# Clean up our mess
# Checking for existence of each file since the whole test may be
#   optional and may not actually run on every client
#
if [ -f $BBTMP/$MACHINE.$TEST ]; then
  $RM $BBTMP/$MACHINE.$TEST
fi
##############################################
# end of script
##############################################
====
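
If you want to see what the raidctl filter in get_lsi_status flags
without having a RAID controller or sudo to hand, the awk program can be
run against canned text. This is only a minimal sketch: sample_output
and raid_filter are hypothetical helper names, the listing is an
illustrative stand-in rather than real "raidctl -l" output, and the
"OK" pattern assumes columns separated by spaces.

```shell
#!/bin/sh
# Sketch only. sample_output is a hypothetical stand-in for
# "$SUDO $RAIDCTL -l"; its columns are illustrative, not real output.
sample_output()
{
  cat <<'EOF'
RAID    Volume  RAID    RAID    Disk
Volume  Type    Status  Disk    Status
------------------------------------------
c1t0d0  IM      OK      c1t0d0  OK
c1t2d0  IM      DEGRADED c1t2d0 OK
EOF
}

# Same logic as the filter in get_lsi_status:
#  - ignore everything up to and including the dashed rule
#  - print lines containing DEGRADED or FAILED
#  - skip lines containing " OK" (assumes space-separated columns)
#  - flag anything else with a leading "1 "
raid_filter()
{
  awk 'BEGIN { D=0 }
  /------/ { D=1; next }
  /DEGRADED|FAILED/ { if (D==1) { print $0; next } }
  / OK/ { if (D==1) { next } }
  { if (D==1) { print "1 ", $0 } }'
}

COLOR="green"
dberrors=`sample_output | raid_filter`
if [ "${dberrors}" ]; then
  COLOR="red"
fi

echo "COLOR=$COLOR"
echo "$dberrors"
```

With the sample above only the c1t2d0 line should be reported, which
turns the status red. Swapping in a listing where every volume is OK
should leave dberrors empty and the status green.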
-- 
Martin Ward
Network Systems Operations Specialist
DDI:	+44 (0) 20 7863 5218
Fax: 	+44 (0) 20 7863 5610
Mob: 	+44 (0) 7971 97 77 21
www.colt.net
