[hobbit] phantom processes causing false alerts
shea4th at comcast.net
Mon Apr 12 18:56:08 CEST 2010
DARN HTML Email
----- shea4th at comcast.net wrote:
>
>
> ----- "Odinn" <odinn_asgaard at yahoo.com> wrote:
> > It sounds like your hobbit server matches several different groups in hobbit-clients.cfg.
> > Each group that it matches adds its checks to the host's list of checks. Matching does not stop at the first group that matches; it is additive all the way to the end of hobbit-clients.cfg for every group that matches.
> > Example:
> > PAGE=servers
> > PROC sshd
> > HOST=hobbitsrv
> > PROC sshd
> > PROC hobbitd
> > HOST=%(.*)srv(.*)
> > PROC telnetd
> > If hobbitsrv is on the servers page, then it will check procs for sshd, sshd (again), hobbitd, and telnetd.
> > --
> > Jim Sloan
> > Just remember, today is the day you thought tomorrow was going to be yesterday.
> > ----- Original Message ----
> > From: "shea_greg at emc.com" <shea_greg at emc.com>
> > To: hobbit at hswn.dk
> > Cc: shea_greg at emc.com
> > Sent: Fri, April 9, 2010 5:35:15 PM
> > Subject: [hobbit] phantom processes causing false alerts
> > Hi all,
> > I have a strange problem with procs and am reaching out to the group for
> > help. In the picture below
> > there are numerous 'cron' and 'sshd' processes showing up for this
> > particular host (the hobbitserver).
> > The problem is, all these processes and in particular the ones in error
> > (red) don't belong on this server.
> > I have tried to locate where these phantom processes are coming from,
> > searching hist, histlog, hostdata
> > directories but I'm stuck. The Client Data file does not have any of
> > these processes.
> > Any assistance would be appreciated
> > Thanks
> > Gregory R Shea
> > EMC Corporation
> > =========================================================================
> > Entry from hobbit-alerts.cfg
> > ## HobbitServer
> > GROUP=HHEARTBEAT
> > SCRIPT /apps/hobbit/server/etc/hb-alert.pl heartbeat
> > =========================================================================
> > Entry from hobbit-clients.cfg
> > HOST=hobbitserver
> > LOAD 40.0 45.0
> > MEMPHYS 101 102
> > MEMACT 98 99
> > MEMSWAP 97 98
> > #PROC hobbitd_channel
> > PROC heartbeat 1 1 yellow GROUP=HHEARTBEAT
> > #PROC sendmail TRACK=sendmail
> > FILE /apps/hobbit/server/etc/bb-hosts yellow MTIME>600 TRACK
> > FILE /apps/hobbit/server/etc/hobbit-alerts.cfg yellow MTIME>600 TRACK
> > FILE /apps/hobbit/server/etc/hobbit-clients.cfg yellow MTIME>600 TRACK
> > PORT LOCAL=0.0.0.0:1984 TEXT=HobbitD
> > ==========================================================================
> > http://hobbitserver/hobbit-cgi/bb-hostsvc.sh?HOST=hobbitmon&SERVICE=procs
> > Mon Mar 22 20:55:39 EDT 2010 - Processes NOT ok
> > heartbeat (found 0, req. 1 or more) This should be YELLOW
> > crond (found 1, req. 1 or more)
> > /usr/sbin/sshd (found 1, req. 1 or more)
> > cron (found 1, req. 1 or more)
> > sshd (found 5, req. 1 or more)
> > crond (found 1, req. 1 or more)
> > /usr/sbin/sshd (found 1, req. 1 or more)
> > crond (found 1, req. 1 or more)
> > /usr/sbin/sshd (found 1, req. 1 or more)
> > crond (found 1, req. 1 or more)
> > /usr/sbin/sshd (found 1, req. 1 or more)
> > crond (found 1, req. 1 or more)
> > /usr/sbin/sshd (found 1, req. 1 or more)
> > cron (found 1, req. 1 or more)
> > sshd (found 5, req. 1 or more)
> > crond (found 1, req. 1 or more)
> > sshd (found 5, req. 1 or more)
> > crond (found 1, req. 1 or more)
> > sshd (found 5, req. 1 or more)
> > cron (found 1, req. 1 or more)
> > sshd (found 5, req. 1 or more)
> > crond (found 1, req. 1 or more)
> > /usr/sbin/sshd (found 1, req. 1 or more)
> > cron (found 1, req. 1 or more)
> > sshd (found 5, req. 1 or more)
> > cron (found 1, req. 1 or more)
> > sshd (found 5, req. 1 or more)
> > sshd (found 5, req. 1 or more)
> > ns-slapd (found 0, req. 1 or more) These 2 are RED
> > ns-httpd (found 0, req. 1 or more) These 2 are RED
> > cron (found 1, req. 1 or more)
> > sshd (found 5, req. 1 or more)
> > /usr/sbin/sshd (found 1, req. 1 or more)
> > crond (found 1, req. 1 or more)
> > sshd (found 5, req. 1 or more)
> > crond (found 1, req. 1 or more)
> > crond (found 1, req. 1 or more)
> > /usr/sbin/sshd (found 1, req. 1 or more)
> > crond (found 1, req. 1 or more)
> > /usr/sbin/sshd (found 1, req. 1 or more)
> > crond (found 1, req. 1 or more)
> > sshd (found 5, req. 1 or more)
> > PID PPID USER STARTED S PRI %CPU TIME %MEM RSZ VSZ CMD
> > 1 0 root Mar 21 S 24 0.0 00:00:02 0.0 568 4772 init [3]
> > 2 1 root Mar 21 S 139 0.0 00:00:13 0.0 0 0 [migration/0]
> > 3 1 root Mar 21 S 5 0.0 00:00:02 0.0 0 0 [ksoftirqd/0]
> > 4 1 root Mar 21 S 139 0.0 00:00:19 0.0 0 0 [migration/1]
> > 5 1 root Mar 21 S 5 0.0 00:00:03 0.0 0 0 [ksoftirqd/1]
> > 6 1 root Mar 21 S 139 0.0 00:00:27 0.0 0 0 [migration/2]
> > 7 1 root Mar 21 S 5 0.0 00:00:03 0.0 0 0 [ksoftirqd/2]
> > 8 1 root Mar 21 S 139 0.0 00:00:23 0.0 0 0 [migration/3]
> > 9 1 root Mar 21 S 5 0.0 00:00:04 0.0 0 0 [ksoftirqd/3]
> > 10 1 root Mar 21 S 34 0.0 00:00:00 0.0 0 0 [events/0]
> > 11 1 root Mar 21 S 34 0.0 00:00:00 0.0 0 0 [events/1]
> > 12 1 root Mar 21 S 34 0.0 00:00:00 0.0 0 0 [events/2]
> > 13 1 root Mar 21 S 34 0.0 00:00:00 0.0 0 0 [events/3]
> > 14 10 root Mar 21 S 34 0.0 00:00:00 0.0 0 0 [khelper]
> > 15 10 root Mar 21 S 24 0.0 00:00:00 0.0 0 0 [kacpid]
> > 73 10 root Mar 21 S 34 0.0 00:00:00 0.0 0 0 [kblockd/0]
> > 74 10 root Mar 21 S 34 0.0 00:00:00 0.0 0 0 [kblockd/1]
> > 75 10 root Mar 21 S 34 0.0 00:00:00 0.0 0 0 [kblockd/2]
> > 76 10 root Mar 21 S 34 0.0 00:00:00 0.0 0 0 [kblockd/3]
> > 77 1 root Mar 21 S 24 0.0 00:00:00 0.0 0 0 [khubd]
> > 112 10 root Mar 21 S 19 0.0 00:00:00 0.0 0 0 [pdflush]
> > 113 10 root Mar 21 S 24 0.0 00:01:46 0.0 0 0 [pdflush]
> > 114 1 root Mar 21 S 24 0.0 00:00:35 0.0 0 0 [kswapd0]
> > 115 10 root Mar 21 S 30 0.0 00:00:00 0.0 0 0 [aio/0]
> > 116 10 root Mar 21 S 34 0.0 00:00:00 0.0 0 0 [aio/1]
> > 117 10 root Mar 21 S 34 0.0 00:00:00 0.0 0 0 [aio/2]
> > 118 10 root Mar 21 S 34 0.0 00:00:00 0.0 0 0 [aio/3]
> > 262 1 root Mar 21 S 18 0.0 00:00:00 0.0 0 0 [kseriod]
> > 310 32684 hobbit 20:56:55 S 23 0.0 00:00:00 0.0 944 7424 /usr/sbin/ntpq -n -c rv xxx.xxx.xxx.xxx
> > 316 32667 hobbit 20:56:55 S 18 0.0 00:00:00 0.0 364 2608 /apps/hobbit/client/bin/bb xxx.xxx.xxx.xxx status hobbitserver.mailq green Mon Mar 22 20:56:55 EDT 2010?Mail queue contains 0 requests?
> > 337 32675 hobbit 20:56:55 S 21 0.0 00:00:00 0.0 364 2608 /apps/hobbit/client/bin/bb xxx.xxx.xxx.xxx status hobbitserver.sendmail green Mon Mar 22 20:56:55 EDT 2010 <sendmail>?Statistics from Tue Dec 12 04:02:02 2006? M msgsfr bytes_from msgsto bytes_to msgsrej msgsdis msgsqur Mailer? 3 2 2K 0 0K 0 0 0 smtp? 4 16709 82093K 16515 80675K 0 0 0 esmtp? 7 0 0K 4686309 9618429K 0 0 0 relay? 8 0 0K 2 2K 0 0 0 procmail? 9 4695543 10022009K 8595 1678935K 0 0 0 local?=====================================================================? T 4712254 10104104K 4711421 11378041K 0 0 0? C 4640434 4695450 0
> > 356 32661 hobbit 20:56:55 S 21 0.0 00:00:00 0.0 364 2608 /apps/hobbit/client/bin/bb xxx.xxx.xxx.xxx status hobbitserver.temp green Mon Mar 22 20:56:54 EDT 2010 Temperature status: ?Device Temp(C) Temp(F) Threshold(C)?------------------------------------------------------?&green System Board Ambient Temp 22 71 42?------------------------------------------------------?Status green: All devices look okay
> > 385 4788 apache 18:41:00 S 23 0.0 00:00:00 0.0 7836 119508 /usr/sbin/httpd
> > 386 4788 apache 18:41:00 S 23 0.0 00:00:00 0.0 7840 119508 /usr/sbin/httpd
> > 448 32663 hobbit 20:56:55 Z 23 0.0 00:00:00 0.0 0 0 [letstat.pl] <defunct>
> > 505 1 root Mar 21 S 20 0.0 00:00:00 0.0 0 0 [scsi_eh_0]
> > 522 32663 hobbit 20:56:55 S 22 0.0 00:00:00 0.0 364 2608 /apps/hobbit/client/bin/bb xxx.xxx.xxx.xxx status hobbitserver.network green Mon Mar 22 20:56:54 EDT 2010 - EMS - ?No Recent Network Input/Output Errors detected. ??eth0: autoneg on, 1GB/s-FDX, link ok.?eth1: autoneg on, 1GB/s-FDX, link ok.? ? ?Name Mtu Net/Dest Address Ipkts Ierrs Opkts Oerrs Collis Queue ?lo 16436 loopback loopback 381956450 38195645 0 0 0 ?eth1 16896 hobbitserver hobbitserver 5078278 0 2000647 0 0 0 ?eth0 16896 hobbitserver hobbitserver 203321736 0 135196037 0 0 0 ?
> > 556 1 root Mar 21 S 24 0.0 00:01:29 0.0 0 0
>
xx SNIP xx
> To unsubscribe from the hobbit list, send an e-mail to
> > hobbit-unsubscribe at hswn.dk
Hi Jim,
Thanks for the response. That was the first place I checked, but no luck. The Hobbit server is defined all by itself: no includes, no pages, no wildcards.
The URL http://hobbitserver/hobbit-cgi/bb-hostsvc.sh?HOST=hobbitserver&SERVICE=procs uses
hobbitsvc.cgi, and the man page states: "hobbitsvc.cgi is a CGI program to present a Hobbit status log in HTML
form". I went through all the logs and deleted "procs" for hobbitserver, but no luck. I also checked whether
the logs somehow shared the same inode; again, no luck.
This is driving me crazy.
Thanks again
Gregory R Shea
EMC Corporation
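
For anyone following the thread, here is a minimal sketch of the additive matching Jim describes in the quoted message, using his example groups. It is only a simplified model, not the real hobbit-clients.cfg parser: the names GROUPS, selector_matches and collect_procs are made up for illustration, a leading "%" is treated as a regular expression as in his example, and the unanchored regex search is an assumption.

```python
import re

# Jim's example, modeled as (selector kind, pattern, PROC checks) tuples.
# Hypothetical data structure; the real config file is plain text.
GROUPS = [
    ("PAGE", "servers", ["sshd"]),
    ("HOST", "hobbitsrv", ["sshd", "hobbitd"]),
    ("HOST", "%(.*)srv(.*)", ["telnetd"]),
]


def selector_matches(kind, pattern, host, page):
    """Return True if a HOST= or PAGE= selector applies to this host."""
    value = host if kind == "HOST" else page
    if pattern.startswith("%"):
        # Assumption: "%" marks a regex, matched anywhere in the value.
        return re.search(pattern[1:], value) is not None
    return pattern == value


def collect_procs(host, page, groups=GROUPS):
    """Accumulate PROC checks from every matching group (no early exit)."""
    checks = []
    for kind, pattern, procs in groups:
        if selector_matches(kind, pattern, host, page):
            checks.extend(procs)
    return checks


print(collect_procs("hobbitsrv", "servers"))
# prints ['sshd', 'sshd', 'hobbitd', 'telnetd']
```

In this model a host named hobbitserver that is on no listed page picks up nothing, since "hobbitserver" does not contain "srv"; duplicates only appear when more than one group actually matches.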