procs monitor size fluctuations

Gus Ferrer gus.ferrer at mail.cuny.edu
Thu Apr 9 23:24:55 CEST 2009


Hi,

I started seeing weird behavior with one of my Hobbit clients 2 days  
ago and it's got me stumped.  What's happening is that my 'procs'  
monitor is changing from red to green every polling cycle.  When it is  
red, the monitored process list shows about half my monitored  
processes are down.  However, doing a 'ps' on the client itself shows  
that these processes are actually running.  I noticed that during the  
red cycles, the active process list is consistently about 100 lines  
shorter than when the procs monitor is green, so the Hobbit server is  
marking the processes down since they aren't being reported by the  
client.
It doesn't look like the client messages are being truncated by the
server (no messages regarding truncation in the hobbitd logs anyway),
but I raised MAXMSG_STATUS and MAXMSG_CLIENT on the server, with no
obvious effect.  I also don't see any network issues between the
client and the server, and no extreme loads on either.
I'm stuck....

BTW - I'm running the 4.2.0 software. The client is Solaris 10, the  
server is SuSE Enterprise 9.4.


This is a small snippet, taken from my hobbit server.  First, from the
histlogs directory.  You can see the size of the file is bouncing back
and forth between roughly 27186 and 16714 bytes:

-rw-r--r--   1 hobbit users 27186 Apr  9 16:35 Thu_Apr_9_16:35:13_2009
-rw-r--r--   1 hobbit users 16714 Apr  9 16:39 Thu_Apr_9_16:39:10_2009
-rw-r--r--   1 hobbit users 28108 Apr  9 16:40 Thu_Apr_9_16:40:11_2009
-rw-r--r--   1 hobbit users 16714 Apr  9 16:44 Thu_Apr_9_16:44:14_2009
-rw-r--r--   1 hobbit users 27186 Apr  9 16:45 Thu_Apr_9_16:45:09_2009
-rw-r--r--   1 hobbit users 16714 Apr  9 16:49 Thu_Apr_9_16:49:17_2009
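
One way to see exactly which lines drop out is to compare a long
report with a short one.  The following is only a minimal Python
sketch, not part of Hobbit; the function name missing_lines and the
sample process names are made up for illustration.  It assumes exact
line matching is good enough, which may over-report on real ps output
because fields such as CPU time change between polls.

```python
def missing_lines(full_report, short_report):
    """Return lines of full_report that have no exact match in short_report."""
    seen = {line.strip() for line in short_report.splitlines()}
    return [
        line
        for line in full_report.splitlines()
        if line.strip() and line.strip() not in seen
    ]


# Made-up sample data standing in for a "green" and a "red" report.
green = "init\nsshd\nhypothetical_daemon_a\nhypothetical_daemon_b\n"
red = "init\nsshd\n"

for line in missing_lines(green, red):
    print("missing:", line)
```

In practice the two arguments would be the contents of one of the
larger and one of the smaller files in the histlogs listing above.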

And the corresponding bit from the hist directory shows the flapping  
between red and green:

Thu Apr  9 16:35:13 2009 green 1239309313 237
Thu Apr  9 16:39:10 2009 red 1239309550 61
Thu Apr  9 16:40:11 2009 green 1239309611 243
Thu Apr  9 16:44:14 2009 red 1239309854 55
Thu Apr  9 16:45:09 2009 green 1239309909 248
Thu Apr  9 16:49:17 2009 red 1239310157 55
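
To put numbers on the flapping, the history lines above can be
summarized with a short script.  This is a rough Python sketch, not a
Hobbit tool; parse_hist and summarize are names invented here.  It
assumes the last three fields are the color, the start time as a Unix
epoch, and the duration in seconds, which fits the data since each
start plus its duration equals the next start.

```python
from collections import defaultdict

HIST = """\
Thu Apr  9 16:35:13 2009 green 1239309313 237
Thu Apr  9 16:39:10 2009 red 1239309550 61
Thu Apr  9 16:40:11 2009 green 1239309611 243
Thu Apr  9 16:44:14 2009 red 1239309854 55
Thu Apr  9 16:45:09 2009 green 1239309909 248
Thu Apr  9 16:49:17 2009 red 1239310157 55
"""


def parse_hist(text):
    """Return a list of (color, start_epoch, duration_seconds) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        # Assumed layout: ... color start_epoch duration_seconds
        entries.append((fields[-3], int(fields[-2]), int(fields[-1])))
    return entries


def summarize(entries):
    """Return (seconds spent per color, number of color changes)."""
    totals = defaultdict(int)
    for color, _start, duration in entries:
        totals[color] += duration
    flips = sum(1 for a, b in zip(entries, entries[1:]) if a[0] != b[0])
    return dict(totals), flips


print(summarize(parse_hist(HIST)))
# prints ({'green': 728, 'red': 171}, 5)
```

For this snippet that works out to about 4 minutes green followed by
about 1 minute red in each 5-minute cycle.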

This is a very small example of what's happening, but it's been
happening with the same regularity for 2 days now.
Does anyone have a clue what might be going on?

Thanks,
Gus


