
procs monitor size fluctuations



Hi,

I started seeing weird behavior with one of my Hobbit clients 2 days ago and it's got me stumped. My 'procs' monitor is changing between red and green every polling cycle. When it is red, the monitored process list shows about half my monitored processes as down. However, doing a 'ps' on the client itself shows that these processes are actually running. I noticed that during the red cycles, the active process list is consistently about 100 lines shorter than when the procs monitor is green, so the Hobbit server is marking the processes down because the client isn't reporting them. It doesn't look like the client messages are being truncated by the server (there are no messages about truncation in the hobbitd logs, anyway), but I raised MAXMSG_STATUS and MAXMSG_CLIENT on the server anyway, with no obvious effect. I also don't see any network issues between the client and the server, and no extreme load on either. I'm stuck.

BTW - I'm running the 4.2.0 software. The client is Solaris 10, the server is SuSE Enterprise 9.4.


This is a small snippet taken from my Hobbit server. First, from the histlogs directory: you can see the size of the file bouncing back and forth between roughly 27186 and 16714 bytes:

-rw-r--r--   1 hobbit users 27186 Apr  9 16:35 Thu_Apr_9_16:35:13_2009
-rw-r--r--   1 hobbit users 16714 Apr  9 16:39 Thu_Apr_9_16:39:10_2009
-rw-r--r--   1 hobbit users 28108 Apr  9 16:40 Thu_Apr_9_16:40:11_2009
-rw-r--r--   1 hobbit users 16714 Apr  9 16:44 Thu_Apr_9_16:44:14_2009
-rw-r--r--   1 hobbit users 27186 Apr  9 16:45 Thu_Apr_9_16:45:09_2009
-rw-r--r--   1 hobbit users 16714 Apr  9 16:49 Thu_Apr_9_16:49:17_2009
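
To see exactly which lines drop out of the short reports, one option is to compare a large histlogs file against a small one. The following is only a rough sketch in Python: the function name missing_lines and the optional key argument are made up for illustration, and it assumes the two files can be compared line by line. Lines that change between polls (CPU time and so on) would show up as noise unless a key function picks out just the command part.

```python
from typing import Callable, List


def missing_lines(full_report: str, short_report: str,
                  key: Callable[[str], str] = str.strip) -> List[str]:
    """Return lines of full_report whose key does not occur in short_report.

    key is a hypothetical hook: by default whole (stripped) lines are
    compared; pass a function that extracts e.g. the command column if
    other columns change between polls.
    """
    seen = {key(line) for line in short_report.splitlines() if line.strip()}
    return [line for line in full_report.splitlines()
            if line.strip() and key(line) not in seen]


if __name__ == "__main__":
    # Stand-in data; in practice these would be the contents of one
    # "green" and one "red" file from the histlogs directory.
    green = "sshd\ncron\nhttpd\n"
    red = "sshd\n"
    for line in missing_lines(green, red):
        print(line)
```

Run against one green and one red file, the output would be the process lines the client left out of the short report, which might show whether the missing entries have something in common.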

And the corresponding bit from the hist directory shows the flapping between red and green:

Thu Apr  9 16:35:13 2009 green 1239309313 237
Thu Apr  9 16:39:10 2009 red 1239309550 61
Thu Apr  9 16:40:11 2009 green 1239309611 243
Thu Apr  9 16:44:14 2009 red 1239309854 55
Thu Apr  9 16:45:09 2009 green 1239309909 248
Thu Apr  9 16:49:17 2009 red 1239310157 55
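
For anyone who wants to quantify the flapping over a longer stretch, here is a small Python sketch that totals the time spent in each color. It assumes the last two fields on each hist line are the start time as a Unix epoch and the duration in seconds; the sample above is consistent with that (1239309313 + 237 = 1239309550, the start of the next entry), but that reading is my assumption. The function names are made up for illustration.

```python
from typing import Dict, Iterable, Tuple


def parse_hist_line(line: str) -> Tuple[str, int, int]:
    """Return (color, start_epoch, duration_seconds) for one hist line."""
    fields = line.split()
    # Assumed layout: human-readable timestamp, then color, epoch, duration.
    return fields[-3], int(fields[-2]), int(fields[-1])


def seconds_per_color(lines: Iterable[str]) -> Dict[str, int]:
    """Sum the duration field for each color, skipping blank lines."""
    totals: Dict[str, int] = {}
    for line in lines:
        if not line.strip():
            continue
        color, _start, duration = parse_hist_line(line)
        totals[color] = totals.get(color, 0) + duration
    return totals


SAMPLE = """\
Thu Apr  9 16:35:13 2009 green 1239309313 237
Thu Apr  9 16:39:10 2009 red 1239309550 61
Thu Apr  9 16:40:11 2009 green 1239309611 243
Thu Apr  9 16:44:14 2009 red 1239309854 55
Thu Apr  9 16:45:09 2009 green 1239309909 248
Thu Apr  9 16:49:17 2009 red 1239310157 55
"""

if __name__ == "__main__":
    print(seconds_per_color(SAMPLE.splitlines()))
```

On the six lines above this gives 728 seconds green and 171 seconds red, i.e. each cycle is roughly four minutes green followed by about one minute red.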

This is a very small example of what's happening, but it's been happening with the same regularity for 2 days now.
Does anyone have a clue what might be going on?

Thanks,
Gus