[hobbit] Loss of Apache graphs

Thomas tlp-hobbit at holme-pedersen.dk
Thu Feb 23 16:45:37 CET 2006


Yes, that's right. Mine are usually around 2 GB when I get into problems, 
so this is not it. Also, I don't have this in my logs:

2006-02-09 15:24:36 Tried to down BOARDBUSY: Invalid argument
2006-02-09 15:31:54 Could not get shm of size 262144: No such file or 
directory
2006-02-09 15:31:54 Channel not available

so something else is wrong.
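
To see which RRD files are actually hit by the "illegal attempt to update" errors in the logs quoted below, the entries can be tallied per file. This is only a rough sketch: the regular expression is written against the log lines as they appear in this thread, and the function name `summarize_rrd_errors` and the "duplicate"/"older" labels are made up for illustration, not part of hobbit.

```python
import re
from collections import defaultdict

# Pattern for the RRD update errors as quoted in this thread. \s+ is used
# throughout so it also matches lines that a mail client has re-wrapped.
RRD_ERROR = re.compile(
    r"RRD\s+error\s+updating\s+(?P<rrd>\S+)\s+from\s+(?P<sender>\S+):\s+"
    r"illegal\s+attempt\s+to\s+update\s+using\s+time\s+(?P<new>\d+)\s+"
    r"when\s+last\s+update\s+time\s+is\s+(?P<last>\d+)"
)


def summarize_rrd_errors(log_text):
    """Count rejected updates per RRD file.

    "duplicate": the update used the same timestamp as the last update.
    "older":     the update used a timestamp before the last update.
    """
    summary = defaultdict(lambda: {"duplicate": 0, "older": 0})
    for match in RRD_ERROR.finditer(log_text):
        new = int(match.group("new"))
        last = int(match.group("last"))
        # This error is only logged when new <= last, so anything that is
        # not an exact duplicate is counted as an older timestamp.
        kind = "duplicate" if new == last else "older"
        summary[match.group("rrd")][kind] += 1
    return dict(summary)


# Two entries copied from the logs in this thread.
SAMPLE = (
    "2006-02-03 02:55:17 RRD error updating "
    "/home/hobbit/data/rrd/ws-1/apache.rrd from 10.10.10.47: illegal "
    "attempt to update using time 1138953317 when last update time is "
    "1138953317 (minimum one second step)\n"
    "2006-02-09 22:25:38 RRD error updating "
    "/home/hobbit/data/rrd/randomaccess/bbgen.rrd from 10.10.10.47: "
    "illegal attempt to update using time 1139541938 when last update time "
    "is 1139545383 (minimum one second step)\n"
)

for rrd_file, counts in sorted(summarize_rrd_errors(SAMPLE).items()):
    print(rrd_file, counts)
```

Feeding it the whole of rrd-data.log instead of `SAMPLE` would show whether ws-1 gets more rejected updates than mo; in the excerpt below both hosts show the same duplicate-timestamp errors, so these errors alone do not explain why only one host lost its graphs.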

Rob Munsch wrote:
> ... are these not the rrd logs you meant..?  Can't really find any 
> others... also, if it's that, why is only one host affected?
>
> Rob Munsch wrote:
>
>> Hmm.
>>
>> Well, there are two webservers; one is showing apache graphs, the 
>> other isn't.
>> Went ahead and added 'em to a (pretty aggressive) rotation schedule; 
>> rrd-data.log is 600k, rrd-status was about 3.5M.  Just in case, 
>> rrd-status is now limited to 1M.
>>
>> Stopped and restarted the server, but no apparent effect.  The server 
>> that had its Apache graphs still does, and the one that doesn't, 
>> doesn't.
>>
>> Here are some recent rrd log entries, if that sheds any light.  "Mo" 
>> is the server with the graphs, "ws-1" is the one without:
>>
>> rrd-data.log
>>
>> 2006-02-03 02:55:17 RRD error updating 
>> /home/hobbit/data/rrd/ws-1/apache.rrd from 10.10.10.47: illegal 
>> attempt to update using time 1138953317 when last update time is 
>> 1138953317 (minimum one second step)
>> 2006-02-03 02:55:17 RRD error updating 
>> /home/hobbit/data/rrd/mo/apache.rrd from 10.10.10.47: illegal attempt 
>> to update using time 1138953317 when last update time is 1138953317 
>> (minimum one second step)
>> 2006-02-03 02:58:25 RRD error updating 
>> /home/hobbit/data/rrd/ws-1/apache.rrd from 10.10.10.47: illegal 
>> attempt to update using time 1138953505 when last update time is 
>> 1138953505 (minimum one second step)
>> 2006-02-03 02:58:25 RRD error updating 
>> /home/hobbit/data/rrd/mo/apache.rrd from 10.10.10.47: illegal attempt 
>> to update using time 1138953505 when last update time is 1138953505 
>> (minimum one second step)
>> 2006-02-09 04:04:12 Could not get shm of size 262144: No such file or 
>> directory
>> 2006-02-09 04:04:12 Channel not available
>> 2006-02-09 15:24:36 Tried to down BOARDBUSY: Invalid argument
>> 2006-02-09 15:31:54 Could not get shm of size 262144: No such file or 
>> directory
>> 2006-02-09 15:31:54 Channel not available
>> 2006-02-16 11:18:57 Tried to down BOARDBUSY: Invalid argument
>> 2006-02-16 11:18:57 Worker process died with exit code 0, terminating
>> 2006-02-22 11:41:59 Tried to down BOARDBUSY: Invalid argument
>> root at randomaccess /var/log/hobbit #          
>>
>> rrd-status.log (the former 3.5M log - current is empty file with no 
>> entries post-rotate)
>>
>> 2006-02-09 15:24:36 Tried to down BOARDBUSY: Invalid argument
>> 2006-02-09 15:31:54 Could not get shm of size 262144: No such file or 
>> directory
>> 2006-02-09 15:31:54 Channel not available
>> 2006-02-09 22:25:38 RRD error updating 
>> /home/hobbit/data/rrd/randomaccess/bbgen.rrd from 10.10.10.47: 
>> illegal attempt to update using time 1139541938 when last update time 
>> is 1139545383 (minimum one second step)
>> 2006-02-16 11:18:57 Tried to down BOARDBUSY: Invalid argument
>> root at randomaccess /var/log/hobbit #
>>
>> Not sure what's going on here.
>>
>> Thomas wrote:
>>
>>> Hi Rob,
>>>
>>> I don't know if this can help you, but every time I have had problems 
>>> with missing graphs it's been because the rrd logfiles were too big.
>>>
>>> Just for info..
>>>
>>> /Thomas
>>>
>>> Rob Munsch wrote:
>>>
>>>> Hello,
>>>>
>>>> There are two webservers being monitored by hobbit (among many 
>>>> other different servers).
>>>> Both have bb-hosts entries that are nearly identical.  Both have 
>>>> the same version of the client on them (4.1.2p1).  Both seem to be 
>>>> working perfectly well in all other respects - both internal (CPU, 
>>>> disk etc) and external (conn, http) tests seem to be working, and 
>>>> have graphs.
>>>>
>>>> However on one, the apache trends show up as expected, and on the 
>>>> other, they have stopped graphing.  Current values for the 
>>>> graphless one are good ol' "nan," but *just* for the 4 apache trend 
>>>> graphs - Utilization, Workers, CPU Ut and RPS.
>>>>
>>>> All other trend graphs are there.
>>>>
>>>> Historical data for before the sudden loss of graphing is there 
>>>> (i.e., about a week ago the graphing stopped - 12 day graph shows 
>>>> data before this cutoff).
>>>>
>>>> Nothing has changed, been added, or modified as far as i can tell.
>>>>
>>>> What am i missing..?
>>>>
>>>> Thanks!
>>>>
>>>
>>
>>
>
>


