[gclist] When to garbage collect

Fridtjof Siebert siebert@gr.opengroup.org
Fri, 14 Nov 1997 15:15:52 +0100


JanFriso.Groote@cwi.nl wrote:
> 
> I would like to know whether there exist references to papers
> dealing with when to really do a Mark-Sweep garbage collection.
> There are many moments when the system can decide to garbage
> collect. However, to reduce overhead, a rule is used to
> determine whether it is worthwhile to go through all the memory.
> A (particularily bad) rule that I encountered in some
> program is that only if less than 5 percent of the declared
> memory is free, (i.e. not used, and no garbage), then
> the process collects the garbage. 

It is important to know what _bad_ means for your. If you want to
minimize the total amount of time spend garbage collecting, it is best
to do as few collections as possible and to to them as late as possible.
So collect only when the available memory is exhausted.

> A far improved
> (and self invented) rule is that the garbage is collected
> if there has been a growth of say 5% of the total
> memory, with respect to the last garbage collection.

This scheme certainly does not improve the amount of time spend garbage
collecting, but it might be a reasonable approach to keep the gc'ed
program from eating up all the system resources.

> As I would expect such rules to be described in the literature,
> I would very appreciate if somebody could provide me with pointers.

For the amount of work that an incremental collector should do per
allocation, I examined a simple solution as part of my final thesis (in
German):
http://www.informatik.uni-stuttgart.de/cgi-bin/ncstrl_rep_view.pl?/inf/ftp/pub/library/medoc.ustuttgart_fi/DIP-1484/DIP-1484.bib. 

Here, the amount of work done by the gc on one allocation is bound by a
small constant if the mutator guarantees never to use more than a
certain percentage of the available memory (while this percentage does
not have to be known in advance).

Cheers, 

Fridtjof.