[gclist] When to garbage collect

Hans Boehm boehm@hoh.mti.sgi.com
Fri, 14 Nov 1997 12:08:56 -0800

On Nov 14,  3:15pm, Fridtjof Siebert wrote:
> Here, the amount of work done by the gc on one allocation is bound by a
> small constant if the mutator guarantees never to use more than a
> certain percentage of the available memory (while this percentage does
> not have to be known in advance).
I think you meant the amount of GC work done per allocated byte, not object.

This is the crucial property that needs to be guaranteed by a collection
heuristic.  It follows roughly that if at all possible, you should avoid
collecting unless the amount of allocation you have done since the last
collection justifies the GC effort.  For a nongenerational collector, the GC
effort is basically proportional to the amount of live data, which is bounded
by the heap size.  Thus the required amount of allocation between collections
must grow in proportion to live data or heap size.  Otherwise there will be
scenarios in which an excessive amount of time is spent in the collector.

I emphasize this because nearly all of the Java collectors I have tried seem to
have gotten it wrong, and collect much too frequently in at least some cases.
 This seems to be the most significant GC performance problem in the Java

As Charles points out, this should probably be subject to the constraint that
you don't cross the paging threshold unless you have to.  But I think that
should be a limiting tweak to the above heuristic, not the other way around.  I
don't want a little garbage collected utility program that displays in the
corner of my screen to use up as much memory as it can without paging.

For generational GC, this gets trickier.  I believe it's still crucial to bound
the amount of GC effort per allocated byte, especially in the case of 100%
survival rates for the youngest generation(s).  That may have been anomalous
for Lisp machines.  It doesn't seem to be that unusual for some phases of
programs that I see.


Hans-Juergen Boehm