[gclist] Buffy finalizer slayer.

Nick Barnes Nick.Barnes@pobox.com
Fri, 11 Jun 1999 22:50:00 +0100


At 1999-06-11 16:55:08+0000, David Chase writes:
> At 04:35 PM 6/11/99 +0100, Nick Barnes wrote:
> 
> >Finalization is much better if you have exact collection and a
> >"collect the whole world now" function.  Then the mutator stands a
> >better chance of telling that an object has a unique pointer according
> >to language semantics.  But exact collection is a somewhat tricky
> >notion with optimizing compilers.  With enough information from the
> >compiler, you may be able to avoid looking at any registers, stack
> >slots, or object slots that contain non-pointers, but it's much harder
> >to avoid looking at values (registers and stack slots in particular)
> >that contain values which might be no longer in use according to the
> >language semantics.
> 
> It depends upon how you define "optimizing", of course, but we
> (my employer, NaturalBridge) have done this.  The compiler emits
> exact maps for all objects and for all points where the GC may
> run, and takes into account the liveness information computed
> by data flow analysis for registers and stack slots.  Eliot Moss
> and Ole Agesen have done the same thing, I believe, for Java bytecodes,
> and before that Eliot Moss, Amer Diwan, Rick Hudson, and company
> went some distance towards implementing the same thing for Modula-3.

Sure, and various people have done the same thing in various ways for
SML.  It's even led to some interesting research (I'm thinking here of
region inference).  But it generally doesn't get you all the way
there.  There tends to be some corner in which all you can say is
"this value escapes our local context", and unless you have global
information, which you rarely do, you have to say "and must therefore
be assumed to be live".  Ultimately, even with global information,
deciding whether a value will be used again boils down to the halting
problem.  The programmer may know very well
that a value is not going to be used again, and simply not have
expressed this fact in a way which survives the earlier stages of the
compiler.

> However, it is absolutely true that this depends on whole-world
> collection; a generational collector won't get everything during
> a partial collection.  The Big Problem here is that we have seen
> (as in, just yesterday, fiddling with a benchmark) a factor of 40
> reduction in GC cost going from copy-the-world to generational
> collection, which in turn doubled the overall performance of the
> program.

Right.  For languages which really need garbage collection (and most
of my experience here is with SML, which really, _really_ needs
garbage collection), you need good generational collectors, and then,
even if you have exact collection, finalizers can only be trusted to
run when the user says (or the opportunistic code infers) "OK, I'm
going for coffee, really collect the world now".
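
The point about partial collections can be sketched with a toy
two-generation heap.  This is a model written for this illustration,
in Python, not a description of any real collector; ToyHeap and the
object names are hypothetical.  A minor collection examines only young
objects and assumes every old object is live, so a finalizer on a
promoted object cannot run until a full collection:

```python
class ToyHeap:
    """Toy two-generation heap: a model of the idea, not of a real collector."""

    def __init__(self):
        self.young, self.old = set(), set()
        self.roots = set()
        self.edges = {}      # object name -> set of names it points to
        self.finalized = []  # names whose finalizers have run, in order

    def allocate(self, name, points_to=()):
        self.young.add(name)
        self.edges[name] = set(points_to)

    def _reachable(self, starts, within):
        # Trace from `starts`, following only objects inside `within`.
        seen, stack = set(), [s for s in starts if s in within]
        while stack:
            obj = stack.pop()
            if obj not in seen:
                seen.add(obj)
                stack.extend(t for t in self.edges[obj] if t in within)
        return seen

    def _finalize(self, dead):
        for obj in sorted(dead):
            self.finalized.append(obj)  # stands in for running a finalizer
            del self.edges[obj]

    def minor_collect(self):
        # Every old object is assumed live, so whatever it points to is
        # treated as a root. (A real collector would consult a remembered
        # set rather than scan the whole old generation.)
        starts = set(self.roots)
        for obj in self.old:
            starts |= self.edges[obj]
        live = self._reachable(starts, self.young)
        self._finalize(self.young - live)
        self.old |= live  # survivors are promoted
        self.young = set()

    def full_collect(self):
        everything = self.young | self.old
        live = self._reachable(self.roots, everything)
        self._finalize(everything - live)
        self.old, self.young = live, set()


heap = ToyHeap()
heap.allocate("file_handle")
heap.roots.add("file_handle")
heap.minor_collect()               # file_handle survives and is promoted
heap.roots.discard("file_handle")  # it is now garbage, but it is old garbage
heap.allocate("temp")              # unreferenced young object
heap.minor_collect()
after_minor = list(heap.finalized)
heap.full_collect()
after_full = list(heap.finalized)
```

After the second minor collection only "temp" has been finalized;
"file_handle" is finalized only by the full collection.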

Nick B