[gclist] Guardians

Carl Bruggeman bruggema@ranger.uta.edu
Wed, 16 Apr 1997 13:28:42 -0500


Hans Boehm wrote:
> Finalization is needed rarely enough that any of these
> mechanism are sufficient in practice.  Further, they all prevent
> dangling pointers for pure ly garbage collected code, which I think
> is the next most important criterion in evaluating finalization
> mechanisms.

Agreed.

> However, beyond that, it still seems to me that topologically ordered
> finalization is a clear win over the alternatives.  The reason I'm pushing on
> the point, in spite of the fact that I don't believe it's among the most
> important language design decisions, is that I was hoping it's one of those
> rare opportunities to prune the design space.

Partially agreed.  Opportunities to prune the design space should be
pursued with vigor because they are often rare and, as I mentioned in
my previous message, additional restrictions usually mean additional
opportunities to help out the programmer.  I believe that we differ in
opinion primarily about language features that are so restrictive that
they require "escape" mechanism in exceptional circumstances.  I tend
to favor less restrictive mechanisms when there are realistic programs
that require such escape mechanisms (as there is for finalization
order).

> Timely finalization is elusive in either scheme.  It's hard to
> define what you mean by timely in a world in which that object in
> question may have been promoted to a generation that will be
> collected next week.

Here we fundamentally disagree.  Due to this discussion, I have begun
to recall some our design decisions from '93, and I now remember that
we briefly considered what we are now calling "topological" finalization. 
It was precisely due to the issue of timely finalization that we
dismissed the idea of topological finalization -- and it was
precisely because we have a production quality generation-based
collector and the bad interaction between topological finalization and
generation-based collectors.

Topological finalization of a chain of N dependant objects requires N
collections in a simple semi-space copying collector to finalize the
entire chain of objects, while, in the same system, guardians requires
a single collection.  In a generation-based collector N is the _lower
bound_ on the number of collections required for topological
finalization.  Since each collection potentially promotes live objects
to older generations, and objects in older generations are, by
definition collected less frequently, topological finalization
guarantees that, the farther down the chain an object is, the older
the generation it will end up in, and the longer it will take to be
collected, because, again by definition, a generation-based collector
will do several younger generation collections for each of the older
generation collections.  For a chain or cycle of guardians, in which
the oldest generation of any of the objects is G, finalization will
take at most 1 collection of generation G.
 
> People don't build lists of a billion finalizable objects.  Linear
> lists of that length are rarely a reasonable dat a structure.  

I think "billion" was never meant to be considered seriously.  In a
system like ours with 4 generations, however, it would only take a
chain of length 4 with topological finalization (even if all the
objects were initially in the youngest generation) to delay the
finalization of the last object in the chain until a generation 4
collection (which may indeed be once a week).  In this example using
guardians, however, the entire chain would be finalized in the first
collection.  If the oldest object in the finalization cycle is in
generation G, it is in keeping with the generational assumption (that
newly allocated objects tend to die young and that objects pointed at
by older objects have life-spans proportional to that of the older
object) that the cycle would not be collected and finalized until
generation G.  In essence, topological finalization violates the
observed property of most programs that generational collectors are
based on, by forcibly promoting objects in finalization chains to older
generations in order to get topological finalization.  Thus, we
concluded, that generation-based collectors and topological
finalization were fundamentally incompatible. 

The next obvious thing to consider is:  how much is timely
finalization worth?  We decided that unless a programmer has some
guarantee about when finalization occurs, then the finalization
feature is essentially useless.  Guardians give the same guarantee
that the programmer has about when memory is freed by the collector
and guardians are affected by the same mechanisms that the programmer
has to force collections of particular generations at particular
times.  With topological finalization the guarantees are data
dependent (i.e.  they depend on how long the finalization chains are,
which can vary greatly from run-to-run), and, as such, are very
difficult for the programmer to reason about in general.

There are a several other things from your message that I will comment
on later, but I already used up my quota of time for mailing lists for
this week at some point last week...

Carl
--------------------------------------------------------------------------------
Carl Bruggeman -- bruggema@cse.uta.edu -- Phone: (817) 272-3600 Fax: 272-3784
Computer Science and Engineering Department -- University of Texas at Arlington