[gclist] Precise GC's performance [really finalization protocol]
David Chase
chase@centerline.com
Wed, 06 Mar 96 14:43:38 EST
> From: Mark Tillotson <markt@harlequin.co.uk>
>
> I don't believe so---IF the cycle is purely between the finalizable
> objects, AND if none of them are revived during the "kill methods", then
> the next time round the GC they are no longer on the list of
> finalizable objects, and hence form a detached cycle unreachable from the
> roots.
> Does this clarify things?
Not really. Perhaps I don't understand how you intend it to work.
I imagine two finalization policies:
1. containers (roots) are always finalized before containees (interior
nodes), which are not finalized if the containers revive themselves.
That is, you start finalizing at the various roots of the garbage
graphs, and either
(a) use a write barrier so you can know whether an object has
revived itself, OR
(b) finalize just "one deep" into the graph, and wait to see whether
those objects contained by the garbage roots are new "garbage
roots" after the next collection.
Note that this policy gets stuck in the face of cycles, because
there is no root to the garbage. If you choose one arbitrarily,
you can end up with the same problem faced by choice #2.
2. everything unreachable is finalized, in no particular order.
This deals with cycles, but because the garbage collector's view
of your program (a bunch of linked blobs of memory) may be unrelated
to your view of the program, the finalizer has to be written with
very few assumptions about the state of the components of an object.
What is conceptually "one thing" to a programmer (e.g., a splay
tree, or a hash table) might look like many random pieces of memory
to the GC, as the sketch below illustrates. This seems like a mess
to me.
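To make that concrete, here is a minimal Java sketch (the class
names are mine, purely for illustration) of what policy #2 forces
on you: two objects form a garbage cycle, the collector may run
their finalizers in either order, and so neither finalizer may
assume the other object's invariants still hold.

    class Node {
        Node partner;     // the other half of the cycle
        String name;
        Node(String name) { this.name = name; }
        protected void finalize() throws Throwable {
            // "partner" may already have been finalized; we cannot
            // assume any of its internal invariants still hold.
            System.out.println(name + " finalized");
            super.finalize();
        }
    }

    class CycleDemo {
        public static void main(String[] args) throws Exception {
            Node a = new Node("a");
            Node b = new Node("b");
            a.partner = b;
            b.partner = a;
            a = null;           // the cycle is now unreachable
            b = null;
            System.gc();        // request a collection (only a hint)
            Thread.sleep(100);  // give the finalizer thread a chance to run
        }
    }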
What is especially obnoxious in either case is how you deal with
refinalization. If you explicitly register an object for finalization
(in the style of Modula-3, for instance), then you can detect which
objects are eligible for future finalization. But if finalization/destruction
is instead type-linked (as in Java or C++), and you use weak-pointer-based
concurrent finalization (I think this is a sensible assumption), then,
lacking a write barrier, you have no way of knowing (at a future
garbage collection) whether an object on the finalization queue stayed
garbage (thus needing no further finalization) or was revived and
then became garbage again (thus, it "should" have its finalizer run).
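Here is a sketch of what the explicit-registration style might look
like, written against Java's java.lang.ref reference-queue API (a
later addition to the language); the class and method names are mine,
not any real library's. Because registration is explicit, an object
is re-finalized only if the program registers it again, so the
collector never has to answer the "did it stay garbage?" question.

    import java.lang.ref.PhantomReference;
    import java.lang.ref.Reference;
    import java.lang.ref.ReferenceQueue;
    import java.util.HashMap;
    import java.util.Map;

    // A sketch of explicit, Modula-3-style registration built on a
    // reference queue.  All names here are illustrative, not a real API.
    class FinalizerRegistry {
        private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
        private final Map<Reference<?>, Runnable> cleanups = new HashMap<>();

        // The client says, explicitly, "run this cleanup when obj dies."
        // Whoever revives an object must register it again if it should
        // be finalized again.
        void register(Object obj, Runnable cleanup) {
            cleanups.put(new PhantomReference<Object>(obj, queue), cleanup);
        }

        // Poll after a collection: anything on the queue stayed garbage,
        // so its cleanup (which must not hold a pointer back to the
        // object) can safely run.
        void runPending() {
            Reference<?> ref;
            while ((ref = queue.poll()) != null) {
                Runnable cleanup = cleanups.remove(ref);
                if (cleanup != null) cleanup.run();
            }
        }
    }

The point is that the registry, rather than the collector, holds the
list of objects eligible for finalization, which is exactly the
bookkeeping the Modula-3 style buys you.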
Java deals with this, at least in the versions of the language that
I've read about, by declaring that finalizers are run once, full
stop. I checked to see if this is really what was intended, and
it really is. Obviously, this has implications for what code you
put in a finalizer, though I don't think they made this point in
the version of the language spec that I read. (This, by the way,
is what I regard as the major problem for use of GC with legacy C++
code -- what do you do with the destructors? How do they get run
in the first place, and how do you deal with the re-running problem
in cases like this?)
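To illustrate the run-once rule, here is a small Java sketch (the
class name is mine): the object revives itself from its finalizer,
later becomes garbage again, and its finalizer is not run a second
time. (System.gc() is only a hint, so the exact timing of the output
is not guaranteed.)

    class Phoenix {
        static Phoenix saved;       // a GC root we can revive ourselves into
        protected void finalize() throws Throwable {
            System.out.println("finalize called");
            saved = this;           // revival: reachable from a root again
        }
        public static void main(String[] args) throws Exception {
            new Phoenix();          // immediately garbage
            System.gc();
            Thread.sleep(100);      // prints "finalize called" once
            saved = null;           // garbage again...
            System.gc();
            Thread.sleep(100);      // ...but the finalizer does not run again
        }
    }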
David Chase