[2] [gclist] finalization

Barry Hayes bhayes@parcplace.com
Monday, 19 May 1997 11:28:51


Well. First off, there are at least two different animals going under the name
of "finalization." We all know this, but it's part of the gory detail.

In "container-based" finalization, there are special kinds of containers, like
weak-arrays, or special kinds of pointers, like weak-pointers. An object is
finalized when all pointers to it EXCEPT those pointers that are special [by
dint of being part of a special container, or otherwise marked as a special
pointer] have gone away. 

[This is a re-hash of what David Chase explained, and I use some personal
buzz-words.] In container-based finalization, a collector would first trace all
the pointers, except for the weak pointers, and keep track of the weak pointers
it has encountered in some kind of delay-queue, an internal structure to the
GC. After all the "strongly reachable" objects have been traced, the
delay-queue gets scanned. Any objects that are reachable from the delay-queue's
pointers but haven't yet been marked are "almost collectable". 

Semantics of finalization branch off a lot here. Typically, in container-based
finalization, the containers get notified about an object being "almost
collectable." Some systems zap the weak pointers to "almost collectable"
objects, to make sure that the objects aren't "resurrected". Others trace
through these objects and keep them alive. If some "almost collectable" objects
are reachable from others, some systems will only finalize those that aren't
pointed to by others. The forms of the notification are varied, and so on.
David's certainly right about "no consensus." [And what if you encounter more
weak containers in that second part of the trace? Notify them or not?]

In "instance-based" finalization, there's something special about the object.
Java finalization is like this. Think of it as a bit in the header of the
object. The collector does a trace from the roots through all the reachable
objects. At the end of the trace, it has to find all finalizable objects that
haven't been traced. It could scan memory, but typically, it just maintains a
structure private to the GC that is a collection of all the objects that are
"enabled for finalization." [In Java, this means that they have a  finalization
routine other than that inherited from Object.] Rather than scanning all of
memory, the GC just scans this list for objects not yet seen, and send them a message.

Note that instance-based finalization is, in essence, implemented with a hidden
container-based finalization scheme, but the sole weak container is a GC
internal structure. And some instance-based scheme also have some kind of weak
pointers, just to make life difficult.

What it comes down to is that it's very hard to tell you what the gory details
are, except on a case-by-case business. It's rather like you had asked "how do
I implement multi-threading". Well, it ends up to depend a lot on many other
decisions in your system.