[gclist] Finalization and the insane postman bug.

Boehm, Hans hans_boehm@hp.com
Thu, 4 Oct 2001 16:48:37 -0700


> -----Original Message-----
> From: rog@vitanuova.com [mailto:rog@vitanuova.com]
> 
> this discussion makes me quite glad that we haven't got generalised
> destructors in Limbo; it uses a combination of refcounted and
> mark&sweep GC (if garbage isn't cyclic then it goes away immediately),
> and destructors are available to the Inferno kernel, which uses them
> in a limited way to free some kernel-level resources (e.g.  open
> files, windows).  these destructors run synchronously, but as they're
> a well defined finite set, we don't come across the same problems that
> Hans talks about.

The question in my mind is not so much whether you've come across the
problem, but whether you can write down a set of programming rules that
ensure correctness.

It seems to me that with this approach you effectively need a rule that
ensures that if a finalizer for type T needs lock L (or touches a data
structure that is logically protected by L), then no thread may execute a
pointer assignment that may possibly result in a T being dropped while
holding L.  This sounds pretty much intractable to me.  In general, it's
very difficult to predict what kinds of objects may become inaccessible as a
result of a pointer assignment.  And that's a property of the whole program,
not one respects module boundaries.  I need to know what the types of all
objects are that could possibly be uniquely referenced by a particular
object.

You could try to enforce the rule that you may not assign pointers to
objects transitively referring to unknown objects while holding a lock.  But
I would be very suprised if you could preserve that property in a large
system.
> 
> do we really need general finalisers?
> 
In my opinion, yes.  You clearly need them in a system that includes a
legacy library requiring explicit deallocation/object destruction.  Objects
referenced by such a library may in general be embedded inside an arbitray
GC-managed data structure, and you somehow need to manage resources inside
the legacy library.

Another good example is the "rope" data type that's included in the SGI and
g++ versions of the C++ STL.  This was based on the identically named data
type in the Xerox Cedar environment.  It represents (mostly immutable)
character strings as trees, making many operations on long strings much more
efficient.  It's very useful to support string pieces whose characters are
computed lazily (but consistently!) by a piece of arbitrary code.  Typically
that code might lazily read characters from a file.  When that particular
substring is no longer relevant, any state associated with that code, e.g.
the file descriptor, should be suitably destroyed.  This is one way to, for
example, very cleanly implement a text editor that can efficiently edit
files much larger than physical memory.

This works correctly with finalizers.  (It works especially well in the
common case in which a small number of descriptors are embedded in ropes.
If there are many such embedded descriptors, then the file open procedure
needs to know how to invoke the collector and finalizers if it's out of
descriptors.  But it should do that anyway.)  It potentially runs into
precisely the above locking problems if you try to destroy the state
synchronously.  In general it's very hard for the client programmer to know
when the state will no longer be needed, since it may depend on exactly
which pieces of the original string are still in use at a particular point.
(I can concatenate the file with a bunch of string constants, and then keep
only a small substring of the result, which may or may not include a piece
of the file, and the implementation may or may not chose to replace a
sufficiently short file section by an explicitly stored string in memory.)

I think the rope case is also illustrative in that it usually works with
deterministic/synchronous destruction.  That's what the C++ implementation
does.  But that's only true because in common cases, like closing a file,
the destructor requires no use of user visible locks.  But that means it's
not general, and the system can't be expanded in certain ways without a
fundamental redesign.  That's not a nice property.

Hans