[gclist] What does a garbage collector do about

Boehm, Hans hans_boehm@hp.com
Wed, 31 Jul 2002 17:22:40 -0700


I've arrived at the conclusion that C++-style destruction and finalization are mostly
orthogonal facilities.  We're probably identifying them too much in this discussion.

The file descriptor example has the problem that depending on the context, either facility may be useful.

If you understand precisely when an object is going to be destroyed, clearly C++-style destruction has only advantages, and it makes sense to rely on context in the destructor.
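A minimal sketch of that case, assuming a hypothetical wrapper class named ScopedFile (not from any library) around a standard C stream.  The open_count member exists only so the effect can be observed; the point is that the close happens at a statically known place, the end of the enclosing scope.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical wrapper (illustration only): owns a std::FILE* and closes it
// in the destructor, i.e. synchronously and at a known point in the program.
class ScopedFile {
public:
    explicit ScopedFile(std::FILE* f) : file_(f) {
        if (file_) ++open_count;
    }
    // Non-copyable, so exactly one object is responsible for the close.
    ScopedFile(const ScopedFile&) = delete;
    ScopedFile& operator=(const ScopedFile&) = delete;

    ~ScopedFile() {
        if (file_) {
            std::fclose(file_);
            --open_count;
        }
    }

    std::FILE* get() const { return file_; }

    static int open_count;  // number of live, open handles (for the demo)

private:
    std::FILE* file_;
};

int ScopedFile::open_count = 0;

// Opens a temporary file, writes to it, and returns the number of handles
// still open after the scope has ended.
int handles_after_scope() {
    {
        ScopedFile tmp(std::tmpfile());  // may be null if no temp file is available
        if (tmp.get()) std::fputs("scratch data\n", tmp.get());
    }  // tmp is destroyed here; fclose has already run when we continue
    return ScopedFile::open_count;
}
```

Because the lifetime is lexically visible, the destructor can rely on whatever context is in force at the closing brace.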

But we use garbage collectors because it's often hard to determine when an object can be reclaimed and/or destroyed, and it's useful to abstract away those issues.  In that context, clean-up actions must be run asynchronously, so that they can safely acquire the locks needed to ensure that they don't interfere with other code.  (The fact that tracing collectors also take a while to identify garbage is nearly irrelevant.)  That means such clean-up actions are more restricted in what they can do.  But they're still useful.  For example, they can usually be used to clean up temporary files, remove state associated with an object that happens to be stored in a global table, etc.
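The locking argument can be sketched roughly as follows.  Everything here is an assumption made for illustration: FinalizerQueue, simulate_collection, table_lock and global_table are hypothetical names, and the "collector" is simulated by an ordinary function call.  The collector only enqueues a clean-up action; the action runs later, from a point where the caller holds no locks, so it can take the table lock itself.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Stand-in for "state associated with an object that happens to be stored
// in a global table", protected by an ordinary (non-recursive) lock.
std::mutex table_lock;
std::map<int, std::string> global_table;

// Hypothetical finalizer queue: the collector enqueues, somebody else runs.
class FinalizerQueue {
public:
    void enqueue(std::function<void()> action) {
        std::lock_guard<std::mutex> guard(lock_);
        pending_.push_back(std::move(action));
    }

    // Must be called from a point where the caller holds no locks that the
    // clean-up actions might need.  Returns the number of actions run.
    std::size_t run_pending() {
        std::vector<std::function<void()>> batch;
        {
            std::lock_guard<std::mutex> guard(lock_);
            batch.swap(pending_);
        }
        for (auto& action : batch) action();  // queue lock not held here
        return batch.size();
    }

private:
    std::mutex lock_;
    std::vector<std::function<void()>> pending_;
};

// Simulates the collector discovering that the object with key `id` is
// unreachable.  This may happen while other code holds table_lock, so the
// table entry is not touched here; the removal is only scheduled.
void simulate_collection(FinalizerQueue& queue, int id) {
    queue.enqueue([id] {
        std::lock_guard<std::mutex> guard(table_lock);
        global_table.erase(id);
    });
}
```

Had the removal run synchronously inside simulate_collection while the caller held table_lock, it would have tried to lock the same std::mutex a second time, which is undefined behaviour and in practice usually a deadlock.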

A good example to illustrate that synchronous destruction is not sufficient is the "rope" in the Xerox Cedar environment.  Ropes are used as the standard string representation.  But a "rope" may contain an embedded file as part of its representation.  (Thus ropes do not necessarily have to fit in main memory.)  This means that any data structure containing a string can potentially have an embedded file descriptor.  If you had to understand exactly when such a file needed to be closed, you'd have to basically redo the garbage collector's work.  I have no trouble believing that this doesn't interact well with close routines that need to renew a file lease.  It also doesn't work well if the number of file descriptors embedded in ropes is large compared to the file descriptor limit.  In those cases, you shouldn't use this facility.  But there are certainly applications for which it is useful.

We all agree that file descriptors should be explicitly closed (or closed by a C++-style destructor for an object with a known lifetime) whenever possible.  Whether or not it's safe to rely on finalizers for the hard cases depends on several things:

- The file descriptor limit.

- Whether the file open routine tries to run the GC and finalizers.

- The number of file descriptors that might have to be reclaimed this way.

- Whether the close routine needs more context, etc.
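
The first three points can be illustrated with a toy model.  This is only a sketch under stated assumptions: DescriptorPool and abandon are hypothetical, the descriptor limit is simulated by a counter, and collect_and_finalize merely stands in for "run the GC and finalizers".  It shows an open routine that, on hitting the limit, runs pending finalizers once and retries.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Toy descriptor table with a hard limit (assumption for this sketch).
struct DescriptorPool {
    explicit DescriptorPool(int max_open) : limit(max_open), in_use(0) {}

    int limit;
    int in_use;
    // Clean-up actions for descriptors whose owners are already unreachable
    // but whose finalizers have not run yet.
    std::vector<std::function<void()>> pending_finalizers;

    bool try_open() {
        if (in_use >= limit) return false;
        ++in_use;
        return true;
    }

    void close() { --in_use; }

    // Stand-in for "run the GC and finalizers".
    void collect_and_finalize() {
        std::vector<std::function<void()>> batch;
        batch.swap(pending_finalizers);
        for (auto& finalizer : batch) finalizer();
    }

    // Open routine that falls back on finalization when the limit is hit.
    bool open_with_retry() {
        if (try_open()) return true;
        collect_and_finalize();
        return try_open();
    }
};

// Marks one open descriptor as owned by an object that has become garbage:
// its finalizer, when eventually run, closes the descriptor.
void abandon(DescriptorPool& pool) {
    pool.pending_finalizers.push_back([&pool] { pool.close(); });
}
```

With a limit of 2, two opens succeed and a third fails as long as nothing is reclaimable; once one descriptor has been abandoned, the retry path frees its slot and the open succeeds.  If far more descriptors are waiting on finalizers than the limit allows, or the open routine never takes the retry path, this approach stops being safe.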

Hans