[gclist] Finalization and object orientation.

Hans Boehm boehm@hoh.mti.sgi.com
Sat, 29 Mar 1997 09:59:45 -0800

On Mar 28,  6:30pm, Eliot & Linda wrote:

[I agree with much of what was said.  I've eliminated most of that from
this message.]

> What I think we're converging on is that if a language hopes to provide
> a reliable finalization facility then it must define variable liveness.

I agree.  I suspect we still disagree a bit about the necessity for this form
of "reliability".  In my experience the "unreliable" version of the facility is
still quite useful, for example to clean up the occasional dropped file handle.
The typical UNIX system allows hundreds (often thousands if you go out of your
way to increase the per process limit) of open files.  (Linux says 256, Irix
says 200, Solaris says 64 by default.  Both Irix and Solaris let me say "limit
descriptors 1000" to raise it to 1000.)  For most programs that means that
leaking a descriptor or 2 is not a disaster.  If you are stuck with a small
number of file handles, you probably want to virtualize them anyway, putting
yourself in the same position.

I don't know of any code that doesn't know whether a file handle is associated
with a window or not.  If it is associated with a window, odds are the program
"knows" when the window should be closed, and you only want to use finalization
to close the window in an error recovery path.  If it accidentally stays open,
that's probably the least of your worries.

Cedar uses this notion of "unreliable" finalization.  The philosophy was to
close files when convenient, and let the finalizer do it when it was
inconvenient (e.g. on an exception path, or of the file handle was embedded in
a complex data structure).  That seemed to work reasonably well.  (I think I
remember one problem triggered by the fact that open did not ever force a
garbage collection.  It should have.  I don't remember any others.)

> I wrote:
> > If we pick a definition, we can then ask about the cost of sticking to it
> > and guaranteeing that memory usage will never exceed the total reachable
> > storage size by more than a constant, and objects become finalizable as
> > expected.  I would guess the cost varies a lot, depending on the
> >  My guess is that the cost of clearing variables when a scope is exited is
> > huge, but significant.
> In a traditional stack implementation it may be very cheap, since the
> stack trace
> is bounded by the base of the stack and the stack pointer.  But you're
> right
> that there are hidden costs because one may have to nil memory on
> function entry.
> Tricky.  And in a concurrent collector I can see that one is faced with
> serious
> difficulties.
This problem could also be solved with static descriptors of stack frames,
at some space cost.  The space cost seems to be mainly an issue with a
concurrent collector that requires a description for every possible PC value,
instead of just at call sites.

Keeping dead variables around is much more serious.  Running out of registers
is less of a problem than the fact that you have to save those registers across
procedure calls.

There is also code that runs out of 32 registers, especially if your compiler
unrolls loops to be able to schedule aggressively.  My guess is also that there
are many (very) small functions on Intel hardware that would normally keep
locals in registers.  Keeping dead pointers around would force stores of the
pointers.  I don't have measurements, however.

> > My impression from Andrew Appel's group is that maintaining asymptotic
> > complexity (with the CPS definition of reachability) is neither free nor
> > intolerably expensive.  It does inhibit some optimizations, and make it
> > harder to get the compiler right.
> That makes sence to me.  Any references?

I think his book says a little about that.  My information comes mainly from
Zhong Shao.

The reference I promised earlier for deducing liveness information from ML
types was:

Pascal Fradet, "Collecting More Garbage", L&FP 94, pp. 24 - 33.


Hans-Juergen Boehm