[gclist] gc interface

boehm.PARC@xerox.com
Sat, 23 Mar 1996 11:28:10 PST


Re: formal methods for meta GC:

It's clear that formal methods would need to be part of the answer.  But it
still seems to me you're embarking on a very ambitious program.  I don't know
of any existing performance-competitive GC that has been formally verified.
The high-level algorithms often have at least an informal correctness proof,
and that should certainly be the case.  But there tends to be a big gap between
those proofs and the implementation.  And the implementation often lives on an
OS interface that has a poorly-written, very informal specification.  And the
implementation is certainly not formally verified.

To make life harder, a lot of the facts that are relevant for picking a GC
(e.g. object lifetimes, data structure shapes) are very global in nature, and
tend to be hard to infer automatically.  That's especially true if you don't
have complete information about the program, due to separate compilation,
third-party libraries, etc.

Re: running out of memory:

This is probably not an insoluble problem, but it makes life harder and the
collector more complex.  The difficulty is all in the collector implementations:
you will most likely discover that you are out of memory during a garbage
collection, either when you run out of room for the mark stack, or out of room
in to-space for a copying collector.  This normally leaves the program in an
inconsistent state in which it's very hard to do things like deliver an
exception.  There are two ways to avoid this problem:

1) Reserve worst-case memory requirements in advance.  This tends to be
unpleasant because it's overly space inefficient.  For example, I believe the
SRC M3 run-time system very consciously doesn't do this.  Neither do we.  (It's
of course the right solution for safety-critical real-time systems, etc.  But
fortunately not all of us write those.)

2) Add a fall-back algorithm that uses less space, but usually more time.
Several mark-sweep collectors (including ours) do this by reverting to an
algorithm that scans the heap repeatedly as necessary (a rough sketch follows
below).  But this imposes some interesting constraints on any collector that
has to play along with such a scheme.
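
To make the fall-back a little more concrete, here is a minimal C sketch of a
bounded-mark-stack marker that, on overflow, rescans the heap for marked
objects whose children may not have been pushed yet.  All of the names here
(object, heap_objects, mark_from_roots, and so on) and the object layout are
made up for illustration; this is not our collector's code, just the general
idea of trading repeated heap scans for a fixed-size mark stack.

/* Hypothetical sketch only: names, object layout, and the heap_objects
 * table are invented for illustration; this is not any real collector's
 * interface. */

#include <stddef.h>

#define MARK_STACK_SIZE 4096          /* deliberately bounded */

typedef struct object {
    int marked;
    size_t n_children;
    struct object **children;
} object;

static object *mark_stack[MARK_STACK_SIZE];
static size_t mark_sp;
static int mark_stack_overflowed;

/* Assumed: some way to enumerate every object in the heap. */
extern object **heap_objects;
extern size_t n_heap_objects;

static void mark_and_push(object *obj)
{
    if (obj == NULL || obj->marked)
        return;
    obj->marked = 1;
    if (mark_sp == MARK_STACK_SIZE) {
        /* No room: the object stays marked, but its children won't be
         * visited from the stack.  Remember to rescan later. */
        mark_stack_overflowed = 1;
        return;
    }
    mark_stack[mark_sp++] = obj;
}

static void drain_mark_stack(void)
{
    while (mark_sp > 0) {
        object *obj = mark_stack[--mark_sp];
        for (size_t i = 0; i < obj->n_children; i++)
            mark_and_push(obj->children[i]);
    }
}

/* Mark everything reachable from the roots.  If the stack overflows,
 * fall back to repeated whole-heap scans: each pass pushes the children
 * of every marked object, so anything missed because of overflow is
 * eventually reached.  Slower, but needs no additional space. */
void mark_from_roots(object **roots, size_t n_roots)
{
    for (size_t i = 0; i < n_roots; i++)
        mark_and_push(roots[i]);
    drain_mark_stack();

    while (mark_stack_overflowed) {
        mark_stack_overflowed = 0;
        for (size_t i = 0; i < n_heap_objects; i++) {
            if (!heap_objects[i]->marked)
                continue;
            for (size_t j = 0; j < heap_objects[i]->n_children; j++)
                mark_and_push(heap_objects[i]->children[j]);
            drain_mark_stack();
        }
    }
}

Even this toy shows where the constraints come from: the object layout has to
let the collector enumerate children and revisit already-marked objects during
a rescan, which not every cooperating collector supports cheaply.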

I'm not arguing that this is a bad research direction.  I'm arguing that it's
not obvious whether or not it will succeed, in the sense of generating
practically useful results.  And getting it to succeed requires insights I
don't yet have.  That probably makes it a good research direction, but a bad
product idea.

Hans
Standard disclaimer ...