[gclist] The old container problem

David Chase chase@world.std.com
Tue, 22 Sep 1998 11:55:00 -0400

>At 1998-09-21 16:26:51 UT, David Gadbois writes:
>> 2. Allocate the containers in a special "pre-natal" area that is
>>    younger than the youngest normal generation.  Collect this area by
>>    doing a straight non-generational GC at fairly infrequent
>>    intervals.  The downsides to this approach are that it requires
>>    what amounts to an extra, special-purpose collector and that it
>>    complicates already fragile storage system invariants.

At 01:32 PM 9/22/98 +0100, Nick Barnes wrote:
>Something like this is a good technique if your collector is
>sufficiently flexible to make it possible.  You shouldn't have to have
>an extra collector; in particular note that collection policy (when
>and how often a given part of the heap is collected), promotion policy
>(when and whether parts of the heap are promoted to older
>generations), and maintaining graph invariants are theoretically
>independent of each other, and a good GC will offer the client total
>control over the first two, while having a number of approaches to the

>This isn't rocket science; I suggest that all GC implementors consider
>how best to achieve this sort of flexibility.  Hard-wiring collection
>and promotion policies is as unnecessary and inflexible as hard-wiring
>the number of generations or the maximum heap size.

My knee-jerk reaction to this is "oh dear, more knobs".  I've
become very suspicious of knobs ever since working on compilers
at Sun, where several more experienced people pointed out
that adding one compiler option doubles the testing load, and
more-or-less proved their point by provoking numerous bugs
with randomly chosen compiler flags.  It's a safe bet that
"if it isn't tested, it doesn't work", and adding knobs adds
lots more stuff to test (10 on/off knobs multiply the bug
search space by 2^10 = 1024, roughly 1000x).
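
To make that arithmetic concrete, here is a minimal Python sketch.
It assumes every knob is an independent on/off switch, and the knob
names are hypothetical placeholders, not options of any real
collector or compiler.

```python
from itertools import product


def config_count(num_knobs, settings_per_knob=2):
    """Number of distinct configurations for independent knobs."""
    return settings_per_knob ** num_knobs


def all_configs(knob_names):
    """Enumerate every on/off combination of the given knobs."""
    return [
        dict(zip(knob_names, values))
        for values in product([False, True], repeat=len(knob_names))
    ]


if __name__ == "__main__":
    # Hypothetical knob names, used only for illustration.
    knobs = [f"knob{i}" for i in range(10)]
    print(config_count(1))          # 2: one knob doubles the load
    print(config_count(10))         # 1024: ten knobs, roughly 1000x
    print(len(all_configs(knobs)))  # 1024: same count by enumeration
```

Knobs with more than two settings (a generation count, say) grow
the space faster still, which is why sampling random flag
combinations is often the only affordable way to test it.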

I've also observed that, given knobs, the best you can
hope for is that users will ignore them, because if
they don't, odds are good that they'll be set incorrectly.
This leads into the usual experts-versus-amateurs debate
that comes up in the contexts of memory management,
floating point, I/O buffering, C++ versus simpler
languages, you name it.  In general, though, the non-experts
outnumber the experts by some large ratio, and economic
forces (especially in the software world) favor the
larger markets.

What, by the way, are the performance gains obtained by
this sort of tailored garbage collection?

Sorry to be such a curmudgeon, but I've been burnt by
this before.  I realize that I am taking a bit of a
defeatist attitude towards bugs: I assume that they
exist, and therefore take defensive measures like
reducing the size of the bug search space for testing.
It would be nicer if we actually proved all this stuff
correct instead, but in practice, people write
buggy code, and almost nobody is willing to pay for
proven-correct code.  (And it sure would be nice if
we could change that by reducing the costs of proving
code correct, or increasing the perceived costs of
buggy code.)

David Chase
NaturalBridge LLC