[gclist] GC debug (was: What to say about GC)
William D Clinger
will@ccs.neu.edu
Thu, 01 Aug 1996 14:08:24 +0000
Here are a few of the things I have learned about debugging
compilers and garbage collectors for Scheme-like languages.
* It is easier to debug a precise copying collector than a
conservative or non-copying collector, because catastrophic
bugs are easier to detect.
* Reliability is far more important than efficiency.
* The garbage collector's invariants must be simple and
explicit.
* With a precise copying collector, most of the apparent gc
bugs are really compiler bugs, in which the compiler has
violated some gc invariant.
* Most of the apparent gc bugs that are not compiler bugs are
likely to be bugs in low-level runtime code, not bugs in the
garbage collector itself.
* The ability to check all gc invariants at every allocation
or side effect is useful, even if it is a million times as
slow as not checking.
* Most catastrophic violations of a gc invariant can be
detected by running a full gc 2 or 3 times in succession.
The ability to trigger this kind of multiple gc after the
Nth allocation or gc is very useful; with repeatable bugs
you can use binary search to isolate the source of the bug.
* The ability to find all live objects of some specific type
is useful for detecting storage leaks as well as for debugging.
* Many storage leaks can be detected by using the system to
rebuild itself dynamically. If you're careful to initialize
global counters appropriately, then the bootstrapping process
should stabilize at a bit-level fixed point.
* Registers and stack frames cause storage leaks unless you're
careful. To find such leaks, it helps to have a compiler
switch that will generate code to clear all dead registers
and stack slots at certain times, such as procedure calls,
returns, and allocations of a stack frame.
* The ability to run the machine code for a compiled program
with several different kinds of garbage collectors or write
barriers is extremely useful. Sometimes you can get the
effect of different kinds of garbage collectors by changing
the parameters of a single collector. For example, a
generational collector that uses stop-and-copy for the
youngest generations and mark-sweep for the oldest generations
can be made to simulate a pure stop-and-copy or mark-sweep
collector by setting the size of some generations to zero.
* Fencepost errors at heap boundaries are common, especially
if you play tricks to avoid testing for boundary cases
during allocation.
* GC statistics are more useful for tuning than for debugging.
Examples: the number of objects allocated; the number of
objects promoted; the number of write barrier transactions.
Will Clinger