[gclist] GC debug (was: What to say about GC)

William D Clinger will@ccs.neu.edu
Thu, 01 Aug 1996 14:08:24 +0000


Here are a few of the things I have learned about debugging
compilers and garbage collectors for Scheme-like languages.

 *  It is easier to debug a precise copying collector than a
    conservative or non-copying collector, because catastrophic
    bugs are easier to detect.

 *  Reliability is far more important than efficiency.

 *  The garbage collector's invariants must be simple and
    explicit.

 *  With a precise copying collector, most of the apparent gc
    bugs are really compiler bugs, in which the compiler has
    violated some gc invariant.

 *  Most of the apparent gc bugs that are not compiler bugs are
    likely to be bugs in low-level runtime code, not bugs in the
    garbage collector itself.

 *  The ability to check all gc invariants at every allocation
    or side effect is useful, even if it is a million times as
    slow as not checking.

 *  Most catastrophic violations of a gc invariant can be
    detected by running a full gc 2 or 3 times in succession.
    The ability to trigger this kind of multiple gc after the
    Nth allocation or gc is very useful; with repeatable bugs
    you can use binary search to isolate the source of the bug.

 *  The ability to find all live objects of some specific type
    is useful for detecting storage leaks as well as for debugging.

 *  Many storage leaks can be detected by using the system to
    rebuild itself dynamically.  If you're careful to initialize
    global counters appropriately, then the bootstrapping process
    should stabilize at a bit-level fixed point.

 *  Registers and stack frames cause storage leaks unless you're
    careful.  To find such leaks, it helps to have a compiler
    switch that will generate code to clear all dead registers
    and stack slots at certain times, such as procedure calls,
    returns, and allocations of a stack frame.

 *  The ability to run the machine code for a compiled program
    with several different kinds of garbage collectors or write
    barriers is extremely useful.  Sometimes you can get the
    effect of different kinds of garbage collectors by changing
    the parameters of a single collector.  For example, a
    generational collector that uses stop-and-copy for the
    youngest generations and mark-sweep for the oldest generations
    can be made to simulate a pure stop-and-copy or mark-sweep
    collector by setting the size of some generations to zero.

 *  Fencepost errors at heap boundaries are common, especially
    if you play tricks to avoid testing for boundary cases
    during allocation.

 *  GC statistics are more useful for tuning than for debugging.
    Examples: the number of objects allocated; the number of
    objects promoted; the number of write barrier transactions.

Will Clinger