GC Method (was Questions for Discussion)
Scott L. Burson
gyro@zeta-soft.com
Wed, 28 May 1997 12:14:00 -0700 (PDT)
From: "Bill House" <bhouse@dazsi.com>
Date: Wed, 28 May 1997 07:42:15 -0700
Well, I've been reading up on GC and Baker's Treadmill with incremental-update
and write-barrier synchronization seems promising. There's a paper on this
flavor (Wilson and Johnstone 1993). Performance problems with existing
implementations may be due to the use of smart pointers, according to Jones and
Lins (Garbage Collection, Wiley, 1997). I'm thinking that if you don't use
smart pointers, this might go away.
OTOH, Nettles and O'Toole's replicating collector might also be good. FWIW,
Jones and Lins claim that Nettles and O'Toole's replicating collector (NORC)
and Baker's Treadmill (BT) are the most promising candidates for real-time GC
among software-only GC algorithms so far. There are worst-case timing for NORC
at 50 microsecs per atomic action. BT implementations surveyed don't yet match
this, but again they may be optimizable by avoiding smart pointers.
Comments?
I would expect the Brooks GC to be superior to either of those. (Reference:
Brooks, Rodney A. Trading Data Space for Reduced Time and Code Space in
Real-Time Garbage Collection on Stock Hardware. In Proc. 1984 ACM Symposium
on Lisp and Functional Programming, Austin, TX, pp. 256 - 262.) It is a
variant of Henry's original incremental copying collector.
In the original version of the Brooks GC, an additional pointer is allocated
at the head of each object; it initially points to the following word, where
the object's body resides. All accesses to the object are indirect through
this pointer. When the object is copied from oldspace to newspace, the
pointer is updated to point to the new copy. Thus the indirection, taken on
every access, serves as the read barrier. (The write barrier, which must
ensure that an object is in newspace if a pointer to it is written into
newspace, is more complicated, but since writes are less common than reads,
the cost is not too high.) A couple of other things, notably EQ, get a little
more complicated, but basically I think this is a very viable approach.
In the variant I am using in ZetaBase, all the indirection pointers are
gathered together into a "handle table" which is separate from the areas where
the object bodies are stored. This allows the handle to remain the same for
the life of the containing process, and therefore makes EQ fast again, but
imposes some additional complexity, notably the need to GC the handles
themselves. It also puts an additional load on main memory and the cache that
Brooks' original version does not (in his version, since the body of the
object normally lies just after the indirection pointer, the two will usually
be on the same page and in the same cache line).
I don't know how well ZetaBase does on the 50usec time scale; probably not
too well, as I was aiming at more like 500usec, and there are probably
occasional atomic actions in the millisecond range. I was definitely
concerned with "soft" real-time -- what would be perceptible to a human user
-- rather than the "hard" real-time world of interrupt handling and the like.
-- Scott