[gclist] conservative vs. "precise" gc

boehm.PARC@xerox.com
Mon, 26 Feb 1996 09:27:17 PST


Tyson wrote about conservatively scanning the hottest stack frame:
"This form of conservatism is quite attractive, but I'm not convinced it
buys you much by itself. Certainly, you don't have to conservatively
scan the rest of the root set. However, you won't know the types of
anything in the top activation record, so it could be that you have
to scan the heap quite conservatively (if you find a pointer into the
heap in the top activation record). It's possible that the next 200
words on the heap could all be part of that object, or even the next
200000 words.

In a GC scheme with its own allocator, you would know this information, so
that's fine."

My assumption was indeed that the collector has access to at least object size
information, and in this context probably descriptors for heap objects as well.
This is probably the most common situation in implementations with a "precise"
collector, but not the only possibility.  Descriptors aren't strictly
necessary, since, as Tyson points out, you can conservatively scan heap objects
referenced only from the last stack frame, of which there probably won't be any
in the typical case.

I'm not a fan of conservative collectors that are built directly on malloc
without another allocation layer.  There are many performance problems with
such an approach:

1. You generally lose the performance advantage of such a collector over
malloc/free: Deallocation can be much cheaper since it's not done one object at
a time and garbage collections are a great time to do deferred object
coalescing.  That's why our collector often beats malloc/free for programs with
small average object size (especially on systems with slow malloc/free
implementations).

2. If you don't keep track of object sizes, interior pointers can't be handled
correctly.  Even if you do keep track of sizes, I don't know how to handle
interior pointers quickly without controlling the allocator.  I believe that
some interior pointers should be recognized, if C/C++ interoperability is an
issue.  Even if it isn't, it leads to fewer optimizer constraints.

3. You lose page level locality, since it's hard to keep free lists
sufficiently sorted.  It's also probably harder to implement the sweep phase so
that it doesn't generate page faults and lots of cache misses.

4. It's harder to allocate around known false references.

As Tyson said, if you have your own allocator, you will have object sizes
available.

Hans
Standard disclaimer ...