[gclist] Age based promotion
Simon Peyton-Jones
simonpj@microsoft.com
Wed, 16 Dec 1998 08:40:40 -0800
We're building a new garbage collector for our Haskell compiler
and have tripped over something which is, I believe, not well
documented in the literature, but which I'm sure some of you
must have tripped over.
Does the enclosed ring a bell for anyone?
Simon Peyton Jones
Age-based promotion
~~~~~~~~~~~~~~~~~~~
The context is to do with using 'steps' (aka 'buckets') within
a generation to make sure that you don't promote an object until
it has lived for a some time (e.g. 3 gcs) within that generation.
(I believe Shaw originally suggested this.)
The problem is this: suppose an object, A, is in step 3 of generation 1,
and is therefore in line for promotion. It may point to another
object, B, also in generation 1, but in step 1. That's fine.
At the next GC of gen 1 we promote A, but not B (on the grounds that
A is old enough and B isn't). But now the A->B pointer crosses
a generation boundary.
There seem to be two alternatives:
(i) record any such pointers in the appropriate rememembered set
(ii) promote B too (and so on recursively)
Well, (i) seems reasonable, but in our system it turns out to be slightly
awkward.
However, even in a system in which (i) is easy, (ii) seems rather
attractive. After all, unless A is subsequently mutated, B, and
everything reachable from B, will stay alive at least until generation
2 is collected. So nothing is lost by promoting B and everything
reachable from it at the same time as promoting A. Indeed, much is
gained thereby, because we don't have to repeatedly move B and
everything reachable from it when GC'ing generation 1.
Several points come to mind
a) If A *is* subsequently mutated, then it might be better *not*
to promote B (and everything...) because it might be dead in a few
cycles time. In Haskell, though, most mutations are updates of thunks,
and an updated thunk is never subsequently mutated. So we are in the
unusal situation of having a mutated object (e.g. A) that we *know*
will not be mutated again. But perhaps other systems have seldom-mutated
objects?
b) Even if A is now non-mutable, B, or B's children, might be
often-mutated
objects. For the same reason as above, one probably wants to cut off
the promotion of the objects reachable from B when one encounters an
oft-mutated one.
c) Promoting the transitive closure of things reachable from B isn't
technically straightforward. After all, B might already have been
evacuated from from-space into to-space in gen 1 by the time A is
promoted. So we might have to re-evacuate it (extra work, but maybe
we don't care), leaving behind... uh... not a forwarding pointer
because we don't expect them in to-space. This, alone, might be
enough to scupper the idea, unless anyone knows a cunning alternative.
d) In our Haskell system, however, things are a bit easier, becauase most
boxed objects can be overwritten with an indirection without causing
the mutator any difficulty -- it has to check for such things anyway.
Any thoughts?