[gclist] write barrier

Paul R. Wilson wilson@cs.utexas.edu
Mon, 11 Mar 1996 09:09:05 -0600


>From owner-gclist@iecc.com Mon Mar 11 06:28:03 1996
>To: gclist@iecc.com
>Subject: Re: [gclist] write barrier
>Sender: owner-gclist@iecc.com
>Precedence: bulk
>Reply-To: gclist@iecc.com
>Errors-to: owner-gclist@iecc.com
>
>geert@sun3.iaf.nl (Geert Bosch) wrote:
>
>> Why wouldn't you implement write-barriers using the virtual memory
>> management units all systems have nowadays?
>
>OK - my text is excessive, I should have mentionned that barriers can
>be implemented by faulting, not only by instrumenting code. Please, do
>you have biblio references of GCs that use faulting ? I have none.

The PARC conservative incremental generational GC is a good example.
The use of a pagewise write barrier based on VM traps makes it
possible to implement generational and incremental GC for programs
compiled with random compilers that don't put in write barrier
instructions.

The Appel-Ellis-Li GC uses a pagewise read barrier to implement a variant
of Henry Baker's incremental copying algorithm.  I believe there was/is an
incremental variant of Joel Bartlett's mostly-copying GC that used a read
barrier like this. (?)

>[...]
>
>> When a the mutator tries to change something on a page which is not
>> (completely) scanned yet, the collector can mark it as being changed
>> or (better IMO) first scan that page and let the mutator continue
>> afterwards.
>
>Scanning the page is possible, but not straightforward.

I believe it's pretty straightforward.

> At the time of
>the fault, the page typically contains some white objects, and the GC
>does not know whether these are reachable or not. The GC cannot
>therefore just scan the page entirely. It will scan gray objects
>only.

I assume that here we're talking about an incremental update tracing
algorithm, and we are reverting written-to black objects to gray
to preserve the tricolor invariant.  (The alternative would be to
gray the objects that the written-to black objects hold pointers
to.  This is harder or more expensive when you're using a VM system
to tell you which objects are (conservatively) written to.)

The grey objects it needs to scan are the ones that were black
before the write barrier was triggered---conceptually, the write barrier
operation converts black object in a page to grey, so that if they
hold pointers to white objects, those white objects will be found.

> For a white object, the scanning cannot be omitted, because the
>object may be reachable;

Yes, it can be omitted.  All the write barrier has to do is to keep
the mutator from confusing the GC by hiding pointers to white objects
inside black objects.  It does NOT have to keep track of stores into
white objects.

If the white object is reachable and stays reachable, it will be
reached by the GC's normal tracing anyway, which is fine.  If the
white object becomes unreachable before the GC finds a path of pointers
to it, that's even better---the object will NOT be found, but only
because it's garbage.  The GC will correctly reclaim its space.

The moral of this is that it's the tracer's job to ensure that reachable
white objects are found, and it's the write barrier's job to ensure
that the mutator doesn't confuse the tracer by hiding pointers to white
objects in black objects.  The mutator can mutate white objects
all it wants without the write barrier needing to notice, and without
confusing the tracer---the tracer sees the graph of objects only up
to the "wavefront" at the leading edge of its traversal.  It can safely
ignore what happens beyond that horizon.

Allocating white can help with this.  Most objects will be allocated
white and STAY white until they become garbage, and the tracer may
never encounter them at all.

> it cannot, either, be done in the ordinary
>way and result in marking pointed-to objects gray, because the object
>is not guaranteed to be reachable.  Scanning a white object must
>therefore result in the GC establishing a table which lists all the
>pointers to white objects found in this object. If later the object is
>found reachable (ie. becomes gray), the objects listed in the table
>will also be grayed.
>
>
>This approach is more complicated that normal three-color marking.
>Does anybody use it ?
>

No, because it's not necessary.  Have a look at the long version of my
GC survey if this is not clear.


| Paul R. Wilson, Comp. Sci. Dept., U of Texas @ Austin (wilson@cs.utexas.edu)
| Papers on memory allocators, garbage collection, memory hierarchies,
| persistence and  Scheme interpreters and compilers available via ftp from 
| ftp.cs.utexas.edu, in pub/garbage (or http://www.cs.utexas.edu/users/wilson/)