Linux and GC

Mike McDonald
Tue, 12 May 1998 13:24:42 -0700 (PDT)

>Subject: Re: Linux and GC
>Date: Tue, 12 May 1998 15:21:53 -0400
>From: P T Withington <>
>To: "Kragen" <>, <>
>On 4/28/98 18:17, Kragen wrote:
>>One particularly interesting garbage-collection method involves copying
>>the entire VM of a process, scanning it for garbage while the program
>>continues to run, then communicating back to the process what objects
>>were discovered to be garbage, so they can be freed.
>>In Linux (or Unix in general), you can copy the entire VM of a process
>>by calling fork().  In Linux, this is fast, because the memory doesn't
>>actually get copied until it's written to.  In other Unices, it is
>>often not so fast, especially for processes with a lot of data.

>Wilson's paper covers the pros and cons of nearly every known GC 
>technique.  In general, Snapshot-at-beginning collectors are efficient 
>because they only have to notice mutator writes, which are infrequent 
>(and, as you propose, to do a copy-on-write of any page the mutator would 
>modify is supported directly in the O/S), but they are inefficient 
>because objects that become inaccessible _during_ the collection will not 
>be collected (e.g., even if the mutator nulls every pointer to an object, 
>if there was one reference to it at the beginning of the collection, it 
>will still be preserved).

  Is this really a serious drawback? They will be collected on the
next pass, after all. Is the overhead of maintaining the write barriers
justified for collecting this small percentage of new garbage?

>Most snapshot-at-beginning implementations do not actually make a 
>snapshot of the whole image, but rather arrange a write barrier, so that 
>any changes to the image can be recorded and dealt with (e.g., by 
>recording every value that gets overwritten).
>What I find really interesting about your idea is that normally the GC is 
>run as a co-routine with the mutator, but in these days of multiple-cpu 
>machines, it really makes sense for the GC to be a separately schedulable 
>thread.  Incremental collectors use _some_ form of read or write barrier 
>to discover the changes that the mutator is making to the graph, but they 
>also want the collector to have access to all of the graph.  A thread 
>necessarily will have the same VM image as the mutator, but by forking 
>the GC as a separate process, it can have a different set of protections 
>for its own use.

  Isn't the problem of using multiple CPUs for GC one of balance? One
would like the collector threads to have enough resources so that they
stay ahead of the mutators. But you don't want to dedicate CPUs to
collector threads because then you waste those resources when not
collecting. Then there are the problems with adding the garbage back on
the free list. You'll need some kind of lock for that. MP locks tend
to be "expensive". Do you incrementally add things to the free list or
do it all at the end? Adding them incrementally has the advantage of
reusing the space as quickly as possible. The disadvantage is the
constant locking and unlocking of the free list lock. Doing it at the
end of a GC pass means I only have one short access period to the lock.
The disadvantage is the mutators have to wait longer for the space to
be reclaimed. Depending on the allocation profile of your app, it seems
either could be the better way to go.

  Mike McDonald