[gclist] Multi threaded reference counting

Henry G. Baker hbaker@netcom.com
Sun, 19 Jan 1997 13:03:37 -0800 (PST)


> In the distributed GCs I've seen only one thread owns an object.
> I need a shared memory GC where every thread can access the
> contents of every object. In that environment you can have a race
> condition where one thread is decrementing the reference to an object
> while another is incrementing it :

Back in the late 1960's/early 1970's, it was a sport to see how to
implement test-and-set type mutual exclusion algorithms using only
shared memory.  A number of algorithms were published in the CACM,
with more-or-less the final one being published by Dijkstra.  In his
paper, he made a comment to the effect that 'in order to stop a flood
of papers in which a 2 processor solution is extended to 3 processors,
and then to 4 processors, and so on', he produced a proof for the
arbitrary-N case.

I did a little bit of work on this problem, and the general solutions
go something like this.  Each processor is given a memory location that
it 'owns', in the sense that everyone else promises not to write into it,
but only to read from it.  All processors have read access to everyone
else's 'owned' location.  (There are variants of these solutions which
also have one location which is written to by everyone in a massive race
condition, after which everyone backs off to see who won the race.)

The 'owned' locations are used as a kind of mailbox where a processor
posts information about its current state, and everyone else can read
it.  This convention provides a kind of polled broadcasting facility.

One (expensive) suggestion for reference counts might give each
processor its own portion of a reference count, and the 'real'
reference count is the _sum_ of all of the processors' reference
counts.  Alternatives might include a small per-processor reference
count, which when overflowed or underflowed would require more
synchronization to reapportion.

It has already been proved that distributed GC is more-or-less
equivalent to full synchronization of n processors (a European PhD
thesis circa 1989 -- sorry, no reference off the top of my head).  GC
appears to be more robust in this environment than reference counting,
since you do a major synchronization every once in a great while,
instead of constantly tripping over one another doing minor
synchronizations.

Boehm's GC hack using page tables is a pretty cool solution to the
problem.  I don't recall the reference off the top of my head, but it's
something around 1990 (POPL?).

-- 
Henry Baker
www/ftp directory:
ftp.netcom.com:/pub/hb/hbaker/home.html