[gclist] Thread safe GC

Giuliano Carlini GCARLINI@us.oracle.com
14 Apr 97 15:44:55 -0700


--=_ORCL_35098718_0_11919704141646420
Content-Transfer-Encoding:7bit
Content-Type:text/plain; charset="us-ascii"

> We found that a huge problem was that many of the heaps we built
> contained a lengthy 'spine' of live data which had to be scanned
> sequentially.  Even though we had two dozen processors trying to
> do GC work, all but one of them were idle after an initial flurry
> of activity, and had to wait for a long, deep list to be traversed.

Were mutator threads running in parallel with collector threads?
This paragraph implies they weren't. If the mutator was also
running do you think that more GC threads could run in parallel?
I think they could.

A GC which runs in parallel with the mutator must handle the mutator
changing an already scanned (a "clean" object). When this happens,
the object becomes dirty and must be added to the scan queue. On stock
hardware we need to aproximate this by marking pages as clean and dirty.

It seems that the mutator would be constantly adding items to the scan
queue which could be scanned by different GC threads.

g

--=_ORCL_35098718_0_11919704141646420
Content-Type:message/rfc822

Date: 14 Apr 97 13:17:15
From:"Jim Larson <larson@kesey.jpl.nasa.gov>" <majordom@iecc.com>
To:gclist@iecc.com
Subject:Re: [gclist] Thread safe GC
Reply-to:gclist@iecc.com
Return-Path:<majordom@iecc.com>
Sender:majordom@iecc.com (MajorDomo)
X-Authentication-Warning:kesey.jpl.nasa.gov: Host localhost.jpl.nasa.gov didn't use HELO protocol
In-reply-to:Your message of "Mon, 14 Apr 1997 09:24:25 PDT."             <9704140924.ZM13104@hoh.mti.sgi.com> 
Precedence: bulk
Errors-to:owner-gclist@iecc.com
MIME-Version: 1.0
Content-Transfer-Encoding:7bit
Content-Type:text/plain; charset="us-ascii"

In message <9704140924.ZM13104@hoh.mti.sgi.com> Hans Boehm writes:
>- for a massively parallel system, you want to allow more than one
>thread doin g useful GC work at the same time.  I don't know of too
>many implementations tha t support this.  (One of ours does, but only
>to a very small extent.)  I don't believe there are any fundamental
>problems here.  But I suspect it's hard to get the details right.

I wrote a mildly-parallel (up to 24-processor) parallel GC as part
of the runtime system for a committed-choice logic language (similar
to Mercury).  The details aren't too hard to get right - at least
no harder than any other part of a parallel runtime system - the
difficulty lies in getting good parallel speedups out of it.

We found that a huge problem was that many of the heaps we built
contained a lengthy 'spine' of live data which had to be scanned
sequentially.  Even though we had two dozen processors trying to
do GC work, all but one of them were idle after an initial flurry
of activity, and had to wait for a long, deep list to be traversed.

Any parallel code has to be tuned to eliminate bottlenecks or hot
spots, but parallel GC introduces a problem that's harder still.  The
programmer must find a way to let the parallel GC run, but cannot do so
by optimizing the GC code.  Instead, the programmer must create data
structures in such a way that the GC scanning can run in parallel.  So
the code to optimize in inaccessible and its data set is created
indirectly - and all the while the programmer has to make sure that
normal execution runs quickly too.

For this reason, most parallel GC'd systems tend to focus on having
good single-threaded concurrent collectors rather than
'stop-the-world' parallel collectors.

References to our research available upon request.

Jim Larson
James.S.Larson@jpl.nasa.gov

--=_ORCL_35098718_0_11919704141646420--