[gclist] gc concerns

Boehm, Hans hans_boehm@hp.com
Wed, 13 Dec 2000 11:13:48 -0800

Your first question might be addressing either one of two somewhat separable
issues:

1) Response time.  The client may be stopped for long enough that the pause
becomes visible to the user.

2) Multiprocessor throughput.  Even if you perfectly parallelize the GC
client, the garbage collector may make the application single-threaded for a
significant fraction of the execution time.  If 1/5 of the cycles are spent
in the single-threaded portion of the GC, you'll be spending half the
wall-clock time in the GC on a 4-processor machine.
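
The arithmetic behind that figure can be sketched as follows.  This is only
an illustration, not code from any collector: the function name
serial_wall_fraction is made up here, and it assumes that everything outside
the single-threaded GC portion parallelizes perfectly across the processors.

```c
#include <assert.h>

/* Fraction of wall-clock time spent in the serial (GC) portion, given the
   fraction of total CPU cycles that are serial and the processor count.
   Assumption: all remaining work is spread evenly over every processor. */
double serial_wall_fraction(double serial_cycles, int processors)
{
    assert(processors > 0);
    assert(serial_cycles >= 0.0 && serial_cycles <= 1.0);

    double serial_time = serial_cycles;                         /* one CPU busy */
    double parallel_time = (1.0 - serial_cycles) / processors;  /* all CPUs busy */

    return serial_time / (serial_time + parallel_time);
}
```

With serial_cycles = 0.2 and 4 processors, the serial part takes 0.2 time
units and the parallel part takes 0.8 / 4 = 0.2, so the GC accounts for half
of the elapsed time.  Under the same assumption, 8 processors would put it
at two thirds.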

Our collector (and I believe earlier versions of the Geodesic collector)
could address the first point by running the GC in incremental mode, i.e. by
doing a smaller amount of work more frequently during allocator calls.  This
requires trapping the initial write to a page during each GC cycle.
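
To make the bookkeeping concrete, here is a small software model of per-page
write tracking.  It is not how any real collector is implemented: an
explicit helper, tracked_write, stands in for the trap, and all names, the
page size and the heap size are assumptions chosen for the example.  The
point is only that the first write to a page in a cycle takes a slow path
and later writes to the same page do not.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SIM_PAGE_SIZE 4096u   /* assumed page size for the model */
#define SIM_HEAP_PAGES 8u     /* assumed heap size, in pages */

static unsigned char heap[SIM_HEAP_PAGES * SIM_PAGE_SIZE];
static unsigned char dirty[SIM_HEAP_PAGES];  /* one flag per page */
static unsigned traps;                       /* first writes seen this cycle */

/* Start of a GC cycle: forget which pages have been written. */
void begin_cycle(void)
{
    memset(dirty, 0, sizeof dirty);
    traps = 0;
}

/* Stand-in for the trap: only the first write to a page in the current
   cycle records the page as dirty; later writes go straight through. */
void tracked_write(size_t offset, unsigned char value)
{
    assert(offset < sizeof heap);
    size_t page = offset / SIM_PAGE_SIZE;
    if (!dirty[page]) {
        dirty[page] = 1;
        traps++;
    }
    heap[offset] = value;
}

unsigned trap_count(void)
{
    return traps;
}

int page_is_dirty(size_t page)
{
    assert(page < SIM_HEAP_PAGES);
    return dirty[page];
}
```

At the end of a cycle, the collector would only need to revisit the pages
whose flag is set.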

Empirically, this mode was not used a lot, probably for some combination of
the following two reasons:

1) Most modern computers are memory-poor in comparison to the processor
speed.  Hence there is strong motivation to keep heap sizes down, and thus
most applications run with heap sizes that are small enough that
stop-the-world collections finish in a reasonable amount of time, and GC
pause times are often not much of an issue.

2) Tracking memory writes in C/C++ code is usually difficult in the presence
of system calls.  The upshot of this is that system calls that may write to
the heap usually have to be wrapped to ensure that the system call itself
does not try to trap into the GC code (which wouldn't work).  Although
incremental GC is trivial to use for toy applications, this makes the
incremental collector more difficult to use than the stop-the-world one for
real, large applications.  (This is fixable with kernel support.  We may try
to address this on some Linux platforms.) 
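
One possible shape for such a wrapper is sketched below, for read() on a
POSIX system.  The name read_into_heap and the 4096-byte bounce buffer are
assumptions made for the example; this is not taken from our collector.  The
idea is that the kernel only ever writes to a stack buffer, and the copy
into the heap is an ordinary user-mode write that can be trapped in the
usual way.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical wrapper: read from fd into a buffer that lives in a
   write-tracked heap.  The system call writes to a stack bounce buffer;
   memcpy then performs the heap write from user mode. */
ssize_t read_into_heap(int fd, void *heap_buf, size_t count)
{
    char bounce[4096];

    assert(heap_buf != NULL);

    /* Returning fewer bytes than requested is allowed for read(), so
       clamping to the bounce buffer size keeps the usual contract. */
    if (count > sizeof bounce)
        count = sizeof bounce;

    ssize_t got = read(fd, bounce, count);
    if (got > 0)
        memcpy(heap_buf, bounce, (size_t)got);

    return got;  /* bytes read, 0 at end of file, or -1 on error */
}
```

Callers already have to cope with short reads, so the clamp does not change
how the function is used.  Every system call that can write to the heap
needs similar treatment, which is where the extra work for large
applications comes from.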

Whether or not pause times are an issue for you will depend mostly on the
amount of reachable data you expect to have in the heap.

The second point (MP scalability) is starting to be addressed by some recent
collectors, both for Java and C/C++.  Our current test version of the
collector should scale reasonably well to 4 or 8 processors.  It still stops
the world, but the garbage collector uses all available processors while the
client is stopped.  This code has not been ported to Windows 2000/NT, and it
is not yet in the more stable versions of the collector.  The group at the
University of Tokyo has a more established version of our collector that
appears to scale even better, but has somewhat lower performance for small
numbers of processors.  I don't recall seeing Windows support, but it might
be there.  I'm not sure about the Geodesic Systems products in this area.
(A few JVMs offer similar facilities.)

Neither our collector, nor the Tokyo one, allows you to combine incremental
and parallel GC.  That would raise some interesting design issues, but I
think there is no fundamental roadblock; it just hasn't happened yet.  (Ours
sort of tries to support it, but it doesn't really do the right thing if you
enable both.)

Whether or not scalability is an issue on 4 processors will depend on
how much time your application will spend in the allocator/collector.  For
allocation-intensive applications it probably will be an issue.

Both the Geodesic Systems and our collector will usually accommodate
third-party malloc/free applications, especially if you arrange for the
third party malloc calls to be redirected to the GC.  Geodesic has spent a
lot more time making that work under Windows than we have.


> -----Original Message-----
> From: Mike Roberts [mailto:miker@artifact-entertainment.com]
> Sent: Wednesday, December 13, 2000 10:17 AM
> To: Gclist
> Subject: [gclist] gc concerns
> hello all. i am lead developer of a server application for a multiplayer
> game company. i'm looking into using c/c++ based garbage collection for our
> server design. i have some questions that i was hoping someone could answer.
> * if i'm not mistaken, conservative and copying collectors suspend all other
> threads while collecting. we're probably going to use a 4 or 8 processor
> server box, so would this a real performance concern? if so, are there any
> reliable incremental or concurrent collectors available for windows c++
> environments?
> * i've noticed that collection isn't a partial commitment. since we are
> using third party toolkits that haven't been written for use with garbage
> collection, would this present a compatibility problem?
> --
> Mike Roberts (zoarre)
> Developer
> Artifact Entertainment, L.L.C.
> email <miker@artifact-entertainment.com>
> web <www.artifact-entertainment.com>
> phone 480-218-1223 ext 23
> "C++: an octopus made by nailing extra legs onto a dog."
> -unknown