[gclist] Precise GC's performance

Mike Spertus mps@geode.geodesic.com
Mon, 4 Mar 1996 17:50:22 -0600


> > The question is what kind of collector you have for precise collection.
> > If you have a generational collector, then the malloc becomes faster than
> > an ordinary malloc. Substituting the null collector is not a fair test.
> > YAFL must somehow figure out how to free storage. That has a cost. Manual
> > ad hoc methods of collection are often far more expensive than the highly
> > tuned ones you find in garbage collectors.
> 
> Please understand that this was not the issue of my experiment: the
> purpose here is not to measure the performance of a GC vs. a non-GC system,
> but to measure the performance penalty of a given GC design (precise)
> when compared with another approach (conservative). The measurement I made
> relates to what happens as long as GC is not required; it does not
> include any measurement of GC itself.
> 
Actually, I believe Charles' answer is very relevant to your question.
One of the main differences between conservative and precise collection is
that precise GC can use copying collection. Copying collectors have a much
faster malloc than mark-sweep collectors. If collections rarely take place
(i.e. huge space overheads are allowed), collector performance is dominated
by allocator performance, and in that case copying collection, and therefore
precise collection, is faster than conservative collection. In the real
world, this is not very meaningful. I remember visiting the FOX project at
CMU (a TCP/IP stack written in ML): the speed of collection was excellent,
but the memory footprint of the TCP/IP stack was several tens of megabytes,
which is completely unacceptable for a real system.
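
To see why a copying collector's malloc is so cheap, here is a minimal
sketch (hypothetical names, not any particular collector's code) of the
bump-pointer allocation a copying collector can use: an alignment round-up,
a limit check, and a pointer increment.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bump-pointer allocator of the kind a copying collector
 * can use.  All names here are illustrative, not from any real GC. */
static uint8_t heap[1 << 16];          /* one semispace */
static uint8_t *bump  = heap;          /* next free byte */
static uint8_t *limit = heap + sizeof heap;

static void *bump_alloc(size_t n)
{
    n = (n + 7) & ~(size_t)7;          /* round up to 8-byte alignment */
    if (limit - bump < (ptrdiff_t)n)   /* out of room: a real collector */
        return NULL;                   /* would copy live data and retry */
    void *p = bump;
    bump += n;                         /* the entire cost of "malloc" */
    return p;
}
```

A mark-sweep malloc, by contrast, typically searches a free list (or a
size-segregated set of them), which is several times more work per call.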

Conversely, if you require that the collector keep a very small space overhead,
then conservative collection is much faster. Frequent collections force live
data to exceed, say, 75% of total data, which nullifies copying collection's
theoretical advantage of only having to touch live data, so a mark-and-sweep
collection cycle becomes the cheaper one.
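
A back-of-the-envelope model (my simplification, not a benchmark) makes the
space side of this concrete: a semispace copying collector can only allocate
into half of its total memory, so under a tight space budget the live data
may not even fit in one semispace, while a mark-sweep collector still runs.

```c
#include <stdbool.h>

/* Illustrative model only.  A semispace copying collector splits its
 * memory M into two halves and must fit all live data into one of
 * them; mark-sweep only needs the live data to fit within M. */
static bool copying_can_run(double live, double mem)   { return live <= mem / 2.0; }
static bool marksweep_can_run(double live, double mem) { return live <= mem; }
```

With live data at 75% of total memory, mark-sweep still works but a
semispace collector cannot even complete a collection; that is the
high-occupancy regime described above.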

In multi-threaded environments, performance requirements almost demand
partially conservative operation or highly non-portable techniques like
implicit tags. (Creating local variables and loading registers need to be
atomic operations.)
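
The atomicity point can be sketched as follows (hypothetical root-table
code, assuming C11 atomics): a precise collector that scans roots while
mutator threads run must never observe a half-written pointer, so every
root update has to be an atomic store.

```c
#include <stdatomic.h>

/* Hypothetical root table for a precise collector in a threaded
 * program.  Plain stores could be torn or reordered; atomic stores
 * guarantee a concurrently scanning thread sees either the old
 * pointer or the new one, never garbage bits. */
static _Atomic(void *) roots[16];

static void set_root(int i, void *p) { atomic_store(&roots[i], p); }
static void *get_root(int i)         { return atomic_load(&roots[i]); }
```

Making every local-variable creation and register spill go through such a
protocol is where the cost lies; conservative scanning sidesteps it by
accepting whatever bits it finds.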

What does this mean for real-world programs and performance measurements
that consider both space and time? Performance is essentially comparable.
In fact, conservative collection seems to be faster in practice, but there
is not enough data to be certain. This much seems clear, though: unless
your application and computer can tolerate extremely large space overheads
and collect extremely infrequently, there is no performance reason not to
use conservative collection.

As Hans has pointed out, fragmentation does not appear to be a problem with
well implemented non-moving allocators.

A more significant factor in choosing between conservative and precise
collection is that extremely large heaps in a 32-bit environment may start
to suffer from pointer misidentification. A good rule of thumb with our
collector is that heap sizes over about twenty megabytes may well require
leaf identification. Much of the work we did in setting collector defaults
was aimed at allowing heap sizes as large as possible. So at least partial
precision (i.e. identifying leaf objects) becomes a significant concern for
very large heaps. It is easy to imagine an application with a 100-megabyte
heap consisting entirely of small, pointerful objects (there are finesses
that help collect large objects). Such an application would require precise
collection of the heap, although scanning the stack and registers
conservatively would be fine. Fortunately, programs that cannot be
conservatively collected are almost vanishingly rare. Although we have run
into real-life programs that require leaf identification, they are still
quite uncommon.
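
The scale of the misidentification risk is easy to estimate (a crude model
of my own, assuming uniformly random word values and a contiguous heap):
the chance that one random 32-bit word happens to fall inside the heap is
roughly the heap size divided by the 4 GB address space.

```c
/* Crude estimate: probability that a random 32-bit value lands inside
 * a contiguous heap of the given size.  Real collectors also filter
 * on alignment and block headers, so this is pessimistic. */
static double misid_prob(double heap_bytes)
{
    return heap_bytes / 4294967296.0;   /* 2^32 bytes of address space */
}
```

For a 100 MB heap that is already over 2% per word scanned, so stacks and
heaps full of integer data start to retain noticeable amounts of garbage,
which is consistent with leaf identification mattering somewhere past the
twenty-megabyte mark.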

> Of course, in the real world, YAFL has to free storage. I do agree with 
> you when saying that manual allocation methods are rarely more efficient 
> than automatic ones.
> 
You're preaching to the choir here. But remember, the précis of the above
argument is that precise collection methods are rarely more efficient than
conservative ones.

> Darius