GC performance - was Re: [gclist] GC topics

fjh@cs.mu.OZ.AU fjh@cs.mu.OZ.AU
Fri, 23 Feb 1996 02:33:02 +1100 (EDT)


Darius Blasband, you wrote:
> 
> > Darius Blasband, you wrote:
> > > 
> > > In any case, I think we agree on the fact that precise GC is far easier,
> > > and globally more reliable and faster than conservative GC's...
> > 
> > I definitely disagree about precise GC being easier.
> >
> Let's say that it is surely easier to implement (from scratch), since
> it requires almost no knowlegde about the internals of the machine,
> memory mapping of the compiler, etc...
> 
> > Plugging in Hans Boehm's conservative garbage collector to our
> > Mercury implementation was very easy.  It took a couple of days at most.
> Well, I guess that plugging in a GC is indeed easier than writing
> one's own, but is this the issue ?

If there were precise garbage collectors freely available which one could
easily plug into a language implementation, then it wouldn't be an issue,
but there aren't, and I think there are some important technical reasons
which mean that using a precise garbage collector is always going to
be more effort than plugging in an existing conservative garbage collector.

> > Writing our own precise GC has so far consumed the work of one honours
> > thesis and two summer studentships (i.e. about 6 man-months of work),
> > and it still isn't complete.
>
> I simply do not understand: it does not match my experience at all, and from
> what I know of Mercury (purely logic typed language, generates intermediate 
> C, am I right ?) there is nothing which should explain why it is so
> difficult. Maybe the simple fact that it is being implemented on an existing
> system, instead of being part of the original design. I don't know. My
> experience about implementing a precise GC tells exactly the opposite.
> Any clue on the reason for this ? Anything too specific about Mercury ?

A lot of the work in the Mercury precise garbage collector has been in
getting the compiler to construct the appropriate tables of type information.
Most of this effort is unavoidable for precise garbage collectors if
you want to use efficient data representations in which pointers and
non-pointers are not explicitly marked as such.

Particular language constructs will affect what sort of type information
is needed.  For example, one complication in the case of Mercury is
polymorphic predicates, for which the type of one variable depends on
the value of a special compiler-introduced type_info variable.
When compiling with precise garbage collection enabled, the compiler
needs to extend the life-time of these type_info variables beyond
what is otherwise needed. 

Another complication is closures, the runtime data structures used to
store curried higher-order predicate terms.  We store just the address
of the predicate, the number of curried arguments, and the curried
arguments.  The precise garbage collector will need to determine the
type of the curried arguments.  This can be done by having the compiler
emit a table mapping predicate addresses to argument types.

Another complication in Mercury is partially instantiated data structures.
The Mercury compiler generates code in which data structures are not
fully initialized.  Unless you're willing to take the performance hit
of fully initializing all data structures, this requires additional
complication in the generation of the info for garbage collection.
(We're implementing precise garbage collection in the hope of improving
performance, so we don't want to take this hit.)

Of course, it would have been easier to implement precise GC if we
had implemented it before we implemented higher-order predicates,
polymorphism, and so forth.  But then the implementation of those
constructs would have been more difficult!

Implementing a language with N different features is of difficulty not
much worse than order N if those features don't interact, but worse
than N squared if they do.  Conservative garbage collection doesn't
really seem to interact much with most other features, the main
exceptions being concurrency and the safety of various optimizations.
But precise garbage collection generally does interact with other features.

Incidentally, the problem of feature interaction is also quite bad
for manual allocation -- for example, the interaction with exception
handling is particularly nasty.

(BTW, before you come away with the impression of Mercury as a language
with lots of complicated features that interact in complicated ways ;-),
I should point out that although the features in Mercury can
sometimes interact in ways that make life a little trickier for the
implementor, feature interaction generally doesn't cause any problems
for Mercury programmers.  At the source code level, the underlying conceptual
model is very clean and the features are basically all pretty orthogonal.)

-- 
Fergus Henderson             	WWW: http://www.cs.mu.oz.au/~fjh
fjh@cs.mu.oz.au              	PGP: finger fjh@128.250.37.3