[gclist] Prior work?
Boehm, Hans
hboehm@exch.hpl.hp.com
Wed, 9 Feb 2000 10:47:37 -0800
I also remember this idea dating back to at least Algol 68 days. My limited
experience with it comes from the DEC (now Compaq) SRC Modula-3 compiler.
My impression is that it depends completely on the details. Interpreting a
bitmap or a vector of pointer indicees describing object layout is often
very cheap. If you can move pointers to the beginning of the object, at
least in most cases, things get even cheaper. Interpreting a type
descriptor not designed for GC can be expensive.
Scan procedures can be rather expensive, if you're not careful. Explicit
procedure calls can easily dominate the interpretation overhead. The amount
of code you end up generating for scan procedures can be substantial. This
has effects on both process size and icache misses.
Overall, I'm not convinced custom mark procedures are a good idea. As I
rememember, the ISMM paper cited below does give results competititive with
our conservative collector. But there were other variables. Notably, the
SmallEiffel collector doesn't need to go through the pointer validity check
required by the conservative collector. I think Vincent Delacour had some
measurements around 1992 suggesting that in some cases this was roughly half
the GC time. (It also enables some other optimizations in the mutator, but
that doesn't affect the timings here, I believe. Also the validity check is
probably relatively less expensive now or in 1998, since memory accesses
dominate more.)
I think John Ellis made the observation that this may be one of the cases in
which compiled code is more expensive than interpretation.
Of course, it's obvious from this discussion that some really careful
apples-to-apples comparison of the alternatives would be very useful. I
don't know of such a comparison.
Hans
-----Original Message-----
From: Pekka P. Pirinen [mailto:pekka@harlequin.co.uk]
Sent: Wednesday, February 09, 2000 9:53 AM
To: danwang@CS.Princeton.EDU
Cc: gclist@iecc.com
Subject: Re: [gclist] Prior work?
rather than using a generic scan function that
interprets tags at runtime, statically emit a specialized scan function
based on the static type info known at compile time. I'm wondering what
the performance tradesoffs are.
There was a paper about the SmallEiffel system at ISMM'98 (Compiler
Support to Customize the Mark and Sweep Algorithm, Dominique Colnet,
Philippe Coucaud and Olivier Zendra) that had good results both in
memory footprint and GC time. It has some interesting references,
too.
People have tried this from time to time, particularly where tags are
not an option, like C++. Look for "tagless" or "tag-free" collection.
Basically, it sounds good if you have the type info. Should be faster
than interpreting the tags. Make sure you control the recursion,
though.
--
Pekka P. Pirinen
Adaptive Memory Management Group, Harlequin Limited