[gclist] Compiler specialized GC's

Charles Fiterman cef@geode.geodesic.com
Wed, 21 Feb 1996 12:58:07 -0600


> I'm curious if there's been any research on compiler-supported GC
> specialization. That is, has there been any work on compilers that, at
> compile time, generate a specialized GC runtime system to take advantage of
> program-specific properties? For example, if you don't mind losing
> modularity, one could imagine a compiler analyzing all the different kinds
> of heap objects a program could allocate and hard coding a GC specialized to
> collect those objects, rather than interpreting bitmaps or other type
> information at runtime. One could also imagine using specialized copying
> routines, if your GC scheme needs to copy. I guess my real question is: are
> these kinds of optimizations going to be swamped by other factors like VM
> interactions or tagging/untagging costs?

What this amounts to is this. Each class has a table which says where the
pointers are. That table can be specialized into a function which calls
mark directly, avoiding interpretation of the table. Further, for special
objects such as stacks, the generated function can be overridden. In the
stack example you only want to look for pointers in the live portion of
the stack. An overridden mark function can also find references that don't
look like pointers.
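
A minimal sketch of the contrast (hypothetical names, not any particular
collector's API): a generic marker that interprets a per-class table of
pointer-field offsets, next to the specialized mark function a compiler
could generate for one class.

    // Sketch only: 'Object' and 'mark' stand in for a collector's real types.
    #include <cstddef>

    struct Object { };                 // stand-in for a collectable object
    void mark(Object*) { }             // stub; the real collector provides this

    // Table-driven marking: each class supplies the offsets of its pointer
    // fields and a generic loop interprets them at collection time.
    struct PointerTable {
        const std::size_t* offsets;
        std::size_t        count;
    };

    void mark_via_table(Object* obj, const PointerTable& t) {
        char* base = reinterpret_cast<char*>(obj);
        for (std::size_t i = 0; i < t.count; ++i)
            mark(*reinterpret_cast<Object**>(base + t.offsets[i]));
    }

    // The specialized version for one class: the pointer fields are hard
    // coded, so there is no table to interpret at mark time.
    struct Node : Object {
        Object* left;
        int     value;
        Object* right;
    };

    void mark_node(Node* n) {
        mark(n->left);
        mark(n->right);
    }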

We support this in our C++ wrapped-pointer classes. Objects descending from
class gc get tables and a mark function that uses them. That mark function
can be overridden.
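
A minimal sketch of such an override, assuming a gc base class with a
virtual mark hook (the names below are illustrative, not our exact
interface): a stack that lets the collector scan only the live portion of
its buffer.

    #include <cstddef>

    struct gc {
        virtual void mark_fields() const { }   // default: driven by the table
        virtual ~gc() { }
    };

    void mark(gc* p) { if (p) p->mark_fields(); }  // collector stub

    class Stack : public gc {
        gc*         slots[1024];   // only the first 'top' entries are live
        std::size_t top = 0;
    public:
        void push(gc* p) { slots[top++] = p; }     // no bounds check; sketch
        gc*  pop()       { return slots[--top]; }

        // Programmer-supplied mark: skip the dead portion above 'top', which
        // a table-driven marker would otherwise scan and retain.
        void mark_fields() const override {
            for (std::size_t i = 0; i < top; ++i)
                mark(slots[i]);
        }
    };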

This can have some very creative uses. For example, suppose you have a
two-pass compiler. In the first pass lots of include files get read and lots
of structures are created that are never used. These structures are kept in
a hash table so they can be found. The hash table has a special mark function
which keeps all table entries alive during the first pass. In the second
pass some table entries will be referenced in ways other than through the
hash table and must be kept alive; others will not and may be deleted. The
hash table's mark function becomes a no-op during the second pass. When the
destructors for the objects are called, they remove themselves from the
table.
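
A minimal sketch of that two-pass arrangement, using the same assumed gc
base class and illustrative names: the table's mark function pins every
entry during the first pass, does nothing during the second, and each
entry's destructor unlinks it from the table when it is finally collected.

    #include <string>
    #include <unordered_map>

    struct gc {
        virtual void mark_fields() const { }
        virtual ~gc() { }
    };
    void mark(gc* p) { if (p) p->mark_fields(); }   // collector stub

    class SymbolTable;

    class Symbol : public gc {
        std::string  name;
        SymbolTable* owner;
    public:
        Symbol(const std::string& n, SymbolTable* t) : name(n), owner(t) { }
        const std::string& key() const { return name; }
        ~Symbol() override;            // unlinks itself from the table (below)
    };

    class SymbolTable : public gc {
        std::unordered_map<std::string, Symbol*> entries;
        bool first_pass = true;
    public:
        void insert(Symbol* s)           { entries[s->key()] = s; }
        void erase(const std::string& k) { entries.erase(k); }
        void end_first_pass()            { first_pass = false; }

        // Pass-dependent mark: pin every entry in pass one, nothing in pass
        // two, so pass-two survival depends on other references.
        void mark_fields() const override {
            if (!first_pass) return;
            for (auto& kv : entries) mark(kv.second);
        }
    };

    Symbol::~Symbol() { owner->erase(name); }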

By our measurements there is no speed advantage in specializing the
mark functions of typical objects. Call overhead is about as bad
as the interpretation of bitmaps. But for special cases like stacks
and queues there is a big advantage: the collector uses the programmer's
knowledge of a structure to act more intelligently. In some cases
programmer-defined mark functions allow the most logical encapsulation
of functionality. Consider the hash table example. All kinds of things
will act to keep a table entry alive; distributing that knowledge, while
preventing two table entries from keeping each other alive, would be
awkward and error prone. This solution is clear and elegant.