[gclist] Compiler tables for accurate gc

Eliot Moss moss@rhea.cs.umass.edu
Wed, 17 Dec 1997 12:45:58 -0500 (EST)

Recent respondents did remember correctly that Amer Diwan, Rick Hudson, and I
determined how to build tables supporting accurate gc in the face of
aggressive compiler optimizations. This was done for Modula-3 and reported in:

Amer Diwan, J. Eliot B. Moss, and Richard L. Hudson, ``Compiler Support for
Garbage Collection in a Statically Typed Language'', {\em ACM SIGPLAN '92
Conference on Programming Language Design and Implementation}, San Francisco,
CA, June 1992, pp.~273-282.

Your mileage may vary, but I would not immediately call this "a pain". I
_would_ agree that trying to graft it onto a large, existing, complex compiler
was not a totally fun experience, but we managed to do it for the gcc back
end, as it was at the time, with only (talented!) student labor. I also
observe that Modula-3 has some features, notably call by reference, that make
the job harder than it would be for Java (for example).

In certain cases there is a slight impact on normal case code. Basically, if
one turns references to a[i] within a loop over i into the equivalent *p++ in
C, then one must insure that there is a base pointer to a saved somewhere. In
this case, it implies a store before entering the loop.  The tables will
relate this base pointer to p, so that (1) the array a can be found and moved
by the collector, and (2) p will be appropriately updated (even if it is in a
register, etc.). This works even for cases such as references to array
elements a[i] and b[i] in a loop, where a[i] is via *p and b[i] via p[diff]
where diff is the difference in the base address of b amd that of a. You just
need to get it in the tables.

If one is willing to have the collector handle "interior" pointers, then some
of the cases can be eliminated.

In any case, I advocate accurate collection not so much because conservative
collection does not work most of the time, but for these two primary reasons:

1) Conservative collection tends to restrict the algorithms you can use, and
   which algorithm is best (e.g., mark-sweep versus copying) depends
   substantially on application behavior.

2) Conservative collection can occasionally lead to drastic anomalous storage
   retention, and surprises are bad and not necessarily easy for the typical
   application programmer to solve.

More recently I have been working with Dave Detlefs and Ole Agesen, who
developed algorithms for accurate stack tracing for Java (reported in the
OOPSLA gc workshop), on the impact of live variable analysis on storage
retention. It appears that one can gain noticeable improvements even over
_accurate_ tracing, and some programs also exhibit drastic
differences. Further, programmers cannot always program around the effect,
i.e., they cannot always null out a pointer at the "right place" to prevent
anomalous storage retention. The compiler must do it (or, more cheaply, put it
in the tables for the collector and avoid any impact on the mutator's code).
We hope to be able to distribute that paper soon.
J. Eliot B. Moss, Associate Professor      http://www.cs.umass.edu/~moss Web
Computer Science Department, LGRC          +1-413-545-4206             voice
University of Massachusetts, Box 34610     +1-413-545-1249               fax
Amherst MA  01003-4610  USA                moss@cs.umass.edu           email