[gclist] ref-counting performance cost

Daniel Wang danwang@CS.Princeton.EDU
11 Oct 2000 11:32:12 -0400


Andrew Shi-hwa Chen <chenandr@cse.msu.edu> writes:
{stuff deleted}
> > I totally agree that RC collectors are very hard to code. One slip and
> > you've got a leak... My view is that RC collectors deserve more research
> > because they are so popular. RC collectors are interesting because of  the
> > optimisations that can be worked. RC collectors can be used in real time.
> > Add to that the spice of work-arounds for the cyclic problem makes for a
> > fertile research area. I don't understand why academics shun working on RC
> > collectors. Is it easier to publish papers on GC than RC?

{stuff deleted}
> And thus, since the area of garbage collection
> (particularly publishing therein) is seen as highly
> performance driven (or at least percieved as such by them), I am being
> swayed towards more performance based work, and thus, to some extent,
> away from my RC based idea. 

Well I have a different take on this issue... an interesting new direction
for a PhD, minded student is to figure out how to make it easier to
correctly implement the existing GC/RC algorithms. i.e. how can you make
sure that your hard to code GC/RC actually does something sensible....

I always read "very hard to code" as "A problem an underpaid PhD student
ought to attack..."

Here's a crazy idea along those lines off the top of my head... one
difficulty of getting an RC collector right and efficeient is that it usally
requires tight coupling with a compiler. i.e. the compiler has to emit and
optimize the RC code in a sensible way. 

If I want a looser coupling of the compiler and the RC scheme, can I have
the compiler just emit "hints" so that another independent phase or utility
can insert and optimize the RC operations. The big goal is to make the
insertion and optimization of the RC operations as independent as possible
from the source language... i.e. C++ or Java, or Perl or Python compilers
can all reuse the same RC infrastructure rather than rewritting it ever
time...  You probably don't want to do this on raw assembly but something
like an annotated version of C-- http://www.cminusminus.org. Actually C--
has lots of interesting GC/compiler related open problems to get solved...

I've also had this wacky idea to build a tool to generate garabage
collectors... i.e. I write a short description of the data of my languages
generated by my compiler feed it to some magic "garbage collector generator"
with some flags like "2 generations, mark sweep, ...." and have it spit out
a garbage collector as well as a library my compiler call to interface my
compiler with it...

There been some work in this area, but most of it has been based on the idea
of building generic garbage collector frameworks as opposed to building
specalized garabge collectors automatically from declarative specifications....
The specalized garabge collectors in theory should be as fast and as
flexible as a highly tuned collector written by hand... but hopefully the
specifications and total amout of time to build it would be less than the
hand tuned version. 

Hmm.. one could even probably just do this for malloc implemenations... 
Though I'm sure someone must have done that already..... (i.e. generate fast
allocators based on some profile of specific a given program.)