[gclist] ref-counting performance cost

Andrew Chen chenandr@cse.msu.edu
Wed, 11 Oct 2000 17:34:29 -0400


At 11:32 AM -0400 10/11/00, Daniel Wang wrote:
>Andrew Shi-hwa Chen <chenandr@cse.msu.edu> writes:
>{stuff deleted}
>>  > I totally agree that RC collectors are very hard to code. One slip and
>>  > you've got a leak... My view is that RC collectors deserve more research
>>  > because they are so popular. RC collectors are interesting because of  the
>>  > optimisations that can be worked. RC collectors can be used in real time.
>>  > Add to that the spice of work-arounds for the cyclic problem makes for a
>>  > fertile research area. I don't understand why academics shun working on RC
>>  > collectors. Is it easier to publish papers on GC than RC?
>
>{stuff deleted}
>>  And thus, since the area of garbage collection
>>  (particularly publishing therein) is seen as highly
>>  performance driven (or at least percieved as such by them), I am being
>>  swayed towards more performance based work, and thus, to some extent,
>>  away from my RC based idea.
>
>Well I have a different take on this issue... an interesting new direction
>for a PhD, minded student is to figure out how to make it easier to
>correctly implement the existing GC/RC algorithms. i.e. how can you make
>sure that your hard to code GC/RC actually does something sensible....

"correctly implement" implies two possibilities:
	provably correct (use of a formal automated proof system)
	probably correct (use of someone else's encoded knowledge)

You seem to be advocating the latter, correct? (The former I flirted 
with but ultimately have decided to abandon. One of my former 
advisors was into formal methods - one of the reasons that person is 
a _former_ advisor.)

>I always read "very hard to code" as "A problem an underpaid PhD student
>ought to attack..."

And I read it as lots of man hours spent causing occupational hazards 
like RSI and worker's comp payments. (I just got that from coding too 
much stuff.)

>Here's a crazy idea along those lines off the top of my head... one
>difficulty of getting an RC collector right and efficeient is that it usally
>requires tight coupling with a compiler. i.e. the compiler has to emit and
>optimize the RC code in a sensible way.

Can we say not threadsafe?

>I've also had this wacky idea to build a tool to generate garabage
>collectors... i.e. I write a short description of the data of my languages
>generated by my compiler feed it to some magic "garbage collector generator"
>with some flags like "2 generations, mark sweep, ...." and have it spit out
>a garbage collector as well as a library my compiler call to interface my
>compiler with it...

I've had the same idea, and was going to do it for Java until I 
learned that there is dynamic class loading in Java. I was 
envisioning not even writing a description of the data of the 
languages - just letting an enhanced compiler create it (so that I 
could use typing information to determine for what objects using RC 
would be safe).

>There been some work in this area, but most of it has been based on the idea
>of building generic garbage collector frameworks as opposed to building
>specalized garabge collectors automatically from declarative 
>specifications....
>The specalized garabge collectors in theory should be as fast and as
>flexible as a highly tuned collector written by hand... but hopefully the
>specifications and total amout of time to build it would be less than the
>hand tuned version.

I'll keep this in mind. Maybe I should start with a GUI for 
configuring and installing the BDW collector?

>Hmm.. one could even probably just do this for malloc implemenations...
>Though I'm sure someone must have done that already..... (i.e. generate fast
>allocators based on some profile of specific a given program.)

It has been done for malloc implementations.

http://www.xanalys.com/software_tools/mm/bib/full.html#grun92