[gclist] reference counting

Greg Morrisett jgm@cs.cornell.edu
Wed, 6 Sep 2000 07:26:25 -0400


> In MLWorks, most collections of young generations have the property
> that almost nothing survives (5-15%), so most of to-space is never
> touched, so it need only exist in VM.  VM is just disk space, which is
> insanely cheap.  If you like living dangerously, with an
> over-committing VM system, then VM is even cheaper than that (until
> one day, when you really need it, and it's not there at all).

Yes, a particularly good organization for ML is to use
copying collection for the nursery and then mark/sweep/compact
for the older generations.  As far as I know, this is how OCaml
works.  We didn't get a chance to compare this approach
in our tests, because isolating all of the variables was
hard enough.  We concentrated on a straight, non-generational
comparison.
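
To make the nursery point concrete, here is a toy sketch of a Cheney-style
copying collection. It is not code from MLWorks, OCaml, or our collector; the
object layout (a payload plus a list of child indices) and the name
`copy_collect` are assumptions made for illustration. It shows that to-space
grows only as survivors are copied, so with a low survival rate most of
to-space is never touched.

```python
def copy_collect(from_space, roots):
    """Copy everything reachable from `roots` into a fresh to-space.

    from_space: list of (payload, [child indices into from_space]).
    roots:      list of indices into from_space.
    Returns (to_space, new_roots), with all indices rewritten for to_space.
    """
    to_space = []   # grows only when a survivor is copied
    forward = {}    # forwarding table: from-space index -> to-space index

    def evacuate(i):
        # Copy object i the first time it is reached; afterwards just
        # return its forwarding address.
        if i not in forward:
            forward[i] = len(to_space)
            payload, children = from_space[i]
            to_space.append([payload, list(children)])
        return forward[i]

    new_roots = [evacuate(r) for r in roots]

    # Cheney scan: to_space doubles as the work queue, so no mark stack
    # is needed. Objects behind `scan` already have updated pointers.
    scan = 0
    while scan < len(to_space):
        obj = to_space[scan]
        obj[1] = [evacuate(c) for c in obj[1]]
        scan += 1
    return to_space, new_roots


# A 20-object nursery in which only three objects are reachable.
nursery = [("obj%d" % i, []) for i in range(20)]
nursery[7] = ("obj7", [3, 12])
nursery[3] = ("obj3", [12])
nursery[12] = ("obj12", [7])      # cycle back to the root object

to_space, new_roots = copy_collect(nursery, [7])
survival = len(to_space) / len(nursery)
```

In this example 3 of 20 objects survive (15%), and the work done is
proportional to those three objects, not to the size of the nursery.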

And even though we were doing almost 3x as many
collections as BDW, and our pointer tests were more expensive,
we were breaking even with BDW in total run-time.  This is
because (a) the collector could run faster with type information
on most objects, (b) there was some allocation win with
pointer bumping, and (c) for at least some of the applications,
especially those with objects of different sizes, there
appeared to be a win in locality for the mutator.  On the
other hand, for some applications, especially those where
the objects tended to be the same size, BDW had a win
in locality.
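
Points (b) and (c) can be illustrated with a toy address model. This is a
sketch under stated assumptions, not a measurement and not the code of either
collector: `BumpAllocator`, `SizeClassAllocator`, and the 4096-byte region
spacing are hypothetical. The segregated allocator only loosely mimics the
per-size free lists that BDW uses, and regions are assumed never to fill up.

```python
class BumpAllocator:
    """Allocation is one pointer increment; objects land in allocation order."""

    def __init__(self, base=0):
        self.next = base

    def alloc(self, size):
        addr = self.next
        self.next += size
        return addr


class SizeClassAllocator:
    """Toy segregated allocator: each distinct size gets its own region."""

    REGION = 4096  # hypothetical spacing between size-class regions

    def __init__(self):
        self.regions = {}  # size -> [region base, bytes used so far]

    def alloc(self, size):
        if size not in self.regions:
            self.regions[size] = [len(self.regions) * self.REGION, 0]
        region = self.regions[size]
        addr = region[0] + region[1]
        region[1] += size
        return addr


def layout(allocator, sizes):
    # Addresses handed out for a sequence of allocation requests.
    return [allocator.alloc(s) for s in sizes]


# Mixed sizes, e.g. a small header object followed by its larger payload.
mixed = [16, 48, 16, 48]
bump_mixed = layout(BumpAllocator(), mixed)
seg_mixed = layout(SizeClassAllocator(), mixed)

# Uniform sizes: in this toy model the two layouts coincide.
uniform = [16, 16, 16, 16]
bump_uniform = layout(BumpAllocator(), uniform)
seg_uniform = layout(SizeClassAllocator(), uniform)
```

With mixed sizes, the bump allocator keeps objects allocated together next to
each other, while the segregated allocator scatters them across regions. With
uniform sizes the toy layouts are identical, so this model says nothing about
why BDW had the locality win in that case.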

We also noted that for Java, or other languages that require a
header word, the MCC approach might win.  Sadly, hacking
one of the JVM implementations at the time wasn't doable
without signing our lives away.

-Greg