[gclist] Finalizers & Reference counting.

David Chase chase@world.std.com
Fri, 30 Aug 2002 19:10:42 -0400


At 07:56 AM 8/30/2002 -0400, Jerrold Leichter wrote:
>This whole discussion does bring up an interesting point, however.  Some
>programs - compilers are a great example - go through phases with the
>property that:
>
>        a)  Each phase allocates and manipulates a significant amount of
>                memory, much of which remains live throughout the phase;
>
>        b)  Much of the memory allocated in a phase goes dead when the
>                phase ends.
>
>(In many cases, the "much" memory is actually identifiable at the point of
>allocation.)

In my experience, compilers tend to gave generational collectors
a hard time, especially when compiling a large number of files.
The first big file that gets compiled stuffs a lot of crud into
old space, and then it lingers for a lot longer than it ought to.

>The question is:  How could you capture the same advantages in a GC'ed
>system?

As a very crude approximation, you can simply run a full GC when you know
that the live storage ought to be near a minimum.  In a compiler, that
would be just before compiling a new file, and perhaps between phases.
Since GC-cost is roughly proportional to live, you get the maximum
bang (freed space) for the minimum buck (time spent in GC).

Admittedly, this is crude.

You could do better.

- If you have generational infrastructure (e.g., you are marking
  cards, and typically you mark blindly because it is faster), and

- If you can keep track of creation time (in a sliding-compacting
  collector, an object's address is a proxy for its creation)

then it's not too hard to request a GC of everything younger than
a certain age.  This would probably be cheaper than a full GC.
Depending on how you implement your card marking, you might be able
to collect *only* that region, but you might not have the necessary
cards from younger to older.

So, your interface could be something like

  GCTime gct = gc.getTimeStamp();
  ...
  // Insert compiler phase here.
  ...
  gc.collectYoungerThan(gct);

A more precise interface would be

  gc.collectBetween(gct_begin, gct_end);

Of course, an implementation is free to take these as a mere hint,
and either not collect, or collect everything.

David Chase