GC -- some explanations.

Patrick Premont premont@cs.toronto.edu
Sat, 13 Apr 1996 18:04:17 -0400


> *Reference Counting* -- has occurred to many people who have needed garbage
> collection functionality before and didn't want a garbage collector.

A significant drawback you didn't mention is that it doesn't remove
garbage that is part of a dependency cycle. If there are solutions to
that problem, I've never heard of them.
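
To make this concrete, here is a small C sketch (my own illustration, not
code from any actual collector) of two objects that reference each other;
once the external references are dropped, both counts stay at 1 and the
pair is never freed:

  /* Naive reference counting: each node carries a count, and a node is
   * freed only when its count drops to zero. */
  #include <stdio.h>
  #include <stdlib.h>

  typedef struct Node {
      int refcount;
      struct Node *next;
  } Node;

  static Node *node_new(void) {
      Node *n = malloc(sizeof *n);
      n->refcount = 1;               /* one reference held by the creator */
      n->next = NULL;
      return n;
  }

  static void node_release(Node *n) {
      if (n != NULL && --n->refcount == 0) {
          node_release(n->next);     /* drop our reference to the next node */
          free(n);
      }
  }

  int main(void) {
      Node *a = node_new();
      Node *b = node_new();
      a->next = b; b->refcount++;    /* a references b */
      b->next = a; a->refcount++;    /* b references a: a cycle */

      node_release(a);               /* drop both external references... */
      node_release(b);

      /* ...yet both counts are still 1, so neither node is ever freed. */
      printf("a->refcount = %d, b->refcount = %d (leaked)\n",
             a->refcount, b->refcount);
      return 0;
  }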

> *Copy Collection*
>    This is really a good algorithm in terms of efficiency, given
> enough memory.
> 
>     This algorithm suffers from 50% memory waste, and it has trouble
> with finalization.

I'm not sure if Tunes will end up having many memory spaces, but if it does,
it seems that a system-wide copying GC that relies on paging could
achieve good results:

Each process (where, for the sake of this discussion, a process is
something that has its own GCed heap) would have a heap that is not
preallocated by the system. As more data is created in it, the system
will give it as many memory pages as possible, to delay garbage
collection and let more objects die, which is good for a copying GC
(I assume pages can be added to the heaps automatically by page fault
handlers). Note that there is no memory waste here; each heap is full
(of live or dead objects). This is pretty neat in itself.
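
Here is a rough C sketch of the kind of on-demand heap I have in mind
(all names and constants are mine, and an explicit heap_grow() stands in
for the page fault handler, which would do the committing in the real
scheme):

  #include <stddef.h>
  #include <stdint.h>
  #include <sys/mman.h>

  #define PAGE_SIZE    4096
  #define HEAP_RESERVE (256 * 1024 * 1024)  /* address range only, not memory */

  typedef struct {
      uint8_t *base;    /* start of the reserved address range */
      size_t   mapped;  /* bytes actually backed by pages      */
      size_t   used;    /* bump-allocation cursor              */
  } Heap;

  static int heap_init(Heap *h) {
      /* Reserve addresses only; no physical pages are committed yet. */
      h->base = mmap(NULL, HEAP_RESERVE, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
      h->mapped = h->used = 0;
      return h->base == MAP_FAILED ? -1 : 0;
  }

  static int heap_grow(Heap *h, size_t need) {
      /* Commit just enough extra pages to cover the shortfall. */
      size_t add = ((need + PAGE_SIZE - 1) / PAGE_SIZE) * PAGE_SIZE;
      if (mprotect(h->base + h->mapped, add, PROT_READ | PROT_WRITE) != 0)
          return -1;
      h->mapped += add;
      return 0;
  }

  static void *heap_alloc(Heap *h, size_t n) {
      if (h->used + n > h->mapped &&
          heap_grow(h, h->used + n - h->mapped) != 0)
          return NULL;               /* out of pages: time to collect */
      void *p = h->base + h->used;
      h->used += n;
      return p;
  }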

But the trick is that the 50% memory waste can be greatly reduced.
First of all, the target-heap memory is only needed for a short time,
during a collection. The system could therefore serialize garbage
collection across all the processes so that at any instant at most one
process needs a target heap. It then becomes possible to allocate pages
to the heaps well beyond half of the memory: you can continue until
there are just enough pages left to create a target heap of the same
size as the largest heap. This means that only the largest heap causes
any memory waste!
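
In other words, once collections are serialized, the page budget is
simply (a trivial sketch, with names of my own):

  #include <stddef.h>

  /* With GCs serialized, only one target heap exists at any instant, so
   * the heaps together may occupy every page except those needed to copy
   * the largest heap. */
  static size_t heap_page_budget(size_t total_pages, size_t largest_heap_pages)
  {
      return total_pages - largest_heap_pages;
  }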

Then you can further reduce waste by noting that, while the target heap
for a copying GC must in general be as big as the source heap, in most
cases it will not be fully used (if it is, there is no garbage and you
just need a larger heap). Some fast heuristic or statistical method
could be used to approximate the expected size of the target heap and
derive a good limit on the number of pages to reserve for that purpose.
The limit should take into account the relative speeds of primary and
secondary memory, to find a good compromise between the expected memory
waste and the expected GC time (as there will occasionally be some
paging to secondary memory to complete the GC when the target heap size
was underestimated). All this should lead to a very efficient use of
resources.
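
As a sketch of the kind of heuristic I have in mind (the constants are
placeholders, not measurements), the collector could track the fraction
of each heap that survived past collections and reserve only that
fraction plus a safety margin:

  #include <stddef.h>

  static double survival_avg = 0.25;  /* initial guess for the survival rate */

  /* After each collection, fold the observed survival fraction into an
   * exponentially weighted average. */
  static void record_survival(size_t live_pages, size_t heap_pages)
  {
      double observed = (double)live_pages / (double)heap_pages;
      survival_avg = 0.8 * survival_avg + 0.2 * observed;
  }

  /* Pages to reserve for the next target heap.  The margin would be tuned
   * against the measured speed gap between primary and secondary memory:
   * a small margin wastes less memory but risks paging when the estimate
   * is too low. */
  static size_t target_heap_reserve(size_t heap_pages, double safety_margin)
  {
      size_t reserve = (size_t)((double)heap_pages * (survival_avg + safety_margin));
      return reserve < heap_pages ? reserve : heap_pages;
  }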

What do you think about the above suggestions? I haven't seen them
anywhere in the literature, although they must be there somewhere.

> *Generational GC*  
>    This is mostly an optimization from the observation
> that most objects tend to die young, but those that don't tend to last
> quite a while.

Note that my previous suggestions should work just as well with
generational GC. In fact it works even better, since the heaps are
smaller because they are divided into generations.

So how about that?