GC -- some explanations.

Eric W. Biederman ebiederm@cse.unl.edu
Sat, 13 Apr 1996 14:41:23 -0500

My main source on GC has been I think an ACM journal on a workshop
they hand on the subject.  I haven't been able to look up the
reference but here is a quick summary of what I know.  I know that
Knuth has at least one garbage collection algorithm in his books.

Discounting Real-time collectors & distributed GC 
there are essentially 4 types of garbage collection.

Mark & Sweep
Copy Collection
Reference Counting
Generational GC

*Reference Counting* -- has occured to many people who have needed garbage
collection functionality before and didn't want a garbage collector.
It requires an extra word to count how many references there are to a
object and when there are no more references it deletes that object.
It works well for real-time collection because it distributes out the
times of collections out into other code.  

  It does though have the inefficiency that cared to the extreme it
can get into the innerloop of procedures and other things and
performance is then degraded because you spend a significant fraction
of your time incrementing and decrementing references.  Inside an
inner loop where you make a second copy of a pointer to an object and
then destroy it every loop iteration can be horrible.

*Mark & Sweep* -- This is a very reliable technique, that works in two

  The first pass starting at root nodes (objects that you never
garbage collect but have pointers inside them) it traverses the whole
pointer tree. Marking each object it can visit as visited.  

  The second pass traverses the whole heap and deletes any object not
marked as visited.  And resets all objects that have been visited as
not visited in anticipation of the first pass coming again.

  There are some space optimizations to this scheme where the visited
objects are relocated to as tightly together as possible to prevent
memory waste.  This space compation allows stack allocation to be used
for memory allocation, much faster then malloc.

  The disadvantage of this scheme is that you must traverse the whole
heap, which is quite tedious, and it assumes the mutator (everthing
but the garbage collector) is stopped while all of this is going on.

  The advantage is that if you have something that you are not sure is
a pointer, it may be an integer you simply mark the object as having
been visited, but don't move it, and you can collect from things where
you don't know where the pointers are.

*Copy Collection*
   This is really a good algorithm in terms of efficiency, given
enough memory.  As it's efficiency is mostly proportional to the
amount of memory you have.  It splits your heap memory into two same
sized heaps, only one of which is used at a time.  You use stack
allocation with this method and all objects are assumed to be
moveable. (No pointer interger confusion allowed).

   The collection works as follows.  First the mutator is stopped and
then the root objects are follwed to heap allocated objects.  These
heap allocated objects are copied onto the other heap and a forwarding
pointer is installed, where the old object was. Then the next object
that refers to that object has it's pointers adjusted to point to the
new copy. Anyhow evetually all of the reachable objects are traversed
and copied onto the other heap.  Then the garbage collection is done,
and the roles of the two heaps are reversed.

   This algorithm gets is more efficient in general then mark&sweep
because it only needs to travers the live objects and never needs to
collect objects from the rest of the heap.  Also the longer you put
off collecting garbage the more objects there are that never need to
be collected, (garbage collection happens when the current heap is
full) so it can be more efficient.

    This algorithm suffers from memory waste 50% and it has trouble
with finalization.  That is there are some memory objects that hold
resources (locks, file descriptors, etc) that need to be release
before their space is recycled.  Objects that need finalization are
relatively rare however so this is not a major problem.

*Generational GC*  
   This is mostly an optimization from the observation
that most objects tend to die young, but those that don't tend to last
quite a while.  What is done is is you essentially have two heaps as
with Copy Collection but they are dynamically sized.  The old objects
heap, and the young objects heap.  As the heaps can be purely
conceptual it doesn't have any special requirements.  

   The objects are collected in the young heap by some a standard
algorithm, and after so many times surviving garbage collection the
objects are promoted to the old heap.  The heap of young objects is
convetionally collected only when it gets full, and old heap is
collected convetionally when the young heap is full and garbage
collection doesn't significantly reduce the size of the young heap.

   The greatest difficulty in this approach is traversing the young
heap withough traversing the old heap.  A common solution is to keep
track of writes to the old heap, and when a pointer into the young heap
is written to the old heap, find it, and have a whole list of those
pointer that can be used as roots, into the young heap.  

   An important point to make about this method is that the young and
old heaps don't need to be collected with the same algorithm.  A
really nice optimization, is that you can collect the old young heap
with a copy collector, and the old heap with a compacting collector
allowing almost total memory use with very good performance.

   The most important point as far as I can tell in a garbage
collector backed by virtual memory is to keep the paging effects low,
as it traverses it the memory.

   As for locating the start of objects from pointers into their
middles, a really simple technique would be to have a stack of
pointers to the start of objects in a stack allocated heap.  This
stack could then be searched using a binary search to find the start
of the object.  The worst case for such a probe is less then 20 memory

    There is a lot more I'll see what I can find by way of algorithms
etc.  I think I've got how to implement the basic traversal algorithms
around somewhere.