[gclist] zillion objects

Nick Barnes Nick.Barnes@pobox.com
Fri, 24 Aug 2001 12:46:17 +0100


At 2001-08-24 07:00:43+0000, "Ji-Yong D. Chung" writes:
> Hi
> 
>     In my preceding email,
> 
> > [I wrote]  This question is a general one. In brief:
> >if you have a large number of objects, a small
> >fraction of which die (very slowly), is there a
> >method of automatic memory management
> >(i.e., gc), which would not (1) try to copy most of
> >those objects or (2) trace most of those pointers?
> 
>     Perhaps some background material on this would
> be helpful.
> 
>     Basically the question asks for a strategy for 
> implementing an automatic memory manager 
> (perhaps GC) for high-performance servers that
> do lots of caching.
> 
>     With today's decline in memory prices, servers like
> to put *everything* in memory (a few gigs?).  Garbage
> collectors or automatic memory managers for such
> servers need to be able to handle large amounts of
> data without significant collection pauses.  The
> emphasis on caching will increase as 64-bit machines
> become more prevalent.
> 
>     I don't know if a generational collector would be
> good for the large, cache-oriented servers described above.
> A generational collector will push all stable data to
> its oldest generation arena.  And eventually, it will
> still need to perform garbage collection on that
> generation.
> 
>     When it does, it has to scan perhaps up to a few gigabytes
> of data (perhaps tens or hundreds of gigabytes in
> extreme cases).  I am guessing that this has to 
> result in significant GC pauses.
>
> When one has so much data in RAM, cache
> misses caused by client requests will be relatively
> rare.  This means the rate of garbage generation
> (garbage generated by replacing the pointer to an old
> object with the pointer to a new object read in from disk
> due to a cache miss) would be slow as well.

It is certainly true that efficient garbage collection of gigabyte
heaps of long-lived objects is difficult.  But such a collection need
not involve long pauses.  Collection can be done incrementally, with a
number of short pauses, or even in parallel, with hardly any pauses at
all (many of these servers are SMP machines).

Your memory manager could also partition the heap in some way and use
remembered sets to find pointers between partitions.  That way it can
collect one partition at a time, using the remembered sets to avoid
scanning most of the heap.  You can think of the partitions as
generations, but there's no reason for the partition to be on the
basis of object age.  The success of this depends on the precision of
your remembered set implementation.  If you use a write barrier to
maintain your remembered set, the efficiency depends on the frequency
of pointer writes.  If you get a lot of write barrier hits for one
area of memory, take the barrier for that area down and record that it
may contain pointers to anything.  You will probably find that most of
your very large heap does not get updated very often; if you are able
to choose your partition well then you should be able to get an
efficient remembered set.
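
Here's a rough sketch in C of the kind of card-table remembered set I
have in mind.  The heap layout, the card size, and names like
heap_base are made up for illustration; they don't describe any
particular collector.

#include <stdint.h>

#define CARD_SIZE 512u                /* bytes covered by one card (assumed) */
#define HEAP_SIZE (1u << 30)          /* 1 GiB heap, for illustration */
#define NUM_CARDS (HEAP_SIZE / CARD_SIZE)

static char *heap_base;               /* start of the managed heap */
static uint8_t card_table[NUM_CARDS]; /* nonzero = card may hold pointers
                                         into other partitions */

/* Software write barrier: every pointer store into the heap also
   dirties the card covering the updated slot.  At collection time
   the collector scans only the dirty cards for pointers into the
   condemned partition, instead of scanning the whole heap. */
static inline void write_barrier(void **slot, void *new_value)
{
    *slot = new_value;
    card_table[((char *)slot - heap_base) / CARD_SIZE] = 1;
}

A card that takes a lot of barrier hits can simply be left permanently
dirty; that is the "take the barrier down and record that it may
contain pointers to anything" case above.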

Note that incremental collection uses memory barriers (read- and/or
write-protecting areas of memory), and so is only possible on some
platforms.  The same is true of many implementations of remembered
sets.
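
For concreteness, here is a minimal sketch of a page-protection
barrier on a POSIX platform, using mprotect and a SIGSEGV handler.
handle_barrier_hit is a hypothetical hook into the collector, and a
real implementation needs much more care (threads, telling barrier
hits apart from genuine faults, signal safety, and so on).

#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

extern void handle_barrier_hit(void *page);   /* collector-specific (assumed) */

static long page_size;

static void barrier_signal_handler(int sig, siginfo_t *info, void *ctx)
{
    void *page = (void *)((uintptr_t)info->si_addr
                          & ~((uintptr_t)page_size - 1));
    (void)sig; (void)ctx;
    handle_barrier_hit(page);                 /* e.g. grey or scan the page */
    mprotect(page, (size_t)page_size,         /* then lift the barrier */
             PROT_READ | PROT_WRITE);
}

void install_barrier_handler(void)
{
    struct sigaction sa;
    page_size = sysconf(_SC_PAGESIZE);
    memset(&sa, 0, sizeof sa);
    sa.sa_flags = SA_SIGINFO;
    sa.sa_sigaction = barrier_signal_handler;
    sigaction(SIGSEGV, &sa, NULL);
}

/* Read- or write-protect an area so that mutator accesses trap into
   the handler above (prot is PROT_NONE or PROT_READ). */
void protect_area(void *start, size_t len, int prot)
{
    mprotect(start, len, prot);
}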

Sketch of incremental collection:

Partition the heap into three colours: black, grey, and white
(tri-colour marking).  This partition obeys the tri-colour invariant,
namely that no black object points to a white object.  You are going
to change the colour of objects while maintaining this invariant.  At
the end of the collection the mutator is black and there are no grey
objects, so the white objects are dead and can be recycled.
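
In C-ish terms, and purely as an illustration (the object layout and
names are assumptions, not part of any particular system), the
invariant looks like this:

#include <stdbool.h>
#include <stddef.h>

typedef enum { WHITE, GREY, BLACK } colour_t;

typedef struct obj {
    colour_t    colour;
    size_t      num_fields;
    struct obj *fields[];     /* the object's pointer fields */
} obj_t;

/* Tri-colour invariant: no black object points to a white object. */
static bool invariant_holds(const obj_t *o)
{
    if (o->colour != BLACK)
        return true;
    for (size_t i = 0; i < o->num_fields; i++)
        if (o->fields[i] != NULL && o->fields[i]->colour == WHITE)
            return false;
    return true;
}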

In the initial partition, the condemned set (the part of the heap
which you are going to collect) is white.  The mutator (stack,
registers, etc) is grey.  Use the remembered set implementation to
identify which other parts of the heap don't contain any pointers to
the condemned set.  Colour these parts black.  Any other parts are
coloured grey.
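
Continuing the sketch above (partition_t and
remembered_set_may_point_into are hypothetical stand-ins for whatever
partitioning and remembered-set machinery you have), the initial
colouring might look like:

typedef struct partition partition_t;            /* some heap partition */
extern bool remembered_set_may_point_into(const partition_t *from,
                                          const partition_t *into);

/* Initial colour of a heap partition, given the condemned set. */
colour_t initial_colour(const partition_t *p, const partition_t *condemned)
{
    if (p == condemned)
        return WHITE;                 /* candidates for recycling */
    if (!remembered_set_may_point_into(p, condemned))
        return BLACK;                 /* provably holds no white pointers */
    return GREY;                      /* must be scanned */
}
/* The mutator (stacks, registers, and so on) always starts grey. */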

The basic unit of work for the collector is scanning a part of the
grey set, looking for pointers to white objects.  If it finds a
pointer to a white object, it colours the white object grey.  When it
finishes scanning a grey object, it can colour it black.
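
Using the obj_t sketch above (push_grey and pop_grey stand for a
hypothetical grey work list), one increment of work could be:

extern void   push_grey(obj_t *o);    /* hypothetical grey work list */
extern obj_t *pop_grey(void);         /* NULL when no grey objects remain */

/* Scan one grey object: grey every white object it points to, then
   blacken it.  This preserves the tri-colour invariant. */
static void scan_one_grey_object(obj_t *o)
{
    for (size_t i = 0; i < o->num_fields; i++) {
        obj_t *target = o->fields[i];
        if (target != NULL && target->colour == WHITE) {
            target->colour = GREY;
            push_grey(target);
        }
    }
    o->colour = BLACK;
}

/* One bounded increment of collection work between mutator pauses. */
void collect_increment(size_t budget)
{
    obj_t *o;
    while (budget-- > 0 && (o = pop_grey()) != NULL)
        scan_one_grey_object(o);
}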

While the mutator is grey (i.e. has pointers to objects of all three
colours), it has pointers to white objects and may attempt to write
these pointers into black objects.  You have to maintain the
tri-colour invariant, so you write-protect the black set.  If the
mutator attempts to write (a white pointer) into a black object, the
collector catches the protection signal and resolves the situation in
one of two ways: either (a) by colouring the black object grey (or
white) or (b) by colouring the white object grey.  Option (b) is
usually preferred because it maintains forward progress through the
collection.  However, if your grey set granularity is large (e.g. if
you have to grey a whole page of memory in order to grey a single
object), you may prefer option (a).
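
In terms of the sketch above, a barrier hit while the mutator is grey
might be resolved like this; grey_granularity_is_coarse and
grey_containing_page are hypothetical helpers standing in for whatever
policy and page bookkeeping you have.

extern bool grey_granularity_is_coarse(void);  /* policy choice (assumed) */
extern void grey_containing_page(obj_t *o);    /* option (a): regrey coarsely */

/* Called when the mutator tries to store a pointer to a white object
   into a black object. */
void on_black_write(obj_t *black_obj, obj_t *white_value)
{
    if (grey_granularity_is_coarse()) {
        /* (a) Colour the black object (or its whole page) grey again;
           it will be rescanned later. */
        grey_containing_page(black_obj);
    } else {
        /* (b) Colour the white target grey, preserving forward progress. */
        white_value->colour = GREY;
        push_grey(white_value);
    }
    /* Either way the store can now proceed without creating a
       black-to-white pointer. */
}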

At some point, the collector decides to scan the mutator itself
(i.e. stack, registers, etc), turning it black.
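
A sketch of that flip, assuming a hypothetical root iterator
for_each_root over the mutator's stacks, registers, and globals:

extern void for_each_root(void (*visit)(obj_t **slot));

static void grey_root(obj_t **slot)
{
    obj_t *o = *slot;
    if (o != NULL && o->colour == WHITE) {
        o->colour = GREY;
        push_grey(o);
    }
}

/* Blacken the mutator: grey everything it refers to directly, then
   switch from write-protecting the black set to read-protecting the
   grey set, as described below. */
void blacken_mutator(void)
{
    for_each_root(grey_root);
}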

While the mutator is black, it only has pointers to black and grey
objects.  It could get a pointer to a white object by reading it from
a grey object.  You have to maintain the tri-colour invariant, so
you read-protect the grey set.  If the mutator attempts to read (a
pointer to a white object) from a grey object, the collector catches
the protection signal and resolves the situation in one of three ways:
either (a) by colouring the mutator grey again (and switching the
barriers back around), or (b) by scanning the grey object, turning it
black, or (c) by turning the white object grey.  (c) is an
optimization of (b), because if you scan the grey object then you will
turn the white object grey anyway.  (a) is usually not a good choice.
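
Again in terms of the sketch above, option (c) would look something
like this:

/* Called when the black mutator tries to load a pointer from a grey
   object; slot is the field being read. */
void on_grey_read(obj_t *grey_obj, obj_t **slot)
{
    obj_t *target = *slot;
    if (target != NULL && target->colour == WHITE) {
        /* (c) Grey the white object the mutator is about to see. */
        target->colour = GREY;
        push_grey(target);
    }
    /* Option (b) would instead be scan_one_grey_object(grey_obj),
       blackening it and lifting its read barrier. */
    (void)grey_obj;
}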

Ravenbrook has a toolkit for writing a collector like this, and the
expertise necessary to make it work.  For more information, see
<http://www.ravenbrook.com/services/mm/>.

Most of the technical terms I have used here are described in the
glossary of our Memory Management Reference
<http://www.memorymanagement.org/>.

Nick Barnes
Ravenbrook Limited