[gclist] Allocator algorithms.

David Chase chase@centerline.com
Mon, 20 May 96 16:55:56 -0400


> Under the conservative collector operating at a page level
> is a storage allocation algorithm. By operating at a page
> level I mean that requests smaller than a page are done by
> breaking up a page taken from the allocator.

> Anyway we are unhappy with circular first fit

I've never heard of it working well for anyone, actually. I'm pretty 
poor at remembering the names of these algorithms (another nice 
thing to summarize for the FAQ), but for big blocks (multiples 
of page size) what intuitively seems right is to always prefer lower 
addresses.  (What is this called?  Memory-ordered first-fit?) Even 
if you get fragmentation, you can always use it for additional memory 
for a small-block pool in the future.  The intuition here is slightly 
informed by the failure modes of the circular policies -- they have 
a tendency to fragment all of memory.  MOFF on big blocks (for lack 
of a better name for it) concentrates fragmentation at one end of 
the address space.  You may pay a little more in searching to get 
a really big block, but so it goes.
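
To make the policy concrete, here is a minimal sketch of what I mean 
by memory-ordered (address-ordered) first fit: keep the free list 
sorted by address and take the first, i.e. lowest-address, block that 
fits.  The struct and function names are made up for illustration, 
not taken from any particular allocator.

```c
#include <stddef.h>

/* A free block in a list kept sorted by ascending address. */
struct free_block {
    size_t size;              /* block size in bytes */
    struct free_block *next;  /* next free block, at a higher address */
};

/* Memory-ordered first fit: return the lowest-address block that can
 * satisfy the request (unlinking it from the list), or NULL if none
 * is big enough. */
static struct free_block *moff_alloc(struct free_block **list, size_t want)
{
    for (struct free_block **p = list; *p != NULL; p = &(*p)->next) {
        if ((*p)->size >= want) {
            struct free_block *b = *p;
            *p = b->next;     /* unlink */
            return b;
        }
    }
    return NULL;
}
```

Because allocation always prefers the low end, leftover fragments 
accumulate at one end of the address space instead of being smeared 
everywhere the way a roving circular pointer smears them.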

> We are looking for Cartesian tree algorithms
> and best fit.

You might try a Cartesian tree.  If you wish to do a quick-and-sleazy 
experiment, one of the Sun memory allocators (SunOS or Solaris, 
probably SunOS) uses Cartesian trees (it's the really slow one that 
doesn't use much memory -- I've tried to beat it myself in the past, 
and it is tough).  You could, for instance, keep a trace of big-block 
allocs and frees, and then feed it to the Sun allocator and see how 
it does. 
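
The trace-replay harness is not much code.  A sketch, assuming a 
made-up trace format of "a id size" for allocations and "f id" for 
frees (the format, the table size, and the function names are all my 
invention here):

```c
#include <stdio.h>
#include <stdlib.h>

/* Replay an alloc/free trace against whatever malloc/free the program
 * is linked with, so you can compare allocators on a recorded
 * big-block workload. */
#define MAX_IDS 100000

static void *live[MAX_IDS];   /* live[id] = pointer for trace object id */

static void replay(FILE *trace)
{
    char op;
    unsigned id;
    size_t size;
    while (fscanf(trace, " %c %u", &op, &id) == 2 && id < MAX_IDS) {
        if (op == 'a' && fscanf(trace, "%zu", &size) == 1) {
            live[id] = malloc(size);
        } else if (op == 'f') {
            free(live[id]);
            live[id] = NULL;
        }
    }
}
```

Link the same replay against your allocator and the Sun one, and 
compare the resulting footprints.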

> An interesting story of how simple probabilistic thinking
> can go far astray. If you allocate large objects out of
> 4K pages you would expect the average loss to be about
> 2K. It is in fact far higher.

> Programs often allocate in blocks that are multiples of
> 4K, they know about page sizes too. The collector adds a
> byte at the end so that pointers one off the end of an
> object will keep it alive. 4K + 1 needs a block of 8K.

(Note, of course, that this is an artifact of C and C++, and of
not terribly many other languages.)
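
The rounding is worth spelling out.  A sketch of the computation the 
quoted text describes (the function name is mine; only the PAGE 
constant and the one-byte pad come from the description above):

```c
#include <stddef.h>

#define PAGE 4096u

/* The collector pads each object by one byte, so that a pointer just
 * past the end of the object still keeps it alive; then the padded
 * size is rounded up to whole pages. */
static size_t pages_for(size_t request)
{
    size_t padded = request + 1;        /* one-past-the-end guard byte */
    return (padded + PAGE - 1) / PAGE;  /* round up to whole pages */
}
```

So a program that "helpfully" requests exactly 4096 bytes gets charged 
two pages, and the average loss per big block is far worse than the 
naive half-page estimate.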

A heuristic that you might try is to alternate big-block pages with 
little-block pages, and leave the first little block unallocated.  
Obviously, this only makes sense for one- or two-page big blocks, else 
you are courting fragmentation disaster.  In any case, note that 
once you get past 2 pages, adding an additional page is "only" 33% 
fragmentation (allocating 4 pages for a 3-page request).
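
The arithmetic behind that 33% figure: if the guard byte forces an 
n-page request to consume n+1 pages, the waste relative to the request 
is 1/n, which falls off quickly.  A trivial sketch (function name 
mine):

```c
#include <stddef.h>

/* Fragmentation overhead when an n-page request is charged n+1 pages:
 * (allocated - requested) / requested = 1/n. */
static double page_overhead(size_t request_pages)
{
    size_t alloc_pages = request_pages + 1;  /* guard byte adds a page */
    return (double)(alloc_pages - request_pages) / (double)request_pages;
}
```

One-page requests pay 100%, two-page requests 50%, three-page requests 
the 33% quoted above; past that it is increasingly ignorable.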

Another potential use for that storage is for temporary storage
during other GC-like activities.  For instance, I think it would 
be perfect for your mark stack/queue.  That's a temporary use, meaning 
that it won't pin pages and contribute to future fragmentation, and
it is a peak use (you only mark when you GC, you only GC when you 
need storage).  You could also build "mobile" allocator data structures
that are eligible for relocation after a garbage collection (simply 
put, whenever the allocation responsible for a "hole" comes free, 
evacuate the remainder of the page if another hole is available.
Always prefer lower addresses).

David Chase