[gclist] How a conservative collector turns potential pointers into descriptors.

Charles Fiterman cef@geode.geodesic.com
Thu, 22 Feb 1996 08:30:13 -0600


> > As it happens, it is also easy to write a conservative GC.
> > 
> Well, I guess I must be dump, or misinformed, so please help me. My
> initial idea of a conservative GC was about a system which does
> scan memory, and uses heuristics to determine what might be valid pointers.
> If this assumption is true, it maybe is easy, but seems like very low-level
> stuff to me. How efficient can such a test be ?
> 
> It maybe is easy to write, but how easy would it be to optimize ?  What 
> happens with portability ? If I am wrong regarding what such GC systems 
> are, please let me know.

There are some public domain conservative GC's on the market and we have
one we are marketing. While there is work on each port it is not crushing.

The test for valid pointers is not a heuristic and is very fast. Let me
explain it.

All heap memory comes from the system via something like sbrk(size_t n)
It will all be in some range which you can keep track of. If a potential
pointer is not in that range its not a pointer. For the entire heap range
keep an array of descriptors, one for each HBSIZE where HBSIZE is something
like 4K that works well with your paging system. Given a pointer in the
range simple arithmetic gets you to the the right descriptor. If the descriptor
says NOT_OURS then its not a heap pointer. If it is and points to
a large object >= HBSIZE then the descriptor tells all about the object.
If it points to a small object that object will be allocated in a block
with a lot of other small objects the same size. When a user asks for
a 12 byte block allocate an HBSIZE of 12 byte blocks and give the user 
one. The next 12 byte block comes out of this block. So a pointer to this
block is quickly turned into a pointer to the right object descriptor.

All this is very simple. But doing it well, being able to replace the whole
of the compiler's memory managment system down to the last _heapwalk() and
_dbg_malloc() is lots of work. Like anything else the difference between
a quick system and a commercial product is huge. There are huge numbers
of little improvments.