[gclist] FAQ update
David Chase
chase@centerline.com
Fri, 08 Mar 96 14:32:29 EST
> I've been reading the FAQ, so as not to ask too many stupid questions,
> but it didn't work. :-)
So, we need to answer this one, too. No problem.
> I came accross the following passage:
>
> > ... Most programs don't do these things,
> > so most programs work with a garbage collector. Ordinary (legal)
> > pointer arithmetic is tolerated by garbage collectors for C.
> I don't quite get the last claim. How do you reconcile pointer
> artihmetic with [for example] mark-sweep GC? I can rephrase my
> confusion as:
> How does the collector determine object boundaries in the context of
> a non-gc aware compiler so that the entire object gets marked and
> all pointers in it get followed, and not just part thereof?
The easiest way I know of is based on Big Bag of Pages. That is,
you keep an array, indexed by page number, that describes what is
on each page. Small objects of the same size are all allocated on
the same page -- large objects are allocated on contiguous pages.
Given an address, to find out where the object containing it is,
abnd how large it is, you:
1. lookup the per-page info.
2. if it is a page of small objects
A. Subtract the page start from the address to get the offset.
B. Take the remainder of the offset divided by the page size (better
if this is a power of two).
C. Subtract the remainder -- that is your object start.
And, of course, you already know the object size.
3. not small objects
A. Either it addresses the first page of a large object, or
B. It addresses a page within a large object, typically encoded
as "look N pages previous to figure out what this is".
The Boehm-Weiser collector uses this algorithm. This is also a good
trick for malloc, because it allows you to dispense with any header
words to tell how large the object is. Since the header is typically
8 bytes long, avoid this header is good if you are allocating many
small objects.
David Chase