GC -- some explanations.

Patrick Premont premont@cs.toronto.edu
Mon, 15 Apr 1996 17:25:05 -0400


> 
> >[One a many-heap Copying GC system,
> >only space the largest heap needs be reserved]
>    I wish that'd be true.
> However, in the case there be circular references *accross* heaps,
> major GCs (not the usual ones, but eventual ones),
> would have to be done *simultaneously*.

Yes, by seperate heaps I meant really seperate (no pointers come in
from the outside). A list of such pointers could be maintained when
they are not exactly separate to do some conservative GC as in a
generational GC.

But I presume that there will be many cases where we want a heap that
may point to some outside heaps (which may not be GCed, like a bunch
of code, do you call these heaps?) but isn't pointed to, or rarely
pointed to.

> Surely, such major GCs seldom need be done,
> but they must be done sometimes.
> Thus, either swap space must be reserved for all target heaps,
> or the major GC must be done with a mark&sweep algorithm
> that doesn't need a target space.

Those heaps which aren't pointed to (I'm not sure how frequent that
will be) can be GCed in s special way just before the major GC
so that the outgoing pointers are remembered and used as part of the
root set in the following major GC.

> I favor the last solution, if only it doesn't impose too harsh constraints
> on the heap (that would have to prepared for double GC).

Yes. A check for the total size of not-pointed-to heap could be done first,
as well as other check for the expected savings associated with any other
applicable trick, before resorting to aglobal mark&sweep method.
Litterature on global GCs in distributed systems should contain tricks
to restrict the scope of GCs (just a guess, I havn't read about that).

So Fare, does the fact that you're trying to find solutions for major
CGs when there are dependant heaps mean that you like the basic idea
of what I suggested (open ended heaps, letting the system add pages
until it feels it's better to stop and GC, serialized GCs for
independant heaps,...) ?

> Please also note that a copying GC thread does need stack space proportional
> to the longest pointer chain in addition to space equivalent
> to the total allocated space,
> while a mark&sweep GC rather needs an additional field in each object,
> or equivalently space proportional to the number of objects allocated.

The Cheney algorithm for copying traversal doesn't require any stack, it
does a breadth first traversal where the memory of the traversal (the list
of open nodes) is conveniently embeded in the to-space. It keeps two
pointers Scan and Free in the to-space. What is between those pointers
are the open nodes. What comes before Scan is done and what comes after
Free is free space where the copying will continue. The process is basically
of looking at the node at Scan and if it points to the from-space, you copy
that pointed node just after free and advance both pointers accordingly.
The details are slightly different to ensure that data isn't copied twice
(using forwarding pointers). The process ends when Scan meets Free.

Isn't it cool ?

Patrick