[gclist] Cheaply computing Inverse Reference Graph.

Giuliano Carlini giuliano@ix.netcom.com
Tue, 18 Jun 1996 14:53:55 -0700


A few people have asked about the cost of computing the IRG. I've made 
the claim that it's cheap, and have given partial justifications for 
this belief in prior messages. I thought I'd consolodate them plus add 
an improvement.

As Matt Fuch's has noted, the SSP chain approach constructs a close 
approximation to the IRG between spaces. Close enough that it can be 
used in place of the IRG without inducing a premature reclaim.

What follows is an algorithm to construct a partial IRG through a 
space. This IRG is incomplete, but sufficient for tracing to determine 
if a "zombie" object is dead or live.

The problem with tracing the IRG is that constructing it is an O(n^2) 
algorithm. Computing the full IRG would be much too expensive. The 
solution is to compute only a small part of the IRG. We need to exclude 
as many objects from the computation as possible.

First we can exclude everything but stubs. These are the only objects 
examined by the distributed collector. Second, we can exclude locally 
reachable stubs; these are obviously reachable. Third, we can exclude 
all stubs reachable from non-zombie scions.

If a scion has been recently mentioned in an RPC message, then it is 
most likely alive. Therefore, it is treated as live. If we treat it as 
live, then any stub reachable from it should be treated as live. Thus, 
the IRG should not be constructed for any stub reachable from a 
recently mentioned scion. This won't change the objects reclaimed by 
the distributed collector I initially presented, since it already 
treated recently mentioned scions as live. This change will not only 
make building the IRG faster, but it will slightly improve the speed of 
the distributed collector, since it will now never even reach recently 
mentioned scions.

In my first draft, local collection was split into two parts:
	- scanning the local roots: registers, stacks, and globals
	- scanning the scions.
Each part used a different scanner. I called them the "local" and "RV" 
scanners. "RV scanner" was never a good name, but isn't even 
descriptive any more. I'll now calling it the "IRG builder".

Now scanning is split into three parts:
	- scan the local roots with the local scanner.
	- scan non-zombie scions with the local scanner.
	- scan zombie scions with the IRG builder.

It is easy to ensure that almost all scions will not be zombies. 
"Recently mentioned" can be defined according to past history on a per 
scion basis. Thus, even scions which normally lie dormant for long 
periods of time aren't considered zombies.

While quadratic behavior will occur, it should be very rare. Any ideas 
on how to reduce its incidence further?

g