[gclist] Re: gclist-digest V1 #74

Giuliano Carlini giuliano@ix.netcom.com
Sun, 16 Jun 1996 11:46:36 -0700


As I noted to Hans, if the choice is between you or I being in error, I'll put my money on 
your being correct. But, I can't figure out where I'm wrong, so I have to pursue this too the 
grisly end ;->

>>Why is it cheaper to go forward? 

I'm still curious about how to do this, and how you handle the case I presented?

>>The local IRG is computed during the local GC
>>very cheaply.
>
>I admit the local collector may compute the Inverse Reference Graph, which
>will bring the cost of a reverse search down.  However, remember that a
>standard tracing collector goes through any reachable object only once.

You are correct. In my proposal, what I call the "RV scanner" must trace through those nodes 
it has seen before, so that the partial IRG it is constructing will be correct.

However, the RV scanner runs after a standard local scanner which traces through the local 
roots: registers, stacks, and globals. The partial IRG which the RV scanner contains only 
those stubs which aren't locally rooted. Thus, the RV scanner does not need to trace through 
any node traced by the local scanner.

Perhaps I'm wrong, but I expect the vast majority of reachable objects to be locally 
reachable. Further, I expect that usually an object will have only a single pointer to it. At 
least, the work on one bit reference counting seems to demonstrate this. This implies that 
typically, the additional work placed on the RV scanner to compute the IRG is close to O(n) 
with respect to the number of locally unreachable stubs. I'm not sure about this; it might be 
with respect to the number of scions. In either case, this typical cost is quite low.

But, in really deviant data sets, the additional cost might be O(n^2).

It would be interesting to examine real programs to construct a histogram of the number of 
pointers to each object. I think it would show that my assumption of typically one pointer to 
an object is correct. You might disagree. But, we can guess all we want. Only an experiment on 
a large number of programs will produce any results we can rely on.

>Also, as soon as the mutator changes something, your local IRG may become
>inconsistent with the reachability graph.  This may or may not be important
>depending on the details of the algorithm.

Tracing the IRG requires that each stub can discover whether it is reachable. If a stub is 
locally reachable, it will know this, and we don't need the remote IRG for it. Thus we need 
only construct a partial IRG for the locally unreachable stubs. Once a node is locally 
unreachable, it will never become reachable due to the actions of the local mutator.

However, while the local mutator can't make a stub locally reachable, some remote mutator may. 
A locally unreachable stub's IRG can be altered, if the space recieves a message which 
mentions one of the scions which can reach the stub. This can be detected by maintaining a 
barrier. Any scion mentioned by a message is marked dirty. Among other conditions, a stub is 
considered reachable if one of the scions in its IRG is marked dirty. This may temporarily 
prevent an unreachable stub from being collected. But that isn't a big deal. What is important 
is that this guarantees that a stub is not reclaimed too early.

g