[gclist] Hughes' algorithm

Paul R. Wilson wilson@cs.utexas.edu
Fri, 22 Mar 1996 09:37:41 -0600


>From shapiro@prof.inria.fr Fri Mar 22 03:04:43 1996
>Subject: Re: [gclist] Hughes' algorithm
>
>Topic is Hughes' algorithm for doing a distributed trace.  A reference is
>provided at the end for those who would like to jump in.
>
> || From: wilson@cs.utexas.edu (Paul R. Wilson)
> || Date: Thu, 21 Mar 1996 09:51:04 -0600
> || 
> || Suppose I have a very long list, which takes a very long time to traverse.
> || Initially, everything has mark 0---it's never been traversed at all.
> || For simplicity, assume that this list is so long we'll never actually
> || finish tracing it.
>
>Then the global minimum (GC timestamp threshold) will never get any higher
>than 0, and nothing will be collected.  It's safe.
>
>As long as ANY process still has a 0 mark to propagate, then the global
>minimum will not make any progress.

OOPS, I messed up my example---I should have made the list finite for
clarity.  (Actually, I assumed that the new objects were created with
0's already on them, i.e., allocating black with respect to GC 0,
so that there would NOT be unpropagated 0's.  The example will work with
a finite list, though, so I'll go with that.)

I need an example where there are no marks _left_to_propagate_, but
there are still 0 marks.

Suppose the list has 10 elements, and during the 5th GC, you end up with
the list being marked this way:

    5 -> 5 -> 5 -> 5 -> 5 -> 0 -> 0 -> 0 -> 0 -> 0

That is, any marks for 2, 3, and 4 that started down the list got clobbered
by 5's.  This is not only possible, but not particularly unlikely, if I'm
not mistaken, because you preferentially propagate high marks.  (You
won't get this in a local list, because the local traversal propagates
the highest mark as far is it will go locally, but you may get it in
a distributed list, I think.)

Now suppose for simplicity that there are no unpropagated marks except 5's,
anywhere in the system.  There are only lists like the above, with
5's not yet propagated, and tails that are all zeroes or all 1's or
whatever.  (For this example could even assume that there are no 1-4's
anywhere in the system, propagated or not---they've all been clobbered
by 5's.)

How do we detect termination?  Clearly, the above situation is one in
GC 0 has terminated and GC 5 has not, but it seems to me that GC's
1 through 4 haven't terminated, either---things that were reachable
at times 1 through 4 are still reachable.

My understanding was that Hughes algorithm determines GVT (termination)
by *unpropagated* marks, not just marks.  So the above situation will
look like GC's 1 through 4 have all completed.  (I don't see how it
could use the presence of propagated marks to detect non-termination;
that's the information it uses after termination to discriminate
between live objects and garbage.)

By clobbering lower marks with 5's, it seems to me that we're throwing
away crucial information, which says "I haven't finished propagated
this mark down this path, so I'm not finished with this GC."

It still looks broken to me.  

> || I start a mark 1 going around the system, and it makes a little headway
> || on this list.  Later, I start mark 2, and again later I start mark 3.
> || 
> || At some point, all of the 1 marks may get clobbered by 2's or 3's,
> || even if the list has never been traversed completely.
>
>That's fine.  That means that everything that could be reachable from a 1 mark
>is also reachable from a 2 or 3 mark.

Right, I agree that that stuff is definitely live.  What I'm worried about
is whether the GC will erroneously detect termination of the 1 GC, and
think it's garbage.   That is, it may detect "termination" of the 1
GC (no 1 marks that haven't been propagated as far as they will go) when
there are still things that _could_have_been_ reached by 1 marks---but
weren't because they got clobbered by later marks and 1 propagation stopped.

(E.g., all the objects marked 0 in the picture above were reachable by
1, but all the 1's disappeared before being propagated all the way.
They're hidden in the 5's.  GC 1 looks like it has terminated, but it
hasn't really.)

>If there are no more 1 marks left to be
>propagated, that means that all 1 traces have been caught up by 2 and 3
>traces, so it's safe to increase the threshold to 1.

I don't see this.  Since the 2 and 3 marks have hidden the 1 marks, that
means that the only way we can detect termination of 1 is by propagating
all of the 2's and 3's as far as they will go, to see what the 1
_might_have_ been able to reach if we hadn't clobbered its marks.

And so on---as long as there are higher-numbered marks in the system,
you can't terminate because you don't know what information about
earlier GC's has been lost due to clobbering.