[gclist] Hughes' algorithm

Paul R. Wilson wilson@cs.utexas.edu
Thu, 21 Mar 1996 11:21:44 -0600


>From shapiro@prof.inria.fr Thu Mar 21 02:40:04 1996
>Subject: Re: [gclist] Hughes' algorithm
>
>Conceptually there are many traces occuring in parallel, but at one process a
>single trace thread runs periodically.  I.e. multiple traces do not overlap at
>any single process.

I think it might be better to say that multiple traces do overlap at a
single process, but they're optimized into one trace by combining
their marks---propagating the highest mark (timestamp) implicitly propagates
all of the lower ones too.

> || Second question: is the trace a snapshot trace, or an incremental update
> || trace?
>
>I don't think the paper says anything about this, and I don't see why the
>question is relevant.

I've never been able to decide whether an incremental algorithm is correct
without modeling it in terms of tricolor marking.  It's the only way
I know to make the invariants and strategy simple and clear.

This is a problem I have with most of the distributed garbage collection
literature---there are lots of proofs lying around, but I don't think
most people understand anybody else's algorithms very well.  The textual
descriptions are unintuitive, and the proofs are entirely unreadable.
I suspect that some of the proofs are wrong, and most are just too
baroque to understand and check.

In the case of Hughes's algorithm, I've never even been confident that
the minus-oneth order simplified version is correct.   It's a pipelining
of multiple tracing traversals, but I don't even know for sure whether
he's doing a correct single traversal before applying the pipelining
fanciness.

In a lot of papers, I think the handling of message passing is way too
complicated.  It would not be so hard if people just thought of it
in terms of tricolor marking---and use of the tricolor marking would
make it clearer where conservatism creeps in with respect to the
handling of messages.  (Most people seem to mark anything in a message
black.  This can introduce considerable conservatism if you're allocating
white.  Anything short-lived that's ever reachable from anything in a
message will be considered live.