[gclist] Re: gclist-digest V1 #70

Giuliano Carlini giuliano@ix.netcom.com
Mon, 10 Jun 1996 12:52:23 -0700


Gustavo,

First off, please call me just G, or if that is too short, Giuliano. 
Carlini is far too formal.

Looks like good work. I'll be looking at it more carefully in a few days. 
I've got a couple of quick comment though.

>I did not want to publish this document before I had a working
>implementation that could prove the usefulness of the algorithm.
..
>I strongly think that a running prototype is necessary to completely
>show that the algorithm is correct, works, and it is useful.

I believe that this is essential. Without a real implementation, 
everything is just conjecture. While I intended to do this, it was going 
to take a long, long time. Before investing the time to do this, I thought 
it best to see what had been done already. I may still try to implement 
it. I may make a good product.

How is your proto-typing going?

>My apologies for not making this information available before.

None are needed. Our circumstances are much different.

>It stores all the
>back-tracing status in a single token-message to allow simultaneous
>back-tracings (iii.). It uses the RPC counter of exported objects as
>the barrier to verify the validity of back-references (iv.).

At first, I though about using this approach. I mailed a prior version 
using this suggestion to Paul Wilson asking for his impressions. I 
scrapped it for two reasons. First, due to the problem you note: the 
message size increases with the length of the chain. As a chain becomes 
very long, the cost of transmission becomes very expensive. It has been 
far to long since I studied complexity theory, so I may be wrong on the 
following. I believe that the cost of transmission is the sum of 1 to N = 
N/2(N+1). This "reduces" to a cost of O(N^2). I believe the transmission 
costs of algorithm's which I, Matt, and Arturo propose are O(N).

However, this is very unlikely for most "reasonable" programs.

If you continue with this approach, you may want to consider an 
optimization I rejected in favor of my current approach. You need only add 
a zombie to the list if it has more than one reachable stub. This works 
because the only way to reach a scion in this list is through either the 
original zombie or one which branched out. Thus, the cycle would have been 
detected at those points, and there is no need to include scions which 
reach a single stub in the message. While the cost is still O(N^2), the 
proportionality constant is dramatically reduced.

While in practice it is unlikely for the back-trace messages to get large, 
it will happen occasionally. I'm not sure, but I fear your approach will 
fail catastrohpically with this approach.

The second reason is that this approach is likely to duplicate the 
back-search many times. It is very likely that several nodes on the cycle 
will start a search "at the same time". If you allow them all to proceed, 
this increases the cost to O(N^3). If you suspend or abandon some, you 
must deal with deadlock.

>Carlini's solution will fail in the following case:
>For example, assume that zombie z1 reaches stub (proxy) p2.
>Then a back search examines p2, and start a search on scions that
>reach p2. It returns that no local reference was found. Then there is
>an RPC on z1 that assigns to a global variable a pointer to p2. Now p2
>is reachable from a local root. The back-search starts the second pass.
>Since no scions were created, the back search reports that p2 is
>unreachable.

I belive this can be handled by making z1 dirty. The effect is similar to 
your proposal, but requires a bit less storage.

I'll read your message a bit more closely, and review the other documents 
you mention as soon as I can.

Thanks for your comments,

g