[gclist] Length of dependency chains

Jerry Leichter leichter@smarts.com
Thu, 17 Apr 1997 09:29:05 -0400


In his discussion on guardians, Hans-Juergen Boehm made the following remark 
about the length (N) of dependency chains seen by the collector:
| My argument is that based on my experience, admittedly with a
| different collector, this has never mattered.  I suspect the reason is that
| N was never larger than 2 or so, at least after proper treatment of pointers
| not needed for finalization.  (I would guess that actual path lengths can
| be significantly longer.  But N here is the number of finalizable objects
| in the chain, not the length of the chain.  And finalizable objects are 
| usually rare.)

This brings up an interesting point:  Just how long *are* typical dependency 
chains?  That is:  What is the expected length of the shortest path from a root 
to a live element?  I suspect that, for most programming styles, not only is the 
expected value quite small, but that the distribution is heavily skewed toward 
small values, with outliers being quite rare.  (One can certainly construct 
arbitrary long linear chains; and if you use heap allocation for closures, you 
get a chain as long as the dynamic call chain - with a stack, the whole stack is 
in the root set.)

The reason this might be interesting is that one might conceivably be able to 
use the assumption of short paths in a GC algorithm.  Consider the following 
analogy, which I've been thinking about for years but have never been able to 
translate into a practical algorithm:  You've got a room with a large bunch of 
linking rings on the floor (the free space).  On the ceiling are a bunch of 
hooks (the root set).  You can hang rings off the hooks, or off each other.  As 
long as there is a path from a given ring to a hook, it remains suspended in the 
air.  As soon as the last path is broken, the ring falls to the floor (i.e., is 
freed).

If you do this with real rings, ignoring tangles among chains (e.g., get a 
supply of that famous frictionless lubricant used in constructing examples for 
elementary physics text), "dead" rings begin to fall to the floor "immediately". 
Of course, a closer analysis reveals that, in fact, information about discon-
nections and re-connections has to pass from ring to ring, and can't propagate 
faster than the speed of sound in the rings.

If you were to simulate this process, the speed of sound would correspond to how 
fast you could walk the chain of dependencies *backwards*.  However, if chains 
are indeed almost always short, this might be practical.

The big potential advantage of this approach is that it's inherently paralleliz- 
able.

Even assuming a limited dependency length, the vague algorithm as I sketched it 
probably wouldn't be too useful.  But perhaps there are variations that would 
be.

Does anyone have figures on typical dependency chain lengths?

							-- Jerry