[gclist] Length of dependency chains
Jerry Leichter
leichter@smarts.com
Thu, 17 Apr 1997 09:29:05 -0400
In his discussion on guardians, Hans-Juergen Boehm made the following remark
about the length (N) of dependency chains seen by the collector:
| My argument is that based on my experience, admittedly with a
| different collector, this has never mattered. I suspect the reason is that
| N was never larger than 2 or so, at least after proper treatment of pointers
| not needed for finalization. (I would guess that actual path lengths can
| be significantly longer. But N here is the number of finalizable objects
| in the chain, not the length of the chain. And finalizable objects are
| usually rare.)
This brings up an interesting point: Just how long *are* typical dependency
chains? That is: What is the expected length of the shortest path from a root
to a live element? I suspect that, for most programming styles, not only is the
expected value quite small, but that the distribution is heavily skewed toward
small values, with outliers being quite rare. (One can certainly construct
arbitrary long linear chains; and if you use heap allocation for closures, you
get a chain as long as the dynamic call chain - with a stack, the whole stack is
in the root set.)
The reason this might be interesting is that one might conceivably be able to
use the assumption of short paths in a GC algorithm. Consider the following
analogy, which I've been thinking about for years but have never been able to
translate into a practical algorithm: You've got a room with a large bunch of
linking rings on the floor (the free space). On the ceiling are a bunch of
hooks (the root set). You can hang rings off the hooks, or off each other. As
long as there is a path from a given ring to a hook, it remains suspended in the
air. As soon as the last path is broken, the ring falls to the floor (i.e., is
freed).
If you do this with real rings, ignoring tangles among chains (e.g., get a
supply of that famous frictionless lubricant used in constructing examples for
elementary physics text), "dead" rings begin to fall to the floor "immediately".
Of course, a closer analysis reveals that, in fact, information about discon-
nections and re-connections has to pass from ring to ring, and can't propagate
faster than the speed of sound in the rings.
If you were to simulate this process, the speed of sound would correspond to how
fast you could walk the chain of dependencies *backwards*. However, if chains
are indeed almost always short, this might be practical.
The big potential advantage of this approach is that it's inherently paralleliz-
able.
Even assuming a limited dependency length, the vague algorithm as I sketched it
probably wouldn't be too useful. But perhaps there are variations that would
be.
Does anyone have figures on typical dependency chain lengths?
-- Jerry