[gclist] Guardians

Fergus Henderson fjh@cs.mu.oz.au
Sat, 12 Apr 1997 05:02:11 +1000 (EST)

Carl Bruggeman, you wrote:
> Fergus Henderson wrote:
> > Carl Bruggeman wrote:
> > > 
> > > I believe that this confirms that topological ordering is not always
> > > the correct order.  You explicitly clear pointers to change the order
> > > of finalization (requiring global knowledge of the dependencies on the
> > > part of the programmer to do so), while we advocate using multiple
> > > guardian queues to handle finalization dependencies. 
> > 
> > Explicitly using weak pointers or "weak-for-finalization" pointers
> > (pointers which are ignored when determining finalization order) does
> > not require global knowledge of the dependencies.  To declare a
> > particular pointer which is a member of type A as "weak-for-finalization",
> > all you need to know is that the finalizer for type A will still work
>   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > if that pointer has been cleared because the object it points to has
>   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > already been finalized.  This can be determined by looking at the code
>   ^^^^^^^^^^^^^^^^^^^^^^
> > for A's finalizer.
> This is precisely my point.  The programmer must know that the object
> pointed at by the weak-for-finalization pointer, __and__ all the
> objects pointed at by that object, and all the objects pointed at by
> those objects, etc.  don't depend on A.

No, that's backwards.  If those objects depend on A, no problem!

The only time you get a problem from declaring a particular pointer
which is a member of type A as "weak-for-finalization" is if A's
finalizer dereferences that pointer.  And you should be able to
determine whether or not that is the case by looking at just the code
for A's finalizer and the interface documentation for the routines it
calls.  You shouldn't need to look at the implementation of the
routines it calls.

> For simple cases (probably
> the vast majority), only the code for A needs to be inspected.  For
> more complex cases (such as a cycle of dependencies or chains where
> the finalizer code may be more complex) the programmer must bring
> global semantic knowledge to the problem.
> All I am arguing is that _in some cases_ the programmer must have
> global knowledge of the program's finalization dependencies to know
> which pointers must be weak-for-finalization,

You're talking about a different problem to the one I was talking
about above.  Here you are talking about what knowledge is required
to decide which pointers _must_ be weak-for-finalization.  I am
talking about what knowledge is required to decide which pointers
_can_ be weak-for-finalization. 

These two properties could be called completeness and soundness respectively.
Local soundness: can I safely declare this pointer as weak-for-finalization?
Completeness: have I declared enough pointers as weak-for-finalization
to ensure that the topological dependencies amoung finalizers are acyclic?

Proving local soundness requires only local knowledge.
Proving completeness requires global knowledge.

If you are careful to make pointers weak-for-finalization whenever
possible, then the only time you will have any problems is if the
finalization dependencies really are cyclic, at least when considered
at the granularity of whole objects.  In that case, you're
going to have to redesign your application to avoid the cycle,
for example by splitting an object into two parts which can be finalized

But that case is very rare.  Pulling some numbers out of a hat, in
probably >99% of code you don't need finalizers, in probably >90% of
code needing finalizers you don't have any cycles in the topological
ordering, and in probably >90% of code with finalizer cycles in the
topological ordering the cycles can be broken by making pointers
weak-for-finalization whenever possible.  Also as Hans noted, the
failure mode is relatively benign -- you get a resource leak and
a warning.

> When finalization
> dependencies are simple enough, both approaches have simple interfaces
> that requires little (guardians) or no (topological) work.  When the
> programmer is in error bringing global knowledge to the problem, such
> as breaking a cycle or chain in the wrong place with a weak pointer,
> memory can be deallocated incorrectly (or other "dangerous" things can
> occur) using either approach.

No, that's incorrect.  With the topological approach, if a programmer
is incorrect in applying global knowledge, the only thing that can
happen is failure to call finalizers and if desired a warning to the user
that the GC detected a finalizer cycle.

The other dangerous thing that can happen with the topological approach
is that a finalizer might depend on a pointer that was declared as
weak-for-finalization, and the object it points to may have already
been finalized, so you might get a dangling pointer or null pointer
dereference.  However, that can happen only if the programmer is
incorrect in applying _local_ knowledge.

> I think guardians provide a much better approach when more control is needed

When is more control needed?  Can you give some convincing examples of why
we would need this extra control?

I had a look at the Boehm collector's interface for weak pointers,
namely `GC_general_register_disappearing_link(&ptr, &obj)', which
sets `ptr' to null when `obj' is known to be unreachable.  This is not
quite what I had in mind for "weak-for-finalization" pointers,
which I defined as "(pointers which are ignored when determining
finalization order)".  Weak-for-finalization pointers in an object
could be cleared just before calling the object's finalizer, but they
should not be cleared any earlier.  If the collector clears such
pointers all at once after a GC before invoking any finalizers, that
could cause problems.  It would mean that if you are finalizing say A,
which has a (strong) pointer to B, which has a (strong) pointer to C,
then the weak-for-finalizations pointers in object C may be cleared
before A's finalizer is executed.  That is a problem because A's
finalizer may call a method in B which calls a (non-finalizer) method
in C which depends on C's weak-for-finalization pointers.

So I guess `GC_general_register_disappearing_link()' can only be used
for weak pointers, not for "weak-for-finalization" pointers.  Oh well.

Fergus Henderson <fjh@cs.mu.oz.au>   |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>   |  of excellence is a lethal habit"
PGP: finger fjh@         |     -- the last words of T. S. Garp.