[gclist] More advice required

Stephen Biggs stephen.biggs@merant.com
Thu, 10 Jun 1999 18:10:04 +0100


Hi,
Having taken advice from people on this list and elsewhere, we have embarked
upon investigation into a mark-sweep collector. It's looking reasonably
promising. Many thanks for that advice!

However, we have run up against a (so far theoretical) problem.

The definition of OO COBOL as it stands does not allow for destructors (as
in C++ etc.) but does allow for standard resource reclamation methods that
the garbage colllector can call - "finalizers".

The problem becomes apparent in the sweep phase. A number of objects have
been identifed as "dead" and can therefore be removed from the system.
No order for the collections is specified.
The finalizers for all dead objects are expected to be called.
If, then, one dead object still refers to another, it can invoke any method
on that second dead object during its finalizer.
This is OK, we can deal with this using one of a number of techniques, so
that the first object is finalized first, and therefore the second object is
still valid during the finalization phase. (The same applies if the same
object is referred to by >1 other dead object.)

But, the question then occurs, what happens if you have a network, or
recursive referencing?
That is, the two dead objects both reference each other in memory - either
directly or indirectly - and therefore the garbage collector has no way of
determining which object must be finalized first. If it gets the wrong one,
the other may try to invoke the (now) finalized object!

NB: We can, and intend, to keep the object as valid to be invoked, ie not
actually destroyed, until all finalizers have been called for all dead
objects. This eases the situation in that methods can still be used on each
object, but, having been finalized, the results might not be what the object
invoking it expects, or possibly the object itself expects! That is, the
resources that the object held have now been destroyed in the finalizer, and
this *may* mean that it has a state unforeseen by the programmer.

Of course, this situation may be written off as "application design error",
but there may be heuristics or good algorithms that we haven't discovered
that may ease the problem.

For instance, it is noticibly better if "tree-top" objects (objects that
aren't referred to by any other object, but may refer to others) are
finalized first?
(We think this is likely, but it does potentially slow down the collector to
find these objects, finalize them and then repeat.)
Is it noticibly preferable to a "find first, recurse sweep, find next"
algorithm? Are there any hidden dangers to this approach?

Does anyone know where I can look for any proven (partial?) solutions to
this problem?

Any thoughts would be appreciated!

Many thanks,
Steve.