[gclist] Finalization and object orientation.

Giuliano Carlini GCARLINI@us.oracle.com
31 Mar 97 11:05:35 -0800


--=_ORCL_34077062_0_11919703311219360
Content-Transfer-Encoding:7bit
Content-Type:text/plain; charset="US-ASCII"

I hope we can calm down the tone of this discussion. We all
have good points.

I've got a paper at home that has a real nice approach to
finalization using "guardians". I'll have to look up the
citation tonight. It may succumb to the billion dollars in
a billion years flaw. I'll need to reread it. I'll give you
the citation tommorrow.

I've used topological sorts in the past without problems.
Most likely because the number of nodes and arcs were
relatively small, and usually without cycles. You may very
well be correct that a topological sort of a large cyclic
graph would be impractical.

But, I don't think we need to handle cycles. Finalization
order usually indicates dependence. If A is dependent on B,
then B must finalize before A. A and B can't be both dependent
upon one another, or there is no possible way to finalize
them. I therefore believe that a dependence graph is
necessarily a DAG.

Now, I believe that a topological sort of a DAG is cheap.
With back pointers it is O(n). Even if you choose not to use
backpointers it will usually be O(n) since there is usually just
a single dependency arc into a node. And spending closer to
O(n^2) time for the rare program with an intricate dependency
graph is not unreasonable.

Now, this is just a 30 second analysis, so I may very well be
wrong. If I am, I hope you'll let me know. Actually, I'm quite
sure you will ;->

But, using integers to represent order fails. Say I have objects
A with priority 100, and B with priority 101. Now I create
C which I wish to place in between A and B. The Borland compiler
uses such a scheme. And this has happened to me.

I never suggested using the reference graph to determine the
dependence graph. Explicit declaration of mutator semantics
is the best course in my opinion.

David probably thought your position on timeliness extreme
for the same reason I did: Your retort to my question asking when
does timeliness matter. I was no more meant staking out an extreme
position than you are. I do expect a program to complete before a
billion years. Heck, most of the time I expect them to take less
than a century ;->

Mostly, I like what you have to say about finalization and agree
with it. The fact that we disagree on a few points does not
make you a moron. I don't write to you as if you are. I'd
appreciate the same treatment in return.

Regards,

g



--=_ORCL_34077062_0_11919703311219360
Content-Type:message/rfc822

Date: 31 Mar 97 08:23:37
From:"Charles Fiterman <cef@geodesic.com>" <majordom@iecc.com>
To:gclist@iecc.com
Subject:Re: [gclist] Finalization and object orientation.
Reply-to:gclist@iecc.com
Return-Path:<majordom@iecc.com>
Sender:majordom@iecc.com (MajorDomo)
X-Sender:mail38454@alterdial.uu.net
X-Mailer:Windows Eudora Pro Version 2.2 (32)
Precedence: bulk
Errors-to:owner-gclist@iecc.com
MIME-Version: 1.0
Content-Transfer-Encoding:7bit
Content-Type:text/plain; charset="us-ascii"

At 09:48 AM 3/31/97 -0500, you wrote:
>At 07:21 AM 3/31/97 -0600, Charles Fiterman wrote:
>>Great tell me about this method topoSort() does it detect cycles. Where
>>do I get source.
>
>Well, this explains many things.  The most irritating thing about
>your proposals for finalization is that you seem unconcerned by the
>problems pointed out by intelligent, experienced people, as if they
>were merely a bunch of lazy whiners.  topoSort is topological sort,
>and yes, it detects cycles.  Any introductory computer science text
>on algorithms ought to explain it.

I found it there. The overhead was extreme. This is why I haven't
found any collectors that actually do that. Further I've concluded
that finalizer order cannot be deduced from pointers its purely in
the semantics of the mutator. If a buffer object points to a
file object it seems natural to finalize the file object first.
But there is no reason to assume the exception for writing to a
finalized file object is less valid than writing to the file object.

In the case where a list of finalizer objects each writes to a
report only knowledge of the desired order of the report determines
the correct order of the finalizers.

So the topological sort simply allows me to do something useless
in a totally inefficient way.

>There's big problems with your position that finalization ought
>to be certain, timely, and "simply another method".  Certain and
>timely finalization also requires certain and timely collection
>of all garbage. 

By timely I mean not insanely untimely. I keep saying that. By insanely
untimely I mean it can't make finalization impossible in the real world. 

>This is difficult to guarantee; it is essentially
>impossible without a cooperating compiler.  Your proposal that
>finalization is also "just another state change" also conflicts
>somewhat with these requirements, because this (to me) seems to 
>conflict with the goal of timely finalization. The reason of this
>conflict is the possibility of "resurrection" -- if a finalizer
>places an object, or a referenced object, within some data structure
>referenceable from live memory, it potentially converts almost-garbage
>into non-garbage.  If finalizer are run in random order (good for
>timely finalization) then the resurrected object may or may not have
>been finalized.  Previously, only non-finalized objects could have
>been seen by the mutator, but now it can see finalized objects
>and non-finalized objects.  This also breaks encapsulation, because
>if I use a Foo in my data structure, if I choose to resurrect that
>Foo, I had better know whether a finalized Foo can be used for anything
>at all.

Safe finalization implies the existence of post finalized objects
visible by the mutator, assuming finalizers are run at all. If finalizers
get run a finalizer can resurrect its object by saving a pointer to it.
The problem is entirely the result of a safety requirement combined
with running any finalizers at all.

Ordering finalizers only adds the cost of a sort assuming you have
a finalizer order number. The efficiency cost isn't in the sort its
in the safty requirment which puts a write barrier on pointers or
requires that the mark phase be rerun after finalizers are run.

A mark and sweep alternative is.

do {
  run mark phase
  if (unmarked finalizer objects) {
     sort unmarked finalizers
     run finalizers
     continue
  }
  break
}

Note if finalizers clear root pointers this can loop. This might be 
prevented from looping by some pass counting logic. The write barrier
approach never loops.

>The alternative I see is to decide that you have just carried a good
>idea to a ridiculous extreme -- finalizers ought to be employed for
>a FINAL state change, finalized objects ought to be in some sense second
>class, and finalizers should not resurrect garbage.  This is another
>set of consistent positions, that actually make it more possible to
>provide timely finalization (run the finalizers in any order, don't
>worry about resurrection), and that is consistent with how finalizers
>are specified in Java (the fact that they can be run automatically only
>once is a strong clue that you ought not resurrect objects, because
>the resurrected objects will be in a weird state.  Just because the
>language definition specifies how a corner case will behave, does not
>mean that it is a good idea to program using that corner case).

An easy alternative is to say that finalizer order is undefined. But
we can't stop finalizers from resurecting objects without having pointers
to freed storage or not running finalizers.
			-  
Charles Fiterman		Geodesic Systems
414 North Orleans Suite 410	Phone 312 832 1221 x223
Chicago IL 60610-4418		FAX   312 832 1230
				http://www.geodesic.com

A computer language without garbage collection
  is like a city without garbage collection.

--=_ORCL_34077062_0_11919703311219360--