[gclist] Finalization and object orientation.

Guillermo (Bill) J. Rozas gjr@hplgr2.hpl.hp.com
Tue, 1 Apr 1997 18:30:17 -0800 (PST)


|   Date: Tue, 1 Apr 1997 18:48:09 -0500 (EST)
|   From: Zhong Shao <shao-zhong@CS.YALE.EDU>
|   Subject: Re: [gclist] Finalization and object orientation.
|   
|   This so-called "safe-for-space-complexity" definition of reachability
|   has been implemented in the SML/NJ compiler for almost three years
|   (see [1] for details).  Many people view it as an *optimization* to
|   eliminate certain kinds of space leaks; we think of it as a crucial
|   property that any industrial-strength compiler (for functional
|   languages) must satisfy.
|   
|   Consider the following ML function (simulating N phases of a typical 
|   compiler):
|   
|         fun compile (src) =
|           let val codeInIL1 = phase1(src)
|               val codeInIL2 = phase2(codeInIL1)
|               ......
|               val codeInILN = phaseN(codeInILNm1)
|   
|            in codeInILN
|           end
|   
|   If one writes the above code in C, he/she would definitely do an explicit
|   "free" of each intermediate data structure after each phase is done.  In
|   most garbage-collected languages, the programmer doesn't have this kind
|   of option, so he would assume that the compiler is going to take care
|   of that.

But there are other options.

Option 1: Use tail-recursion to get rid of intermediates.  This is
essentially what a CPS compiler does for you, but automatically.
However, you have the choice to do it or not.
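To make Option 1 concrete, here is a sketch (mine, not from the original post) of the N-phase pipeline driven by a tail-recursive loop; the phase list and the toy integer "IL" representation are invented for illustration.  Because the loop is tail-recursive, each intermediate result is dead as soon as the next phase has consumed it: nothing on the stack or in scope still refers to it.

```ocaml
(* Option 1 sketch: the intermediate is passed as the loop argument, so
   no named binding keeps it live across phases.  The "phases" here are
   hypothetical stand-ins for phase1 .. phaseN. *)
let compile (phases : (int -> int) list) (src : int) : int =
  let rec run code = function
    | [] -> code
    | phase :: rest -> run (phase code) rest   (* tail call: old [code] is now unreachable *)
  in
  run src phases

(* Three toy "phases". *)
let result = compile [ (fun x -> x + 1); (fun x -> x * 2); (fun x -> x - 3) ] 5
(* ((5 + 1) * 2) - 3 = 9 *)
```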

Option 2: Explicitly clear the variables when appropriate.  In ML you
would have to bind the variables to cells (I think you guys call them
"refs", but I don't recall).  Again, the choice is left to the programmer.
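A sketch of Option 2 (again mine, with invented toy phases): bind the intermediate to a ref cell and overwrite it by hand once the next phase is done with it.  Storing the value in an `option ref` lets the programmer drop the reference by writing `None`.

```ocaml
(* Hypothetical phases standing in for phase1 / phase2. *)
let phase1 src = src ^ "-il1"
let phase2 il1 = il1 ^ "-il2"

let compile src =
  let code_in_il1 = ref (Some (phase1 src)) in
  let code_in_il2 = phase2 (Option.get !code_in_il1) in
  (* Explicitly clear the cell: after this assignment the binding no
     longer keeps the phase-1 intermediate live, regardless of how
     clever the compiler is about variable liveness. *)
  code_in_il1 := None;
  code_in_il2
```

The point is that the decision stays in the program text, where the programmer can see and predict it, rather than in the compiler's liveness analysis.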

The problem that I see with _requiring_ this behavior (I've
implemented compilers that provide it and others that do not) is that
it has too much of the flavor of "semantics defined by the art of
compiler optimization".

I'd much rather have a simpler notion that even a novice could follow
and understand as the requirement.  A compiler could then improve on
it as an optimization.  For an example of the simple model that I
think is more appropriate, see the environment model chapter (chapter
3/4?) of "Structure and Interpretation of Computer Programs" by
Abelson, Sussman, with Sussman.

|   What we concluded is the following: for languages that make pervasive use
|   of "functions" and/or "closures", a new scoping rule must be used.  Each
|   local variable should be considered "dead" after its *last* use in the 
|   current function body. By "dead", I really mean "not contributing to the 
|   liveness of the data structure it points to".

What is the last use of y in the following?

	(if (= (- x x) 0)
	    y
	    (* z 3))

What if some other compiler can prove that x is a normal IEEE
floating-point number (as opposed to an infinity or a NaN)?

How far do you carry this?  How can the programmer _predict_ without
examining the output code or doing experiments?  Or even worse, how
can a programmer who wants to keep the stuff around for debugging do
it without going through contortions?

A simpler model (where you don't have to CPS-convert the whole program
in your head to figure it out) is easier for the programmer to handle
both to determine what is happening when s/he loses, and to subvert
whenever s/he wants to do that.