[gclist] Finalization and object orientation.
Zhong Shao
shao-zhong@CS.YALE.EDU
Tue, 1 Apr 1997 18:48:09 -0500 (EST)
|>> > My impression from Andrew Appel's group is that maintaining
|>> > asymptotic space complexity (with the CPS definition of reachability)
|>> > is neither free nor intolerably expensive. It does inhibit some
|>> > optimizations, and make it much harder to get the compiler right.
|>>
|>> That makes sence to me. Any references?
|>
|> I think his book says a little about that. My information comes mainly from
|> Zhong Shao.
|>
|> Hans
This so-called "safe-for-space-complexity" definition of reachability
has been implemented in the SML/NJ compiler for almost three years
(see [1] for details). Many people view it as an *optimization* to
eliminate certain kinds of space leaks, we often think it as
a crucial property that any industrial-strength compilers (for functional
languages) must satisfy.
Consider the following ML function (simulating N phases of a typical
compiler):
fun compile (src) =
let val codeInIL1 = phase1(src)
val codeInIL2 = phase2(codeInIL1)
......
val codeInILN = phaseN(codeInILNm1)
in codeInILN
end
If one writes the above code in C, he/she would definitely do an explicit
"free" of each intermediate data structure after each phase is done. In
most garbage-collected languages, the programmer don't have this kind
of options; so he would think that the compiler is going to take care of
that.
Unfortunately, many compilers treat "compile" just as a C-like simple
subroutine. All the local variables (assuming they are all pointers to
some heap-allocated objects): src, codeInIL1, codeInIL2, ..., are
allocated in the same stack frame, and they are all considered to be
*live* until the end of the phaseN (when we pop off the stack frame for
"compile"). This is unacceptable, and the older versions of the SML/NJ
compiler have run into many problems like this.
What we concluded is the following: for languages that make pervasive uses
of "functions" and/or "closures", a new scoping rule must be used. Each
local variable should be considered "dead" after its *last* use in the
current function body. By "dead", I really mean "not contributing to the
liveness of the data structure it points to".
Supporting this new kind of "liveness" definition is very important,
because functions like the "compile" are quite common, especially in
large software.
This new scoping rule is less crucial for C-like languages because
large data structures are often explicitly allocated and de-allocated,
so the programmer takes over the responsibility of "free"-ing these
things.
-Zhong Shao
(shao-zhong@cs.yale.edu)
PS. References:
[1] Z. Shao and A. Appel. "Space-Efficient Closure Representations"
in 1994 ACM Conference on Lisp and Functional Programming, also
available at the following website:
http://www.cs.yale.edu/users/shao-zhong/papers.html
The corresponding chapter in my Ph.D. thesis gives more a detailed
explanation.