[gclist] RE: gclist-digest V2 #25

Greg Morrisett jgm@exchange.CS.Cornell.edu
Mon, 31 Mar 1997 18:11:39 -0500


>> > This issue was discussed in Mark Weiser's and my SP&E '88 paper.  Is x
>>seen
>> > by the garbage collector during the call to g in
>>
>> > { T x; f(x); g("abc"); }
>>
>> > For a naive compiler the answer is "yes".  For a CPS-based compiler, the
>answer
>> > is "no".  For a typical compiler with global register allocation the
>>answer
>is
>> > "maybe".

Appel's definition of "space safety" has little to do with
a CPS-based compiler.  Rather, it has to do with a closure
environment strategy that strengthens the closure environment
as much as possible (i.e., only retains values needed to cover
the free variables of the code).  See [1] for a discussion.
CPS conversion just makes this easy to do, but comes with its
own costs.
>
>> > There has been a significant amount of work on this kind of issue,
>especially
>> > in the context of ML and logic programming languages.  (E.g. there was a
>paper
>> > a few years back claiming significant space improvements in ML by
>>treating
>> > variables and data structure components with unconstrained type as dead.
> Since
>> > their type is unknown, they can'e be used for anything.  They can be
>>safely
>> > reclaimed, in spite of the fact that this leaves "dangling" pointers.
> Appel is
>> > careful to define a CPS-based notion of reachability for SML of NJ in
>>order
>to
>> > avoid some problems they observed with early versions of their system.)
>>
>>
>> Are you referring to
>> 	Benjamin Goldberg: Tag-Free Garbage Collection for
>> 	Strongly Typed Programming Languages. 165-176, PLDI 91
>> (right next to your Mostly Parallel paper - shades of Princes Bride :))?
>>
>> Seems a little different, but with the Tag-free scheme in the above,
>> ignoring the non-termination case, reachability is decidable from all
>> call-points in the program, no?
>>
>
>Unfortunately, I don't have my Proceedings handy.  I think I'm referring to a
>later paper that expanded on this idea.  If nobody else volunteers the
>reference, I'll track it down later.
>
>Effectively this scheme uses yet another definition of reachability, which is
>an approximation of the one in the Java definition.  The type-based notion of
>reachability is decidable.  But I don't think it's something you want to put
>in
>a language definition.
>
>See also [2] and in fact the earlier work of Henry Baker [4]
>and also Appel [3].  Unfortunately, there has been little
>experimental work to validate the type-based (really unification-
>based) approach first suggested by Goldberg and Gloger.  
>Never mind the fact that it seems impossible to do the
>unification in a reasonable amount of space and time, but
>the amount of additional garbage collected over a traditional
>transitive graph closure algorithm (e.g., mark/sweep, copying)
>seems small at best.  The example is, you've called the
>polymorphic "length" function on a list whose element type
>is unconstrained.  So, in principle, you only have to maintain
>the spine of the list and can garbage collect the actual
>elements.  However, the whole list is very likely to become
>garbage as soon as the length operation is complete, given
>that it's essentially the only operation you can perform on
>it (else the type will be constrained). 
>
>I've been thinking that for a long time, we should be measuring
>the difference between semantic garbage and actually collected
>garbage before we start investigating fancy-schmancy techniques
>like this.  (Much like the measurements on theoretical instruction
>parallelism that Wall and others did in the 1980's.)
>
>For instance, one could put a time stamp on each
>object as it's allocated, and update the time stamp each time
>the object is touched (a read/write barrier of sorts).  After
>every step in the computation, perform a full garbage collection
>and for each object that is freed at a given time step, record
>the time delta between the current time and the last time the
>object was accessed.  (Bonus points for speeding up the simulation
>by only periodically doing a GC).  If on average, the delta
>is pretty small, then we shouldn't be wasting our time looking
>at algorithms that try to collect more semantic garbage.  
>
>Of course, as Hans points out, one must begin by defining
>garbage formally with respect to a suitable operational semantics
>that fully reflects the implementation concerns.
>Again, I humbly refer the interested reader to a recent paper by 
>Bob Harper and myself on this very issue.[1]  Comments are
>welcome.
>
>-Greg Morrisett
> jgm@cs.cornell.edu
>
>[1] Sematics of Memory Management for Polymorphic Languages,
>    Greg Morrisett and Robert Harper 
>    To appear in HOOTS.  Preliminary version published as CMU
>    technical report CMU-CS-FOX-96-04.
>    ftp://reports.adm.cs.cmu.edu/usr/anon/1996/CMU-CS-96-176.ps
>
>[2] Abstract Models of Memory Management.
>    Greg Morrisett, Matthias Felleisen, and Robert Harper.
>    In Proceedings of the ACM Conf. on Functional Programming Languages 
>    and Computer Architecture, pages 66-77, 1995.
>    
>[3] Runtime tags aren't necessary.
>    A.W. Appel.  Journal of Lisp and Symbolic Computation, 2:153-162,
>    1989.
>
>[4] Unify and conquer (garbage, updating, aliasing, ...)
>    Henry Baker.  in Proceedings of the 1990 ACM Conference on Lisp and
>    Functional Programming, pages 218--226, 1990.