[gclist] Formal semantics of stack maps

Marc Feeley feeley@IRO.UMontreal.CA
Thu, 2 Dec 1999 18:03:02 -0500


> I'm not so sure, anyones made the observation before, but I know several
> precise collectors emit a mapping from the current return address of 
> stack frame to a map that describes the types and roots of the stack.
> 
> After playing with some standard transformations used in compilers, I've
> "discovered" that the stack map is something that comes for free if you CPS
> convert your program and then convert your CPS program into a first order
> (i.e. programs with out function pointers) language which uses a switch
> statement on tagged values to simulate jumps/computed gotos. 
> 
> Anyway, has anyone else made this or a similar observation before? 

Yes, this is basically how the Gambit-C Scheme to C compiler works.
Each control point (i.e. a point in the Scheme code that can be jumped
to, such as a procedure entry or a continuation/return address) is
represented as a small structure which contains:

0) a header (needed because procedures are first-class data in Scheme)
1) a pointer to the C function that contains the code for that control point
   (this is needed to support separate compilation, i.e. one or more C
   function per compilation unit)
2) arity information (for procedures) or a GC map (for continuations)
3) when "gcc" is used, the address of the label within the C function
   corresponding to this control point (in this case a computed
   goto in the C function is used instead of a "switch")

The trickiest thing is to represent the GC map compactly.  If there
are fewer than 20 frame slots then a single 32 bit word encodes
the number of slots, the location of the return address (needed
to implement first-class continuations) and a bitmap of the slots
that contain Scheme objects that the GC must trace.  Otherwise
the word contains a pointer to a variable length bitmap.

Marc