[virtmach] bytecodes and stacks

Ian Piumarta Ian.Piumarta@inria.fr
Mon, 29 Nov 1999 12:38:12 +0100 (MET)

> > >The questions are:
> > >Is it possible to just push/pop things off one stack, allowing for things
> > >like global-variables and locally bound ones (eg. let in scheme)? or would
> > >several be needed, leading to a towers of hanoi like situation, when you
> > >needed  to reference a variable form an earlier "stack frame", not a fun
> > >soulution.
> > 
> > I'm in the early stages of trying something similar (a VM to execute
> > byte-codes generated from scheme code).  I discovered that I couldn't
> > simply push and pop things off a frame stack because of the existence of
> > environment closures.
> ISTR reading in the SML implementation literature (of which there is a
> *lot*) that it is in fact true that a stack is enough, even in the
> presence of first-class & higher-order functions, although you have to
> be careful about how you form closures.
> Now that I think about it, that might apply more properly to ML than
> Scheme, since ML values are not mutable (names are bound to actual
> values), while Scheme names are bound to value references (because of
> the presence of set!). You should be able to model the Scheme approach
> in an ML-style system, but I imagine that it wouldn't be so pleasant.

One stack is always enough.  The only "trick" is to avoid storing
non-lifo environment (e.g. closed-over free variables) in the h/w
stack.  Do the analysis to find the free variables (it's not hard) and
then arrange for them to be stored in, and accessed through, some kind
of separate "state vector" object allocated in the heap.  A reference
to this vector is then shared between all the closures that might
outlive their defining context.  By doing this you'll split
(potentially non-LIFO) free environment from the (strictly LIFO)
execution state in the stack.  Leave the GC to clean up the state
vectors, as and when the last closure that references them becomes
unreachable.  (Typical programs don't refer to non-global free state
all that often, so the overhead of allocating the additional state
vector is negligible.  The cost of indirecting to it is exactly the
same as the cost of indirecting to a LIFO storage cell somewhere in
the stack.)

(FWIW: the only time that managing a strictly LIFO stack becomes
"tricky" is if your language allows the programmer to explicitly
modify the chain of activations on the stack [which is the case in
Smalltalk, for example, whenever a new context is stored into the
"sender" field of some other context].  In such situations there is no
other solution: you have to reconstruct the entire stack under the
feet of the running program, taking appropriate measures to ensure
that your reconstruction code is not executing in the same region of
stack as the stack you're trying to reconstruct.  But this problem is
related to the programmer being able to fiddle with the execution
state held in the stack, and not to the existence of closures and free

Hope this helps.

Ian (who spent several hours this weekend writing code to reconstruct
     stacks under the feet of running programs. ;-)

------------------------------- projet VVM -------------------------------
Ian Piumarta, INRIA Rocquencourt,          Web: www-sor.inria.fr/~piumarta
BP105, 78153 Le Chesnay Cedex, FRANCE             Voice: +33 1 39 63 52 87
--------------------- Machines Virtuelles Virtuelles ---------------------