Lisp VM and OS implementation

Fare Rideau rideau@ens.fr
Sat, 26 Apr 1997 13:35:37 +0200 (MET DST)


[Beware: this article is cross-posted
to lispos@math.gatech.edu and tunes-lll@ens.fr;
use private e-mail to suggest what I should or should not do]


There's been much debate lately on lispos@math.gatech.edu
about building a standard VM as part of their implementation of lispOS;
people on the tunes-lll@ens.fr mailing list have also long considered
such a VM as part of the implementation of the Tunes project.
   As has been pointed out,
building a standard VM is a very worthwhile project in itself,
as it is applicable to a large range of other projects,
including replacing the braindead JVM,
and allowing for interoperability with commercial software.

Now, again, what are the goals and constraints of a VM?
Isn't the Lisp s-exp syntax already a VM in itself?
Building a VM for the sake of itself isn't a goal!
['hope you ain't building a VM just to comply with the Java-triggered fad]
   Do you want (A) a standard compact format to seamlessly exchange code
written in arbitrary languages over an arbitrarily heterogeneous network?
i.e. some kind of UNCOL [UNiversal Computer Oriented Language]
   Or do you just want (B) a convenient HAL [Hardware Abstraction Layer]
to build a generic implementation for some family of similar architectures
(e.g. sequential machines with a flat 32 bit word-aligned address space)?

In the Tunes project, my attitude has been that the two above goals,
UNCOL and HAL, are quite *distinct* goals
(the distinct utilities of which I have no space to discuss here),
that require quite distinct solutions.
   If VMs are to be used in both cases, that could mean distinct VMs:
there would be one "VM" used and tuned for interoperability,
that would be a recognized standard to transport code across the network,
and other VMs used and tuned for easily co-implementing
a system/language on a variety of similar hardware architectures.


As for a *portable* VM, why require more than
a standard fast tokenized parsed packaged representation of Lisp code?

The clear advantages are that it can be a compact portable format
with which you can keep as *high-level* information as needed.
   This allows for dynamic compilation
for parallel hardware or cheap/fast hardware to come
instead of forcing people to use the current expensive/slow
stubborn RISCy emulation of the braindead von Neumann architecture.
   For instance, I imagine a lispy OS running on blazing fast, inexpensive,
straightforward clusters of MuP21 or MuP32 MISC chips with on-chip RAM.
A cluster of MuP21s could inexpensively outdo the symbol-crunching power
of the most bloated RISC CPU, if only it weren't required to emulate
a sequential computer because of a stupid low-level VM standard.
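
To make the "tokenized parsed packaged representation" above a bit more
concrete, here is a minimal sketch in Python. It is only an illustration
under my own assumptions, not a proposed standard: the token tags
(OPEN, CLOSE, ATOM), the 16-bit symbol indices and the function names
are all made up for the example, and the toy reader ignores strings,
quotes and comments.

```python
import struct

OPEN, CLOSE, ATOM = 0, 1, 2   # hypothetical one-byte token tags


def read_sexp(text):
    """Parse one s-expression into nested lists; atoms stay as strings."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def parse(pos):
        if tokens[pos] == "(":
            items, pos = [], pos + 1
            while tokens[pos] != ")":
                item, pos = parse(pos)
                items.append(item)
            return items, pos + 1
        return tokens[pos], pos + 1

    tree, _ = parse(0)
    return tree


def pack(tree):
    """Return (symbol table, token bytes); each atom becomes a table index."""
    table, out = [], bytearray()

    def emit(node):
        if isinstance(node, list):
            out.append(OPEN)
            for child in node:
                emit(child)
            out.append(CLOSE)
        else:
            if node not in table:
                table.append(node)
            out.append(ATOM)
            out.extend(struct.pack(">H", table.index(node)))

    emit(tree)
    return table, bytes(out)


def unpack(table, data):
    """Rebuild the nested-list tree from a symbol table and token bytes."""
    stack, pos = [[]], 0
    while pos < len(data):
        tag = data[pos]
        pos += 1
        if tag == OPEN:
            stack.append([])
        elif tag == CLOSE:
            finished = stack.pop()
            stack[-1].append(finished)
        else:
            (index,) = struct.unpack_from(">H", data, pos)
            pos += 2
            stack[-1].append(table[index])
    return stack[0][0]
```

Packing (defun square (x) (* x x)) this way keeps the whole tree
structure, so a receiver can still analyze and compile it for whatever
hardware it runs on; nothing about registers, word size or evaluation
order has been decided yet.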

Such an approach does NOT preclude things like
distribution of optimized or black-box code:
   With indeterminate/alternation operators,
we could even propose several versions of the same routine,
between which the executive could choose the fittest;
this includes CPU-specific versions, or ones using some lower-level VM.
   More generally, we could allow the portable code
to contain conditional hints about possible optimizations to try
by the CPU-specific back-end compiler.
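
As a rough illustration of the alternation idea, the sketch below (in
Python) ships several variants of one routine and lets the receiving
executive pick the fittest one it can actually run. Everything here is
hypothetical: the Variant record, the target names and the "cost" field
(a crude stand-in for whatever fitness measure the executive uses) are
my assumptions for the example.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    target: str    # hypothetical tag, e.g. "portable" or "mup21"
    cost: int      # assumed fitness estimate; lower is better
    body: object   # the code itself, in whatever form the target expects


def choose(variants, supported):
    """Pick the lowest-cost variant whose target the executive supports."""
    usable = [v for v in variants if v.target in supported]
    if not usable:
        raise LookupError("no usable variant; always ship a portable one")
    return min(usable, key=lambda v: v.cost)


routine = [
    Variant("portable", 10, "(defun square (x) (* x x))"),
    Variant("mup21", 1, b"<MuP21 machine code would go here>"),
    Variant("lowlevel-vm", 4, b"<bytecode for some lower-level VM>"),
]
```

An executive that supports only {"portable"} would get the high-level
version from choose(routine, {"portable"}), while one that also knows
"mup21" would get the CPU-specific one; the portable variant is what
keeps the package usable everywhere.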


As for a low-level VM for generic implementation,
why not make it purely implementation-specific,
and refuse any dependence of the OS on it?

This means we can have VMs arbitrarily adapted
to our implementational needs:
   we could choose arbitrary compromises
between efficiency and portability,
allowing for conditional variations to efficiently accommodate
various flavors of hardware architectures (16, 20, 32, 64, 128 bit CPU);
   we could seamlessly upgrade, modify, fine-tune the VM for speed,
or replace it by native-code stuff, in an implementation-dependent way,
without being burdened by a growing lot of legacy software to support.

By tagging binaries with the (virtual/real) machine version identifier,
we would allow the system to dynamically manage multiple VMs:
   communicating executives could negotiate
a proper VM in which to send code for optimized behavior,
while the existence of a portable *high-level* VM as above
ensures that even alien architectures will have a way to communicate
in a relatively efficient way.
   Just discard code from an unrecognized VM,
and recompile from the most specific available code,
possibly the portable version.
VM/native binaries would be just cached versions
of precompiled portable code, for faster startup/execution.
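
The tagging scheme above can be sketched in a few lines of Python.
This is only a sketch under my own assumptions: the PORTABLE tag, the
VM identifier strings and the two function names are invented for the
illustration, and "most specific available code" is simplified to
"an exact match for the local VM, else the portable version".

```python
PORTABLE = "portable"   # hypothetical tag for the high-level format


def negotiate(preferred, peer_supported):
    """Return the first VM id, in our preference order, the peer accepts.

    Falls back to the portable format when no specific VM is shared.
    """
    for vm_id in preferred:
        if vm_id in peer_supported:
            return vm_id
    return PORTABLE


def install(tagged_code, local_vm, compile_portable):
    """Turn a package of tagged code into something runnable locally.

    tagged_code maps a VM identifier to code for that VM. A blob tagged
    for the local VM is used as a cached precompiled version; blobs for
    unrecognized VMs are ignored and the portable code is recompiled.
    """
    if local_vm in tagged_code:
        return tagged_code[local_vm]
    return compile_portable(tagged_code[PORTABLE])
```

For example, a sender preferring ["mup21-v2", "stack-vm-1"] and talking
to a peer that accepts {"stack-vm-1", "portable"} would settle on
"stack-vm-1"; a receiver handed only an "x86-v1" blob plus the portable
code, while running "mup21-v2", would drop the blob and compile the
portable version instead.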


I guess my plea is that there is no need to make
braindead compromising mixes of the two VM approaches
that can only burden us further with legacy code,
when we can have the best of both worlds by keeping them well separated.


Regards from France,

== Fare' -- rideau@ens.fr -- Franc,ois-Rene' Rideau -- DDa(.ng-Vu~ Ba^n ==
Join the TUNES project for a computing system based on computing freedom !
                TUNES is a Useful, Not Expedient System
URL: "http://www.eleves.ens.fr:8080/home/rideau/Tunes/"