From M4 to Scheme (was Re: M4)

Patrick Premont premont@cs.toronto.edu
Sun, 28 Jan 1996 07:02:54 -0500


> 
> > How will our Scheme implementation relate to the LLL and the HLL ?
>    After some thought, I think that we should do in Scheme what
> I ended up trying to do in m4 (with lot of pain):
> writing an interpreter for a subset of some representation for the HLL,
> in which a compiler for the LLL will be written, as well as a compiler
> for an extension of this representation as implemented over the LLL.
> 
> That is:
> * we take a semantic subset of the HLL
> * we define a approximate representation of it into Scheme

Do you mean that we define this subset of the HLL (HLL-) in such a way
that it is a superset of Scheme (Scheme+ = HLL-), so that the HLL-
implementation is also an implementation for Scheme and that we can
use it to interpret the code we've started writing in Scheme ?

Do you really want a HLL- interpreter in Scheme ? That means writing a
full Scheme interpreter plus the extensions for the HLL-. Or do you want
to extend Scheme without an interpretation layer by, for example, adding
a simple object system implemented with a few standard Scheme functions ?
(Like I've done in the Scheme code I've written to replace the m4 code.)
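
A minimal sketch of what an object system "without an interpretation
layer" can look like: objects are plain closures that dispatch on a
message name, so nothing beyond ordinary functions is needed. It is
written in Python purely for illustration (it is not the Scheme code
mentioned above), and every name in it (make_object, make_counter, the
"get"/"set" messages) is hypothetical.

```python
# Closure-based, message-passing objects: no interpreter, only functions.
# All names are hypothetical and chosen for this illustration.

def make_object(fields, methods):
    """Return an object represented as a closure that dispatches messages."""
    state = dict(fields)  # private, per-object state

    def dispatch(message, *args):
        if message == "get":
            return state[args[0]]
        if message == "set":
            state[args[0]] = args[1]
            return None
        if message in methods:
            # Methods receive the object itself as their first argument.
            return methods[message](dispatch, *args)
        raise AttributeError(f"unknown message: {message}")

    return dispatch


def make_counter(start=0):
    """Build a counter object on top of make_object."""
    def increment(self, by=1):
        self("set", "count", self("get", "count") + by)
        return self("get", "count")

    return make_object({"count": start}, {"increment": increment})


counter = make_counter(10)
counter("increment")        # count is now 11
counter("increment", 5)     # count is now 16
print(counter("get", "count"))
```

The same shape carries over to Scheme with lambda and a case dispatch
on the message symbol.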

> * we program a LLL compiler using the subset,
>   that we code using this representation
> * we gradually extend the LLL compiler
>   so as to implement the subset of the HLL,
>   plus essential Tunes feature not found in Scheme.
> * we finally leap into a bootstrapped system,
>   as the compiled system is satisfactorily powerful

So the LLL compiler is just an incomplete version of the HLL compiler
?  But what I've started writing in Scheme is a compiler. There is no
parser because there is no input program to compile. There is only a
collection of compiling functions which reduce high-level constructs to
low-level ones. Look at my if-then-else function for example, which
compiles an if-then-else into assembler code. You could say that
if-then-else is a construct of the LLL. That compiling function
can be augmented with cases for other platforms so that calls to
if-then-else become portable. So when we are writing the Scheme or
HLL compiler in Scheme we are also in a sense writing it in the
LLL if we only go through such portable functions. This kind of
ambiguity could explain why I have difficulty understanding exactly
what you propose and how you intend it to differ from my proposal.
But you express concerns about my plan further down and I think I
can reassure you about these points.
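
To illustrate the idea of a compiling function with per-platform
cases, here is a rough sketch in Python (the real function is written
in Scheme and is not reproduced here). The function name, the label
scheme, the convention that the test leaves its result in an
accumulator register, and the two instruction sequences are all
assumptions made for this example.

```python
import itertools

# Hypothetical helper: hands out unique label names.
_label_ids = itertools.count()

def fresh_label(prefix):
    return f"{prefix}_{next(_label_ids)}"


def if_then_else(test_code, then_code, else_code, platform="x86"):
    """Compile an if-then-else into a flat list of assembly lines.

    Assumption: test_code leaves its result in the platform's
    accumulator register, with zero meaning false.
    """
    else_label = fresh_label("else")
    end_label = fresh_label("endif")

    # Only this part is platform-specific; callers stay portable.
    if platform == "x86":
        branch_if_false = ["test eax, eax", f"jz {else_label}"]
        jump_to_end = [f"jmp {end_label}"]
    elif platform == "m68k":
        branch_if_false = ["tst.l d0", f"beq {else_label}"]
        jump_to_end = [f"bra {end_label}"]
    else:
        raise ValueError(f"unsupported platform: {platform}")

    return (list(test_code) + branch_if_false
            + list(then_code) + jump_to_end
            + [f"{else_label}:"] + list(else_code)
            + [f"{end_label}:"])


for line in if_then_else(["mov eax, 1"], ["mov ebx, 2"], ["mov ebx, 3"]):
    print(line)
```

Code that only ever calls if_then_else never mentions a register or a
jump instruction itself, which is the sense in which such functions
form a portable low-level layer.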

> > Remember I'm the one which was wondering why we needed a LLL.
> > I'm not exactly clear on what we'll do with that LLL.
>    As I said previously, all that relates to migration require a new
> LLL to be defined, which can't be made *efficiently* on top of a HLL
> (expect several orders of magnitude of inefficiency loss for a Scheme-based
> implementation over raw assembly).

I assume all these comments were made having read my suggestion (which
comes further down). I suggest we write in Scheme but what we are writing
is a compiler which produces (eventually) machine code. So there is no
reason to be concerned about the performance of the implementation
of the following features :

>    For instance, real-time, open-implemented GC, distribution, persistency
> all require low-level protocols and low-level memory management. A LLL is
> required.

As I pointed out earlier, a subset of the compiling functions that will
be present in the initial Scheme compiler written in Scheme that my
suggestion calls for will constitute the LLL. We are writing in Scheme
but we are not simulating distribution, GC or persistency using Scheme, we
are compiling these features to machine code using Scheme (through a
bunch of compiling functions (the LLL) which produce code for the
right platform).

> Sure we could do assembly directly, but except where performance
> is especially needed, a portable LLL is better (at least at first).

We would be doing assembly but indirectly, through a portable LLL. Do
you understand what I have in mind when I say that we'll do that but
in Scheme ? The LLL would be just an intermediate representation in
the compiling process. Humm.. the pieces are starting to fit in place
in my mind now. You suggested Forth as the LLL. Now that doesn't seem
that bad to me since we'd be writing in Scheme higher-level functions
that generate Forth source and pretty soon we'd have high-level
functions to reduce to. And it is a popular approach in the implementation
of compilers (at least Scheme compilers) to have an intermediate stack-based
language. But the one we used in one of my classes to build a Scheme compiler
was pretty simple. Maybe we should define our own or at least be careful
not to introduce a needlessly complex intermediate language.
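
As a small illustration of high-level functions reducing to a
stack-based intermediate language, the sketch below (in Python, for
illustration only) turns a nested prefix expression into a sequence of
Forth-style postfix words and runs it on a toy stack machine. The
tuple representation, the function names and the three supported
operators are assumptions for the example, not a proposed design for
the LLL.

```python
import operator

# Hypothetical intermediate language: integer literals plus three words.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def compile_expr(expr):
    """Translate a nested prefix expression into a list of postfix words.

    An expression is either an int or a tuple (operator, left, right).
    """
    if isinstance(expr, int):
        return [str(expr)]
    op, left, right = expr
    if op not in OPS:
        raise ValueError(f"unknown operator: {op}")
    # Operands first, operator last: this is what makes it stack code.
    return compile_expr(left) + compile_expr(right) + [op]


def run_stack_code(words):
    """Execute postfix words on a toy stack machine and return the result."""
    stack = []
    for word in words:
        if word in OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[word](left, right))
        else:
            stack.append(int(word))
    return stack.pop()


words = compile_expr(("+", 1, ("*", 2, 3)))
print(" ".join(words))        # the generated stack code
print(run_stack_code(words))  # its value on the toy machine
```

Keeping the intermediate language this small is what makes both the
generator and the machine that runs it a few lines each.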

> > Are we going to build
> > the HLL from there (in the LLL) by adding complex stuff ?
> > We'll already have written complex constructs in Scheme.
>    Scheme is too low-level to be our HLL:
> it lacks an expressive type system,
> has unclean reflection, though with hygienic macros and quasi-quoting).

Yes, I just thought it could evolve into our HLL through modification of
its implementation.

>    It is too high-level to be an efficient LLL:
> it hides too much of implementation (notably as of memory access and
> number encoding) to allow both access to hardware and migration of high-level
> scheme constructs.

Yes, see what I've said before on the LLL.


> > Here's what I think we need :
> 
> >[Long speech]
>    Very good plan.
> I'm not sure whether we should or not use Scheme directly,
> or first implement a subset of the HLL over Scheme, that'd
> do everything from then on.

Yes. That's also a very popular way of building a system. Build an
interpreter for X first and then implement in X a compiler for X.
But if we are going to do that, we need to know now what we want in
the HLL and how to implement it. And all the work done currently in
Scheme will have to be executable in the HLL eventually. It's not
really a problem but maybe we shouldn't go too fast doing that stuff
because it would probably be better if it were written directly in the
HLL subset (at least I've been using a kind of object system which
will make the result after the translation to the HLL a bit more
appropriate).

> > So ? Any questions ? (Dis)agreement ?
> Always the aforementioned stuff.
> If all of Tunes could be efficiently done entirely over an existing HLL,
> it would be pointless as an OS.

I don't want to do it over Scheme. I want to write a Scheme compiler
in Scheme and modify the compiler and bootstrap until it is a HLL
compiler written in HLL.

> >>    Let's learn the lessons from that m4 trial,
> >> and abstract as much as we can our implementation
> >> from the underlying Scheme system.
> > 
> > Do you mean write code that would be easily moved from Scheme
> > to another language ?
> Precisely, unless you can show that Scheme fulfills the requirements for
> our HLL.

Ok. I agree.

> > I hope no one is horrified by parenthetic languages. I prefer a nice
> > syntax but I think we have much to gain by living without one for a while.
>    I get bored by parentheses quickly,
> because they lack human-wise useful redundancy.
> But what you lose in human readability,
> you gain (with Scheme) in very nice and simple semantics,
> which are absent from brain-damaged languages we have to use everywhere else
> (like C, m4, sh, tcl) !

And if we want to translate our Scheme sources to another language, we
can write a Scheme program that (read)s our sources and translates them
automatically to the new language.
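
A rough sketch of why such a translator is cheap to write: once the
source is a tree of nested lists, translation is a short recursive
walk. The example below is in Python for illustration (in Scheme the
reader comes for free with (read)); the C-like target syntax and all
function names are assumptions, and it does no error handling for
unbalanced parentheses.

```python
def tokenize(text):
    """Split source text into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def read(tokens):
    """Parse one expression, consuming tokens from the front of the list."""
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens[0] != ")":
            items.append(read(tokens))
        tokens.pop(0)  # drop the closing parenthesis
        return items
    return token


def translate(expr):
    """Translate a parsed expression into a C-like infix string."""
    if isinstance(expr, str):
        return expr
    if not expr:
        raise ValueError("cannot translate an empty list")
    head, *args = expr
    if head == "if" and len(args) == 3:
        test, then, other = (translate(a) for a in args)
        return f"({test} ? {then} : {other})"
    if head in ("+", "-", "*", "<", ">") and len(args) == 2:
        return f"({translate(args[0])} {head} {translate(args[1])})"
    # Anything else is treated as an ordinary function call.
    return f"{translate(head)}({', '.join(translate(a) for a in args)})"


source = "(if (< n 2) n (fib (- n 1)))"
print(translate(read(tokenize(source))))
```

Only translate knows about the target language, so retargeting means
rewriting that one function.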

The question that remains is your question : do we do as I suggested
and start by writing a Scheme compiler in Scheme and evolve it into a
HLL compiler in HLL, or do we start with an HLL subset for which we
may have to write an interpreter first ?

We may not have to write an interpreter if we restrict the initial HLL
subset to things that can be defined in Scheme (like an object
system).  This would save us a lot of time in the short term and we
could be bootstrapped earlier (we need to be bootstrapped to implement
many hot HLL features like dynamic compilation, linking and
execution).  If we add only a bit to what Scheme offers in our initial
HLL subset, but still need an interpretation layer, then not only do we
face the cost of interpreting those added constructs but we also need
an interpreter for all those Scheme-level constructs (but maybe
we could use a free Scheme meta-circular interpreter to eliminate
that cost (I've written one in my classes so we could use that)).
Note that an interpreter will slow our development cycles.

Even if our initial HLL subset departs from many elements of Scheme, I
suggest we still use its syntax until the semantics are sufficiently
stable to design an appropriate syntax (or a few of them).

I think I would favor adding a type/object system that does not
require a level of interpretation (I've used one before and think
there are others available, and we can make our own).

What else would you want to see in our initial language rather than
have it added later once we have started bootstrapping ?

And if we define our initial HLL subset differentially from Scheme,
what Scheme constructs do we remove ?

Patrick