HLL and LLL, Scheme and Forth (long)

Jecel Assumpcao Jr jecel@lsi.usp.br
Tue, 30 Jan 1996 14:01:44 -0200


The recent discussion about language levels and Scheme
has been very interesting. I finally got a copy of the
Tunes files and took a quick look at them to see where
things are heading.

=== LAYERS IN MERLIN ===

I thought it might be interesting if I described how
language levels are handled in Merlin. There are many
ways to decompose my system, and one of them is into
four language levels as follows:

  1. Very High Level Language - Graphical Programming Environment

     At this level the important thing is the set of objects
     and their relationships. Objects are dragged around,
     split apart, joined together and so on.

  2. High Level Language - Self

     This gives you a textual equivalent of level 1 plus
     the ability to create new methods (which can then be
     used at level 1). The syntax is simple and fixed,
     though alternative syntaxes are possible.

  3. Low Level Language - bytecodes

     This describes a simple stack machine with only eight
     instructions - the main idea is that this code is
     easily generated from level 2 code. This level has a
     lot of missing functionality that must be provided by
     primitive functions. For Merlin, I decided to extend
     the virtual machine's instruction set with 27 new
     instructions that make it possible to reimplement the
     primitives in the bytecodes themselves. I patched level 2
     syntax to make it easy to generate these extra bytecodes,
     but the result can't be considered a real extension to
     level 2.

  4. Very Low Level Language - C and Assembly

     This level is used for bootstrapping the system, as a
     target for translation from level 3 and also to implement
     functionality not available in the higher levels. The
     initial interpreter, parser, etc. are written in C but
     will be rewritten in Self once the compiler is working.
     The compiler will generate machine language from the
     bytecodes and link it dynamically into the system. While
     the extra bytecodes added to level 3 make it a complete
     Von Neumann machine, things like loading the page table
     register or flushing the cache will not be possible at
     level 3.
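
As a rough illustration of the kind of stack machine level 3
describes, here is a minimal Python sketch. The eight opcode names
are hypothetical and chosen only for illustration; they are not
Merlin's actual instruction set. Anything the bytecodes cannot
express (here, arithmetic and comparison) is supplied by a table
of primitive functions.

```python
# Hypothetical opcodes for illustration; not Merlin's real bytecodes.
PUSH_SELF, PUSH_LITERAL, SEND, POP, DUP, JUMP, JUMP_IF_FALSE, RETURN = range(8)

# Functionality the bytecodes lack is supplied by primitives,
# looked up by selector when SEND executes.
PRIMITIVES = {
    "+": lambda a, b: a + b,
    "*": lambda a, b: a * b,
    "<": lambda a, b: a < b,
}

def run(code, receiver=None):
    """Interpret a list of (opcode, operand) pairs and return the result."""
    stack = []
    pc = 0
    while pc < len(code):
        op, arg = code[pc]
        pc += 1
        if op == PUSH_SELF:
            stack.append(receiver)
        elif op == PUSH_LITERAL:
            stack.append(arg)
        elif op == SEND:
            # arg is a binary selector: pop the argument, then the receiver.
            b = stack.pop()
            a = stack.pop()
            stack.append(PRIMITIVES[arg](a, b))
        elif op == POP:
            stack.pop()
        elif op == DUP:
            stack.append(stack[-1])
        elif op == JUMP:
            pc = arg
        elif op == JUMP_IF_FALSE:
            if not stack.pop():
                pc = arg
        elif op == RETURN:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")
    return stack.pop() if stack else None

# The expression "(self + 3) * 2", compiled by hand.
program = [
    (PUSH_SELF, None),
    (PUSH_LITERAL, 3),
    (SEND, "+"),
    (PUSH_LITERAL, 2),
    (SEND, "*"),
    (RETURN, None),
]
result = run(program, receiver=4)
```

Moving the entries of PRIMITIVES into the instruction set itself is
roughly what the extra bytecodes are meant to allow.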

Levels 1 and 2 are very much integrated from the programmer's
point of view, while levels 3 and 4 are not visible. The
extra functionality of level 4 is a problem, and it was
encapsulated into a very special "CPU object" which has its
methods written in assembly, rather than Self. In this way,
level 4 is neatly integrated with all upper levels since it
is accessed with normal message passing.
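
The encapsulation idea can be sketched in Python as follows. The
class, method and helper names are assumptions made for this
example, and the method bodies only record what was requested;
in the real system they would be assembly routines.

```python
class CPU:
    """Stand-in for the special "CPU object". The methods here only
    record requests; they do not touch any hardware."""

    def __init__(self):
        self.page_table_register = 0
        self.log = []

    def load_page_table_register(self, address):
        self.page_table_register = address
        self.log.append(("load_page_table_register", address))
        return self

    def flush_cache(self):
        self.log.append(("flush_cache",))
        return self


def send(receiver, selector, *args):
    """Uniform message send: the caller cannot tell how the
    receiver's method is implemented."""
    return getattr(receiver, selector)(*args)


cpu = CPU()
send(cpu, "load_page_table_register", 0x1000)
send(cpu, "flush_cache")
```

The calling code uses the same send for the CPU object as for any
other object, which is the point of the encapsulation.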

Even though level 3 defines a very strict memory model, it
can be used as a target for source languages other than Self
(even C!) since the 27 new bytecodes give it the needed
functionality.

=== SPIRAL MODEL ===

Would this model be a good one for Tunes? From what Fare'
has written so far it would seem so. But from the ideas that
Patrick has presented and the Tunes code that I have seen, it
is possible that a spiral model would be the best. In this
model an extendable language is implemented on top of itself
but there are no clearly separate levels. At any single
point in the implementation you seem to be able to see two
or more layers, but when you move "sideways" you find that
there is a continuous path between the layers. Extending
some Scheme smoothly into a HLL and also down to the assembly
language level would be an example of this.

It is very hard to start from the lowest level and spiral all
the way up (which I think is what Fare's M4 Forth was trying
to do), so you might get the best results by starting in the
middle (with a "conventional" implementation of the language)
and working from there in both directions. It takes more
effort to keep things portable in a spiral model than a
layered one.

=== FORTH ===

Forth shares with C the low level access to the underlying
machine through address manipulation. At the same time it
offers a neat virtual machine (with two stacks) that is
easy to work with. It is a very extensible language, and so
can be used as a base for higher level systems. I don't
like the lack of structured reflection and code manipulation
(the PostScript dialect is much better in this second aspect).
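
To make the two-stack machine and its extensibility concrete, here
is a toy Forth-like interpreter in Python. It is only a sketch: the
function names are made up for this example, the word set is tiny,
and unlike a real Forth the return stack is used only by >r and r>,
since nested word calls rely on Python's own call stack.

```python
def make_forth():
    data, ret = [], []   # the data stack and the return stack
    words = {}           # dictionary: name -> callable or list of tokens

    def binary(fn):
        def word():
            b = data.pop()
            a = data.pop()
            data.append(fn(a, b))
        return word

    words.update({
        "+": binary(lambda a, b: a + b),
        "*": binary(lambda a, b: a * b),
        "dup": lambda: data.append(data[-1]),
        "swap": lambda: data.extend([data.pop(), data.pop()]),
        ">r": lambda: ret.append(data.pop()),
        "r>": lambda: data.append(ret.pop()),
    })

    def execute(tokens):
        tokens = iter(tokens)
        for tok in tokens:
            if tok == ":":
                # Colon definition: collect tokens up to ";" as a new word.
                name = next(tokens)
                body = []
                for t in tokens:
                    if t == ";":
                        break
                    body.append(t)
                words[name] = body
            elif tok in words:
                entry = words[tok]
                if callable(entry):
                    entry()          # built-in word
                else:
                    execute(entry)   # user-defined word
            else:
                data.append(int(tok))  # anything else must be a number

    def interpret(source):
        execute(source.split())
        return list(data)  # snapshot of the data stack

    return interpret


forth = make_forth()
forth(": square dup * ;")             # extend the language with a new word
stack = forth("3 square 4 square +")  # stack is now [25]
```

Once defined, square is used exactly like a built-in word, which is
what makes the language easy to grow upward.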

=== SCHEME ===

I am not too familiar with this dialect of LISP, but this is
a language that has been used as the "assembly of AI languages",
so it has a proven track record in this type of system (I never
saw one built in Forth, though I see no reason why it couldn't
be built). The distance between the memory model of Scheme and
the underlying machine is a problem.

One implementation of Scheme that looks interesting is RScheme:

        http://www.tkg.com/people/donovan/proj/rs/rscheme.html

I don't know what the various object models for Scheme are, but
I feel that adding objects should simplify the language, not
complicate it. A good object model might be based on the fact
that closures and objects are so similar, for example. 
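
The similarity between closures and objects can be shown with a
short sketch. It is written in Python rather than Scheme, and the
names (make_counter, dispatch, the message strings) are invented
for this example. The captured variable plays the role of an
instance variable and the inner function plays the role of method
dispatch.

```python
def make_counter(start=0):
    count = start  # private state, captured by the closure

    def dispatch(message, *args):
        nonlocal count
        if message == "increment":
            count += args[0] if args else 1
            return count
        if message == "value":
            return count
        raise AttributeError(f"message not understood: {message}")

    return dispatch


a = make_counter()
b = make_counter(10)
a("increment")
a("increment", 5)
```

Each call to make_counter yields an independent "object": after the
calls above, a("value") is 6 while b("value") is still 10.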

=== HLL ===

The only way to get to an objective is to know what it is, and
it seems that the HLL is still a bit vague. We can divide
languages into two groups (says Alan Kay):

  - agglutination of features (COBOL, PL/1, Ada...)

    They are designed by a committee and tend to be large.
    It is easier for them to become popular since they
    "have something for everybody".

  - crystallization of style (LISP, APL, Forth, Self...)

    They are created by one or two people and tend to be
    very elegant. They are not as general purpose as the
    first group and tend to generate strong love/hate
    reactions which makes it harder for them to spread.

Given Tunes' democratic nature and Fare's broad range of
interests, I think there is a strong tendency for the Tunes
HLL to belong to the first group. There is nothing wrong
with that, as long as it is a conscious decision.

If, on the other hand, the second type of language is preferred
for Tunes then the central idea for the language must be
decided. Only by knowing exactly where to aim can a good
base language be chosen and extended towards the final goal.

-- Jecel Mattos de Assumpcao Jr
   Laboratorio de Sistemas Integraveis 
   University of Sao Paulo - Brazil
   mailto:jecel@lsi.usp.br
   http://www.lsi.usp.br/~jecel/merlin.html