MOOSE Project -- new OS ?
Johan Van Schalkwyk
jvs@iconz.co.nz
Sun, 23 Oct 1994 00:37:21 +1200 (NZST)
Dear Francois-Rene
> I'm always interested in other people's ideas. Please tell me your ideas,
> or send me docs or code about your work. You're welcome. I myself have ideas
> about how assembler should be integrated in a high-level language (with
> completely parametrizable specifications about calling conventions, and
> efficiency, and possible optimizations).
Ok. Where to start. Perhaps a few ideas, or thumbnail sketches, first.
1. I have played around in several languages, but apart from assembler,
the only one that I have found to be more or less what I've been looking
for has been Lisp. It is however crippled by its parentheses (not just
"LISt Processing" but "Lots of Irritating Single Parentheses". Also,
parsing Polish notation from left to right is a real pain in the a***!
2. So I've played around with what I call "POWER" parsing (for POlish
With Evaluation in Reverse). Let me give you an example: In Lisp you
might write something along the lines of:-
Defun (Factorial (lambda (n)
(cond ((greaterp n 1) (times n (factorial (sub1 n))))
(T 1) ) ))
This may not be 100% kosher, but I haven't played around with Lisp itself
for several years. You more or less get the flavour, however.
What about the following syntax?
Define Factorial' ( Return 1
(Return Times n Factorial Sub1 n
If Greater n 1))'
This is evaluated as follows:
a. Start at the rightmost item: It is a quote mark, so the tree to the
left is quoted. Put this whole tree (or a pointer to it) onto your stack.
b. Move left past the tree, to the quote mark to the right of the word
Factorial. So we quote this in turn, putting it onto the stack.
c. Now we encounter the verb "Define". This takes two arguments off the
stack, and then attaches the tree (as a definition) to the newly defined
"verb" Factorial.
Then if we say (in this idealised function) eg:
> Factorial 2
It will be evaluated as follows:
a. go to the rightmost item. It is a number (2), so just put it on the
stack. Then move left to the word Factorial.
b. Factorial is a verb, so go to it's rightmost item and start moving
left (ignoring the parentheses).
c. Put the "1" on the stack, find the value of n (remember this is
idealised, I have obviously worked out ways to represent arguments etc),
and put it on the stack . Then evaluate the greater, and so on.
Ie a sort of Forth-like / Lisp-like approach.
I have been playing around with this simple approach for some time now,
and it has several merits. It is easy to read (provided it is set out
well) and evaluation is fast (even interpreted code moves along on an
8088 (!!) using a few tricks of my own devising). I have also made the
structure _extremely_ simple: basically the above defines practically the
whole language, and the only structures one needs are:
( something If condition )
( Loop count etc )
( End etc )
The idea being that the expression "something" is performed only if the
condition is true, and End moves to the corresponding right parenthesis
(and continues to do this forever, unless a Return or If is interposed
somewhere along the line. Loop is similar to End, but only performs an
"End" if the count specified is nonzero.
Note that there is no "If then else" construct. This is for several reasons:
a. It has been shown that
( something if condition)
( something_else if Not condition)
causes fewer programming errors and is more legible;
b. It keeps the syntax simple
c. With the above structures, evaluation is faciliated
d. Compilation is dead easy (yes I wrote a simple compiler)
e. etc.
3. This basically defines my top level. Obviously, there are a hell of a
lot of technicalities, etc. It is quite workable. I can control
practically every aspect of eg. a PC's function with about a hundred
verbs. Also, logical extensions to the language are easy and intuitive.
Let us say you have a verb AtX <number> that moves the cursor position on
screen to the position specified by <number> (leaving the Y position
unchanged). If you want to define a verb that finds the current X
position on screen, you _might_ define it as WhereX <number> (adding
another verb to your endlessly growing list), but I find it more mnemonic
to just have something like:
> AtX ? <number>
This has an added advantage (apart from being a good mnemonic) in that
you can have an error handler that traps bad syntax (eg a ? supplied as
an agrument for AtX) and simply "patch in" a new routine (even
dynamically?) to handle the extension to the language. Furthermore, you
can use a similar construct for other verbs. eg if > Go A means change
to drive A from the current drive, Go ? might mean "where the hell am I?"
4. A further note: I define a few very simple "verbs" and "nouns" (types
of atom, if you wish). They are:
a. nouns:
1. characters/immediates eg A , / ?
2. numbers eg 123
3. words eg Fred alpha
4. sentences eg (this is a (nested (tree structure)))
A word can take on any value, or even become a sort of "verb"
b. verbs:
1. Pre-Defined verbs eg If AtX
2. User defined "verbs" (I call them "friends") eg the Factorial,
defined above.
4. Okay, that's quite enough about my crazy language - you probably think
that I'm totally nuts, by now. I just wanted to give you a flavour of
what I'm playing around with.
5. Now, the assembler part. Most of the above (because of its simplicity)
is extremely easy to implement in assembler. Extensions are easy to add,
as explained. But there are also a lot of ideas that I have had (about
assembler design) that would make things even easier. I feel that one of
the reasons why languages such as C tend to be non-intuitive and ugly to
program in (my crazy opinion, anyway) is that the design does not stress
a minimal approach - they seem to never have heard of orthogonality.
Also, I really believe that whenever you program a computer to do
anything more than an absolutely trivial task, you are effectively
creating a language (even if it is just a very simple language where all
the instructions consist of pressing a few keys or clicking a mouse on a
menu). So if you are working on a high level, you are effectively working
within a whole hierarchy of complex and badly designed languages, and
trying to create your own little "language" on top of this. The lower
levels are meant to be transparent, but this is almost never the case.
The more complex the setup, the more likely an intermittent and
irritating error is. The solution as I see it is to keep things
absolutely simple (a la RISC), and rigorously exclude any instructions/
constructs that impair lucid interpretation on any level.
6. So I see any programming attempt as part of a continuum. There may be
several "levels", but one often has to have a good grasp of what is
happening on several of those levels, in order to be able to program
effectively!! Every time I have tried to do something more than slightly
challenging in a higher level language, I come up against this problem.
This does not mean that one must not define "levels", but one should be
able to understand with exquisite clarity what is going on at a lower
level, should the need arise. This also means that appropriate
instructions must be available on the relevant levels. What do I mean by
"appropriate"? A difficult question - I think that to a certain degree it
depends on what is available at lower levels, but much more important is
really what is intended on a higher level. For example, if your primary
goal is to have a magnificent graphics-orientated program on the top
level, it _may_ be entirely appropriate (in an idealised setup) to have a
primitive instruction on a lower level that eg "draws a line of a certain
colour between two defined points". It may be a lousy tradeoff to have to
(on that same level) plot a series of dots using complex manipulation of
memory and / or output ports. Unfortunately, we often seem to be driven
by the hardware designers .. they define what is available, we construct
our program (on all levels) around this, rather than saying "WE WANT
THIS. THIS IS OUR STANDARD. WE WILL DEFINE OUR SOFTWARE AS SUCH AND SUCH.
IF YOU CANNOT PROVIDE THIS FOR NOW, WE WILL STILL DEFINE THE STANDARD AS
SUCH, AND WILL JUST HAVE TO MUCK AROUND (ON A SLIGHTLY LOWER LEVEL OF OUR
SOFTWARE HIERARCHY) TO SIMULATE WHAT WE REALLY WANT" !!
7. The flipside of course is that one should know what one is dealing
with. If I ask for a number, I want to know _exactly_ how long it is, not
some arbitrary assurance from the programming language that it may be
such or it may be such, depending on the system on which we are running,
thank you for nothing! (If I go into a restaurant and order a steak, I
want to know if it is 250g or 500g)! I have a lot of ideas in this
regard, specifically with respect to assembler design. Tentatively, I
think that there should be just three levels:
Level 0: lowest level. May need considerable effort / ingenuity to
implement concepts defined on Level 1. But ideally, should all be ultra-
fast and implemented in hardware. All timing should be precisely defined,
and accessible to the next level.
Level 1: The "assembler" level. But cf my example above about drawing a
line! I have several ideas about what "ideals" should be implemented on
this level.
Level2: The "top" level. Perhaps something along the lines of the
incredibly simple language I gave you a flavour of above.
You may well find the above ideas too trivial to comment on, or disagree
totally. Please feel free to be completely honest! If you want to hear
more, just say so! Perhaps you could give me a few ideas of your own?
(I hope there are not too many typos in the above, as I composed it on
the fly).
Bye for now, JVS.