MOOSE Project -- new OS ?

Johan Van Schalkwyk jvs@iconz.co.nz
Sun, 23 Oct 1994 00:37:21 +1200 (NZST)


Dear Francois-Rene

> I'm always interested in other people's ideas. Please tell me your ideas,
> or send me docs or code about your work. You're welcome. I myself have ideas
> about how assembler should be integrated in a high-level language (with
> completely parametrizable specifications about calling conventions, and
> efficiency, and possible optimizations).

Ok. Where to start. Perhaps a few ideas, or thumbnail sketches, first.

1. I have played around in several languages, but apart from assembler, 
the only one that I have found to be more or less what I've been looking 
for has been Lisp. It is however crippled by its parentheses (not just 
"LISt Processing" but "Lots of Irritating Single Parentheses". Also, 
parsing Polish notation from left to right is a real pain in the a***!

2. So I've played around with what I call "POWER" parsing (for POlish 
With Evaluation in Reverse). Let me give you an example: In Lisp you 
might write something along the lines of:-

Defun (Factorial (lambda (n) 
	(cond ((greaterp n 1) (times n (factorial (sub1 n))))
		(T 1) ) ))

This may not be 100% kosher, but I haven't played around with Lisp itself 
for several years. You more or less get the flavour, however.

What about the following syntax?

 Define Factorial' ( Return 1
			(Return Times n Factorial Sub1 n
			If Greater n 1))'

This is evaluated as follows:

a. Start at the rightmost item: It is a quote mark, so the tree to the 
left is quoted. Put this whole tree (or a pointer to it) onto your stack.
b. Move left past the tree, to the quote mark to the right of the word
Factorial. So we quote this in turn, putting it onto the stack.
c. Now we encounter the verb "Define". This takes two arguments off the 
stack, and then attaches the tree (as a definition) to the newly defined 
"verb" Factorial.

Then if we say (in this idealised function) eg:

> Factorial 2

It will be evaluated as follows:

a. go to the rightmost item. It is a number (2), so just put it on the 
stack. Then move left to the word Factorial.
b. Factorial is a verb, so go to it's rightmost item and start moving 
left (ignoring the parentheses). 
c. Put the "1" on the stack, find the value of n (remember this is 
idealised, I have obviously worked out ways to represent arguments etc), 
and put it on the stack . Then evaluate the greater, and so on.

Ie a sort of Forth-like / Lisp-like approach.

I have been playing around with this simple approach for some time now, 
and it has several merits. It is easy to read (provided it is set out 
well) and evaluation is fast (even interpreted code moves along on an 
8088 (!!) using a few tricks of my own devising). I have also made the 
structure _extremely_ simple: basically the above defines practically the 
whole language, and the only structures one needs are:

	( something If condition )
	( Loop count etc )
	( End  etc )

The idea being that the expression "something" is performed only if the 
condition is true, and  End  moves to the corresponding right parenthesis
(and continues to do this forever, unless a Return or If is interposed 
somewhere along the line. Loop is similar to End, but only performs an 
"End" if the count specified is nonzero.

Note that there is no "If then else" construct. This is for several reasons:
	a. It has been shown that 
		( something if condition)
		( something_else if Not condition) 
	causes fewer programming errors and is more legible;
	b. It keeps the syntax simple
	c. With the above structures, evaluation is faciliated
	d. Compilation is dead easy (yes I wrote a simple compiler)
	e. etc.


3. This basically defines my top level. Obviously, there are a hell of a 
lot of technicalities, etc. It is quite workable. I can control 
practically every aspect of eg. a PC's function with about a hundred 
verbs. Also, logical extensions to the language are easy and intuitive. 
Let us say you have a verb AtX <number> that moves the cursor position on 
screen to the position specified by <number> (leaving the Y position 
unchanged). If you want to define a verb that finds the current X 
position on screen, you _might_ define it as WhereX <number> (adding 
another verb to your endlessly growing list), but I find it more mnemonic 
to just have something like:

		> AtX ? <number>

This has an added advantage (apart from being a good mnemonic) in that 
you can have an error handler that traps bad syntax (eg a ? supplied as 
an agrument for AtX) and simply "patch in" a new routine (even 
dynamically?) to handle the extension to the language. Furthermore, you 
can use a similar construct for other verbs. eg if > Go A  means change 
to drive A from the current drive, Go ? might mean "where the hell am I?"

4. A further note: I define a few very simple "verbs" and "nouns" (types 
of atom, if you wish). They are:

a. nouns:
	1. characters/immediates eg   A  ,   /   ?
	2. numbers    eg  123  
	3. words   eg Fred alpha
	4. sentences  eg   (this is a (nested (tree structure)))

	A word can take on any value, or even become a sort of "verb"

b. verbs:
	1. Pre-Defined verbs   eg  If  AtX
	2. User defined "verbs" (I call them "friends") eg the Factorial,
		defined above.

4. Okay, that's quite enough about my crazy language - you probably think 
that I'm totally nuts, by now. I just wanted to give you a flavour of 
what I'm playing around with.

5. Now, the assembler part. Most of the above (because of its simplicity) 
is extremely easy to implement in assembler. Extensions are easy to add, 
as explained. But there are also a lot of ideas that I have had (about 
assembler design) that would make things even easier. I feel that one of 
the reasons why languages such as C tend to be non-intuitive and ugly to 
program in (my crazy opinion, anyway) is that the design does not stress 
a minimal approach - they seem to never have heard of orthogonality. 
Also, I really believe that whenever you program a computer to do 
anything more than an absolutely trivial task, you are effectively 
creating a language (even if it is just a very simple language where all 
the instructions consist of pressing a few keys or clicking a mouse on a 
menu). So if you are working on a high level, you are effectively working 
within a whole hierarchy of complex and badly designed languages, and 
trying to create your own little "language" on top of this. The lower 
levels are meant to be transparent, but this is almost never the case. 
The more complex the setup, the more likely an intermittent and 
irritating error is. The solution as I see it is to keep things 
absolutely simple (a la RISC), and rigorously exclude any instructions/ 
constructs that impair lucid interpretation on any level.

6. So I see any programming attempt as part of a continuum. There may be 
several "levels", but one often has to have a good grasp of what is 
happening on several of those levels, in order to be able to program 
effectively!! Every time I have tried to do something more than slightly 
challenging in a higher level language, I come up against this problem. 
This does not mean that one must not define "levels", but one should be 
able to understand with exquisite clarity what is going on at a lower 
level, should the need arise. This also means that appropriate 
instructions must be available on the relevant levels. What do I mean by 
"appropriate"? A difficult question - I think that to a certain degree it 
depends on what is available at lower levels, but much more important is 
really what is intended on a higher level. For example, if your primary 
goal is to have a magnificent graphics-orientated program on the top 
level, it _may_ be entirely appropriate (in an idealised setup) to have a 
primitive instruction on a lower level that eg "draws a line of a certain 
colour between two defined points". It may be a lousy tradeoff to have to 
(on that same level) plot a series of dots using complex manipulation of 
memory and / or output ports. Unfortunately, we often seem to be driven 
by the hardware designers .. they define what is available, we construct 
our program (on all levels) around this, rather than saying "WE WANT 
THIS. THIS IS OUR STANDARD. WE WILL DEFINE OUR SOFTWARE AS SUCH AND SUCH. 
IF YOU CANNOT PROVIDE THIS FOR NOW, WE WILL STILL DEFINE THE STANDARD AS 
SUCH, AND WILL JUST HAVE TO MUCK AROUND (ON A SLIGHTLY LOWER LEVEL OF OUR 
SOFTWARE HIERARCHY) TO SIMULATE WHAT WE REALLY WANT" !!

7. The flipside of course is that one should know what one is dealing 
with. If I ask for a number, I want to know _exactly_ how long it is, not 
some arbitrary assurance from the programming language that it may be 
such or it may be such, depending on the system on which we are running, 
thank you for nothing! (If I go into a restaurant and order a steak, I 
want to know if it is 250g or 500g)! I have a lot of ideas in this 
regard, specifically with respect to assembler design. Tentatively, I 
think that there should be just three levels:

Level 0: lowest level. May need considerable effort / ingenuity to 
implement concepts defined on Level 1. But ideally, should all be ultra-
fast and implemented in hardware. All timing should be precisely defined, 
and accessible to the next level.

Level 1: The "assembler" level. But cf my example above about drawing a 
line! I have several ideas about what "ideals" should be implemented on 
this level.

Level2: The "top" level. Perhaps something along the lines of the 
incredibly simple language I gave you a flavour of above.

You may well find the above ideas too trivial to comment on, or disagree 
totally. Please feel free to be completely honest! If you want to hear 
more, just say so! Perhaps you could give me a few ideas of your own?

(I hope there are not too many typos in the above, as I composed it on 
the fly).

Bye for now, JVS.