Parse trees and syntax

Matthew Tuck matty@box.net.au
Thu, 03 Dec 1998 22:23:35 +1030


Tanton Gibbs wrote:

> Since we will be using parse trees to transfer information, there will
> be no necessity for those nasty syntax discussions.  However, what
> will be necessary is a description and inheritance hierarchy for all
> parse trees.  I suggest we start with the following:

We still have to design a syntax for programs, given that we don't have
an intelligent editor and writing one is no easy feat.  That work can
start after bootstrapping.  The syntax will likely evolve into the
standard textual view, at least until we can design a textual view
parameterised by a pseudo-grammar.

That being said, talking about the language independent of syntax allows
us to get down to the heart right away rather than the picky details.

Shall we write a text parser/deparser immediately?  That is, a converter
to the USSID (Ultra Source Syntax Independent Distribution =) ), or just
go straight to the UPPID (Ultra Program Platform Independent
Distribution) at first?

>   ParseNode
>     Expression
>   ...

This is a very C-ish tree.

Take the logical and bitwise operators, for example.  I think you should
really have two types, boolean and bitpattern, and an "and" on both of
those types.  If you want to hide the implementation of integers,
bitwise operations on integers are undefined.  Things like two's
complement addition would still be available on bitpatterns, as that is
a bitpattern operation.
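
To make that concrete, here is a minimal sketch in Python (not Ultra
syntax, which doesn't exist yet).  The class names, the and_ and add
method names and the fixed width field are all assumptions made up for
illustration.  The point is only that "and" exists on both types, while
a plain integer type would expose neither bitwise "and" nor its
representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Boolean:
    """Truth value; its 'and' is logical, not bitwise."""
    value: bool

    def and_(self, other: "Boolean") -> "Boolean":
        return Boolean(self.value and other.value)


@dataclass(frozen=True)
class BitPattern:
    """Fixed-width row of bits (hypothetical type, not an integer)."""
    bits: int   # raw bits stored as a non-negative Python int
    width: int  # number of bits in the pattern

    def __post_init__(self):
        if not 0 <= self.bits < (1 << self.width):
            raise ValueError("bits do not fit in width")

    def _check(self, other: "BitPattern") -> None:
        if self.width != other.width:
            raise ValueError("width mismatch")

    def and_(self, other: "BitPattern") -> "BitPattern":
        """Bit-by-bit 'and' of two patterns of equal width."""
        self._check(other)
        return BitPattern(self.bits & other.bits, self.width)

    def add(self, other: "BitPattern") -> "BitPattern":
        """Two's complement addition: wraps around modulo 2**width."""
        self._check(other)
        total = (self.bits + other.bits) % (1 << self.width)
        return BitPattern(total, self.width)
```
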

I've talked earlier about side effects in expressions, and for that
reason I don't believe postincrement, etc., is a good idea.  I've never
used it in C++ other than as a statement, where the difference between
post and pre is irrelevant.  I find the idea highly unreadable.  Now you
might say we shouldn't stop programmers from making mistakes by
hamstringing them, but here, we're not really.  There is the additional
syntactic overhead of a variable and an assignment, but readability is
improved, which is a lot more important than writeability.  I believe we
can do things with the language to reduce the desire to do this in the
first place as well.
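
As a rough sketch of that overhead, the Python below rewrites a C-style
"y = x++" into two plain assignments with no side effects inside an
expression.  The node classes and the lower function are hypothetical
names invented for this example, and only the simplest case is handled,
where the postincrement is the whole right-hand side and the target is
a different variable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:
    ident: str


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Add:
    left: object
    right: object


@dataclass(frozen=True)
class PostIncrement:
    """C-style x++ used as an expression (the node argued against)."""
    target: Name


@dataclass(frozen=True)
class Assign:
    target: Name
    value: object


def lower(stmt: Assign) -> list:
    """Rewrite `t = x++` as `t = x` followed by `x = x + 1`.

    Any other assignment is returned unchanged in a one-element list.
    """
    if isinstance(stmt.value, PostIncrement):
        var = stmt.value.target
        return [Assign(stmt.target, var), Assign(var, Add(var, Const(1)))]
    return [stmt]
```

The result is one statement longer, but each statement changes exactly
one variable.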

Also, you've included n-ary subscript, presumably for arrays.  I feel
that we should remove this sort of preferential treatment for special
types.  The advantage is that user-defined types can use this notation. 
This can be done in two ways.

Sather does this with shorthands, that is, by mapping array indexing to
a method call.  In fact all Sather operators are mapped onto method
calls.  Now given that the extendible parse tree idea is not really
ready for use yet, leaving the operator nodes in the tree is fine. 
Later we can write a translator from our old tree format to the new tree
format.
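
A translator of that kind could look roughly like the Python sketch
below.  The node classes are hypothetical, not a proposed tree design.
The method names aget, plus, minus and times follow Sather's
conventions as I understand them, so treat them as an assumption
rather than a specification.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Name:
    ident: str


@dataclass(frozen=True)
class BinaryOp:
    """Old-format operator node, e.g. i + j."""
    op: str
    left: object
    right: object


@dataclass(frozen=True)
class Subscript:
    """Old-format n-ary subscript node, e.g. a[i, j]."""
    target: object
    indices: Tuple[object, ...]


@dataclass(frozen=True)
class MethodCall:
    """New-format node: every operator becomes a method call."""
    receiver: object
    method: str
    args: Tuple[object, ...]


# Assumed operator-to-method mapping, modelled on Sather's shorthands.
OPERATOR_METHODS = {"+": "plus", "-": "minus", "*": "times"}


def translate(node):
    """Recursively rewrite operator nodes into method-call nodes."""
    if isinstance(node, BinaryOp):
        return MethodCall(translate(node.left),
                          OPERATOR_METHODS[node.op],
                          (translate(node.right),))
    if isinstance(node, Subscript):
        return MethodCall(translate(node.target),
                          "aget",
                          tuple(translate(i) for i in node.indices))
    if isinstance(node, MethodCall):
        return MethodCall(translate(node.receiver),
                          node.method,
                          tuple(translate(a) for a in node.args))
    return node  # leaves such as Name need no rewriting
```

Because a user-defined type only has to supply an aget method, it gets
the subscript notation on the same terms as the built-in arrays.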

But I have a more radical way up my sleeve which I'm still thinking
about (and would take too long to put into words at this stage).

> in if( X ) and also a pointer to a statement Y in if( X ){ Y; }
> Statements would have pointers to the next statement to be executed
> after itself.  What does everyone think about this layout?

I tend to think a block would have a list of statements rather than a
linked list structure.  I.e. the parse tree can have a potentially
unlimited number of children.  So basically the children are any
sequence rather than an inline linked list which implements the
sequence.
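
The two layouts can be compared in a small Python sketch.  All the
names here (Statement, LinkedStatement, Block, to_block) are made up
for illustration, and the text field is just a stand-in for a real
statement node.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LinkedStatement:
    """Linked layout: each statement points at the one after it."""
    text: str
    next: Optional["LinkedStatement"] = None


@dataclass
class Statement:
    """Statement with no knowledge of its neighbours."""
    text: str


@dataclass
class Block:
    """List layout: the block owns a sequence of child statements."""
    statements: List[Statement] = field(default_factory=list)


def to_block(first: Optional[LinkedStatement]) -> Block:
    """Convert a linked chain of statements into a Block."""
    statements = []
    while first is not None:
        statements.append(Statement(first.text))
        first = first.next
    return Block(statements)
```

With the list layout, how the sequence is stored is a detail of the
block, and a statement can be moved or reused without patching a next
pointer.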

Also, you haven't put any modularisation features in yet.  Perhaps this
was intentional, but we can't really bootstrap without modularisation. 
I'll attempt to give my own idea of how this should be done soon.

-- 
     Matthew Tuck - Software Developer & All-Round Nice Guy
                              ***
       Check out the Ultra programming language project!
              http://www.box.net.au/~matty/ultra/