On design processes

Hans-Dieter Dreier Ursula.Dreier@ruhr-uni-bochum.de
Tue, 08 Dec 1998 01:40:53 +0100


> I guess that sort of information is strewn around the web pages and not
> centralised.  A document would be useful.  As for intelligent editors,
> there are a lot of issues to resolve.  It is obviously possible, but the
> details are important.
>

It would really be good if you could manage to get a copy of SqlWindows or Centura Builder. Apparently the
benefits of editing in an outline tree are hard to communicate if you have never tried it yourself. It's quite
different; all those issues like open/closed form, syntax of declarations and the like simply vanish because the
outline editor automatically takes care of that.

> There are some other essays on the web pages "open source links"
> section.

Which ones do you mean? I haven't found any that describe the technical aspects of managing an open source
project.

> It could be useful to start writing some summaries of what has been
> discussed?

Good idea.

>
>
> > How about a table-driven (to be extendable) parser without any frills
> > attached and an assembler-like syntax to directly produce objects and
> > crude code for a most simplistic stack machine? So that it can output
> > "Hello world" to stdout?   There might even exist some code somewhere
> > that can serve as a starting point.
>
> I was thinking more in terms of interpreting programs stored as trees.
> I think it would be easier to start with.  The slim binaries proposal
> actually seems to indicate this sort of program form can be a fairly
> efficient system.

Sure. The slim binaries proposal may be something for the future. I also thought about starting with an
interpreter. It can be turned into a compiler quite simply by replacing the "output" functions. Actually, it
should be designed for easy exchange of the output part from the very beginning.
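
As a rough illustration of what "exchanging the output part" could look like, here is a minimal sketch in C++. All names (Node, Output, Evaluator, CodeEmitter, walk) are made up for this example and are not part of any agreed design. The tree walker only ever talks to an abstract Output; one implementation evaluates directly (interpreter), the other writes instructions for a simple stack machine (compiler).

```cpp
#include <cassert>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical program tree: a literal ('n') or a binary operator ('+', '*').
struct Node {
    char op;
    int value;                        // only meaningful when op == 'n'
    std::unique_ptr<Node> lhs, rhs;
};

std::unique_ptr<Node> lit(int v) {
    std::unique_ptr<Node> n(new Node());
    n->op = 'n';
    n->value = v;
    return n;
}

std::unique_ptr<Node> bin(char op, std::unique_ptr<Node> l, std::unique_ptr<Node> r) {
    assert(op == '+' || op == '*');
    std::unique_ptr<Node> n(new Node());
    n->op = op;
    n->value = 0;
    n->lhs = std::move(l);
    n->rhs = std::move(r);
    return n;
}

// The exchangeable "output part": the walker knows nothing else.
struct Output {
    virtual ~Output() = default;
    virtual void push(int v) = 0;
    virtual void apply(char op) = 0;
};

// Interpreter flavour: executes on an operand stack right away.
struct Evaluator : Output {
    std::vector<int> stack;
    void push(int v) override { stack.push_back(v); }
    void apply(char op) override {
        int b = stack.back(); stack.pop_back();
        int a = stack.back(); stack.pop_back();
        stack.push_back(op == '+' ? a + b : a * b);
    }
};

// Compiler flavour: emits text for an assumed stack machine instead.
struct CodeEmitter : Output {
    std::ostringstream text;
    void push(int v) override { text << "PUSH " << v << "\n"; }
    void apply(char op) override { text << (op == '+' ? "ADD" : "MUL") << "\n"; }
};

// Post-order walk: operands first, then the operator.
void walk(const Node& n, Output& out) {
    if (n.op == 'n') { out.push(n.value); return; }
    walk(*n.lhs, out);
    walk(*n.rhs, out);
    out.apply(n.op);
}
```

Walking the same tree with an Evaluator leaves the result on its stack; walking it with a CodeEmitter leaves the instruction listing in `text`. The walker itself is untouched in both cases.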

> I don't know that using a low-level syntax is a good idea.  What I would
> do is write some stuff in high-level code and embed the implementation
> in the interpreter for initial implementation.

If you start with some low-level syntax, that will be much quicker than a high-level one. If the parser is done
right, there will be no problem extending it. Example: Replace the initial symbol lookup function (which would
only handle a single unstructured namespace) by a version that looks into the operand stack of the parser first,
to determine the namespace to be used, as soon as functions and namespaces are supported by the syntax.
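
A small sketch of that example, assuming the parser keeps its lookup function as a replaceable member. The names (ParserState, flat_lookup, scoped_lookup) and the fallback to the global namespace are assumptions made for illustration only.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

using SymbolId = int;
using Namespace = std::map<std::string, SymbolId>;

// Hypothetical parser state: one global namespace plus the namespaces that
// are currently on the parser's operand stack.
struct ParserState {
    Namespace global;
    std::vector<const Namespace*> operands;
    // The replaceable lookup hook; returns nullptr when the name is unknown.
    std::function<const SymbolId*(const ParserState&, const std::string&)> lookup;
};

const SymbolId* find_in(const Namespace& ns, const std::string& name) {
    auto it = ns.find(name);
    return it == ns.end() ? nullptr : &it->second;
}

// Initial version: a single unstructured namespace.
const SymbolId* flat_lookup(const ParserState& p, const std::string& name) {
    return find_in(p.global, name);
}

// Later version: consult the namespace on top of the operand stack first,
// then fall back to the global one (the fallback is an assumption).
const SymbolId* scoped_lookup(const ParserState& p, const std::string& name) {
    if (!p.operands.empty()) {
        assert(p.operands.back() != nullptr);
        if (const SymbolId* hit = find_in(*p.operands.back(), name)) return hit;
    }
    return find_in(p.global, name);
}
```

The rest of the parser only calls `p.lookup(p, name)`, so switching from flat_lookup to scoped_lookup is a one-line assignment once namespaces exist in the syntax.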

> This is a possible idea.  However, when you build parsers you have to be
> careful with your grammar, since different parsers can only process
> certain grammars.  LR parsers are supposedly fairly general and are
> fairly adapted to the table sort of approach.  Most compiler generators
> use LR parsing, so to use a parser generator might still be a better
> idea.  If we're using C++, then bison is the usual choice.  It could
> convert directly to the source parse tree.

I'd like to use a bottom-up, LR(1) parser, table-driven, with some "hooks" that allow for easy treatment of
special cases (such as control structures and parentheses). But the C++ code should be independent of the syntax
used, except for those "hooks".
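
To show what I mean by a syntax-independent kernel, here is a minimal sketch. The kernel (parse) only interprets tables; the grammar lives entirely in the data, and the single "hook" is a callback invoked on every reduction. The toy grammar (E -> E + n | n), the hand-built tables and all names are assumptions for this example; a real table would come from a generator.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <utility>
#include <vector>

struct Token  { char kind; int value; };   // kind: 'n', '+', or '$' (end marker)
struct Action { char type; int target; };  // 's' shift, 'r' reduce, 'a' accept
struct Rule   { char lhs; int rhs_len; };

struct Tables {
    std::map<std::pair<int, char>, Action> action;  // (state, terminal)
    std::map<std::pair<int, char>, int> go_to;      // (state, nonterminal)
    std::vector<Rule> rules;
};

// Hook: receives the rule number and the values of the right-hand side,
// returns the value for the left-hand side.
using ReduceHook = std::function<int(int, const std::vector<int>&)>;

// Syntax-independent kernel: returns true on accept and stores the value.
bool parse(const Tables& t, const std::vector<Token>& input,
           const ReduceHook& hook, int& result) {
    std::vector<int> states(1, 0);
    std::vector<int> values;
    std::size_t pos = 0;
    while (pos < input.size()) {
        auto it = t.action.find(std::make_pair(states.back(), input[pos].kind));
        if (it == t.action.end()) return false;          // syntax error
        const Action a = it->second;
        if (a.type == 's') {
            states.push_back(a.target);
            values.push_back(input[pos].value);
            ++pos;
        } else if (a.type == 'r') {
            const Rule& r = t.rules[a.target];
            assert(static_cast<std::size_t>(r.rhs_len) <= values.size());
            std::vector<int> rhs(values.end() - r.rhs_len, values.end());
            values.resize(values.size() - r.rhs_len);
            states.resize(states.size() - r.rhs_len);
            values.push_back(hook(a.target, rhs));
            auto g = t.go_to.find(std::make_pair(states.back(), r.lhs));
            if (g == t.go_to.end()) return false;
            states.push_back(g->second);
        } else {                                          // accept
            result = values.back();
            return true;
        }
    }
    return false;  // input ended without the '$' marker being accepted
}

// Hand-built tables for: rule 1 = E -> E + n, rule 2 = E -> n.
Tables sum_tables() {
    Tables t;
    t.rules = {{'S', 1}, {'E', 3}, {'E', 1}};  // rule 0 (S -> E) is never reduced
    t.action[{0, 'n'}] = {'s', 2};
    t.action[{1, '+'}] = {'s', 3};
    t.action[{1, '$'}] = {'a', 0};
    t.action[{2, '+'}] = {'r', 2};
    t.action[{2, '$'}] = {'r', 2};
    t.action[{3, 'n'}] = {'s', 4};
    t.action[{4, '+'}] = {'r', 1};
    t.action[{4, '$'}] = {'r', 1};
    t.go_to[{0, 'E'}] = 1;
    return t;
}
```

Changing the language then means supplying different tables and a different hook; parse stays as it is.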

> I'm not too sure what you mean by describing other objects, did you mean
> the capabilities you were referring to before about allowing the editor
> to adapt to the objects more intelligently via active method calls?
>

Yes, and the compiler as well.

In the editor, the structure of the layout tree and the behaviour of the tree items would be affected. That would
mean that the editor would process declarations, greatly simplifying overall language syntax (at least if you
don't use plain text input, which I would keep as an option as well).

In the compiler, the behaviour which is needed for declaration modifiers like "private" could be put into
objects, so that even these things could be "plugged in" with little changes to the compiler's innards.
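
A minimal sketch of how such "plugged in" modifiers might look, assuming the compiler keeps a registry keyed by keyword. Declaration, Modifier, ModifierRegistry and the "const" modifier are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical record the compiler keeps for each declaration.
struct Declaration {
    std::string name;
    bool visible_outside = true;
    bool writable = true;
};

// A modifier is an object that knows how to change a declaration.
struct Modifier {
    virtual ~Modifier() = default;
    virtual void apply(Declaration& d) const = 0;
};

struct PrivateModifier : Modifier {
    void apply(Declaration& d) const override { d.visible_outside = false; }
};

// Invented second modifier, only to show that adding one needs no compiler change.
struct ConstModifier : Modifier {
    void apply(Declaration& d) const override { d.writable = false; }
};

class ModifierRegistry {
    std::map<std::string, std::unique_ptr<Modifier>> mods_;
public:
    void plug_in(const std::string& keyword, std::unique_ptr<Modifier> m) {
        assert(m != nullptr);
        mods_[keyword] = std::move(m);
    }
    // Returns false if the keyword is not a known modifier.
    bool apply(const std::string& keyword, Declaration& d) const {
        auto it = mods_.find(keyword);
        if (it == mods_.end()) return false;
        it->second->apply(d);
        return true;
    }
};
```

The compiler's innards would only call `registry.apply(keyword, decl)`; what a keyword means is decided by whichever object was plugged in under that name.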

> Most of the parsing work doesn't really get removed, just moved into the
> editor, at least with textual editing.  In fact, with incremental
> compilation, you're really merging the entire editor and entire
> compiler.

True - the editor's behaviour and the compiler depend on each other, so if you make a major change to the language,
chances are that the editor *and* the compiler will have to be adapted.

> One thing I learnt about in Uni was recursive descent
> parsers.  They emit some strange messages sometimes, since they have
> code like:
>
> if have("if") then if_statement
> elseif have("for") then for_statement
> elseif ...
> else musthave("while")
>
> As a consequence if the next token is a "procedure", it says "while"
> expected, whereas it really expected any statement.

Actually, if you take that to an extreme, it would just point to the start of your module and say: "Not a valid
program" (because this is the outermost production rule). It's not so easy to determine at which level to produce
an error message. Bottom-up parsers are generally better at this.

> Don't get this wrong - parsing in intelligent editors is very tricky.

Parsing, yes, if it has to be done interactively (the user changes something in mid-program). But it would only
be complicated if the input were free text, not if the input already goes into an outline. Then the editor will
always know what sort of item you are editing at any particular place.

Just processing an input file to produce an outline for the editor is relatively easy because there is nothing
interactive to it. In fact, we could use the compiler with a special syntax fitted for that purpose.

> It is not well understood, and in that sense, we will be treading on
> loose ground.

To me, it doesn't seem so complicated, although I have to admit that I haven't figured out everything yet. But
I've been working with such a tool for a long time (SqlWindows, which I already mentioned). And, boy, what an
opportunity to do something which is not just mainstream, but better!

> > If comp.gen's produce tables, to be processed by some (syntax
> > independent) kernel, that would be better.
>
> They can produce anything you want to, for example in bison there are
> places to put C.  They're more like a framework.

You mean you could replace the output section of bison to generate a syntax table for a bottom-up parser?

> To produce code at runtime using bison we would need to be able to
> interface with C.

Why is that? It only makes sense to me if it generates C source, which I wouldn't like.

> > But I actually thought of being dependent on the Java VM, for example -
> ...
>
> Yes I was thinking about that, but I was never saying we should restrict
> ourselves to JVM long term.  I think that we should leave the back-end
> open for whatever target you want.

In a truly integrated environment like the one I have in mind, there is no "back end", because part of the source
code may need to be executed by the compiler and/or editor while development is in progress. Such a thing would
not be possible if the "back end" could not host our development system in a sensible way, and be changed while
running.

Maybe output to JVM could be done as a separate run after development has been completed, but debugging or any
other form of interaction with the source code while the program was running would be almost impossible. That
would be a step backwards compared to current tools.