[Fwd: Language Syntax Suggestion]

Hans-Dieter Dreier Ursula.Dreier@ruhr-uni-bochum.de
Sat, 13 Mar 1999 01:05:28 +0100



Matthew Tuck wrote:

> Hans-Dieter Dreier wrote:
>
> >> No they're compatible.  ASTs are augmented by the extension definitions
> >> used in the file to allow expansions as necessary.  How a textual
> >> parser will handle extensions with no defined parse shorthands is
> >> another question.
> > If some program can't handle something decently anymore because it has
> > no definition for it, I'd call that incompatible. What else?
>
> At the moment I don't see an alternative in a multi-viewing world.
> Extensions can be used in other views which will not be understood by
> other views.  But if they just translated to full form the extension
> would become expanded in the view that created it.

What's the difference between an extension and a shorthand? Could you please
give your definition of a shorthand, especially in what respect it differs
from a (general) extension? Maybe I was misled by my assumption that a
shorthand must be something like an abbreviation and is thus semantically the
same as its long form. "Semantically the same" here means that both (the
shorthand and its non-shorthand equivalent) would produce code that yields the
same result (though efficiency or size may vary). It would imply that
translation should be possible in both directions. If so, translation would
effectively be something like an optimisation at the AST level.

> The difficult part is designing the textual view so that it understands
> foreign extensions, which I don't see as being insurmountable.  Some
> standard rule could be used for parsing and deparsing of these, in the
> absence of a shorthand.

How would the compiler do its code generation for an unknown shorthand? If it
couldn't translate it, what would the ability to edit it be good for (except
for a better error report), if you couldn't process it further? If it could,
why would that shorthand be called "unknown"?

> > Either "while" or "do" is a noise word here (like "if x then ..." where
> > either if or then is a noise word). A loop is just one construct, so it
> > is represented by one node. It has just one parameter, the loop body.
> > It's an endless loop. Breaks are handled separately inside, so you can
> > have any number of them at any place as long as it's inside a loop.
> > Other loop constructs (for, do..while, while..od, you name it) can be
> > viewed as special (crippled) cases of an endless loop with breaks, so
> > nothing else is needed. Keep it simple.
>
> The problem with this is that it doesn't store in tree whether or not
> the extension is collapsed or expanded.  If you leave it expanded, which
> would be the simple thing, shorthands are basically typing conveniences
> since they won't stick.  Therefore they do not give readability and
> modifiability benefits.

Why can't a typing convenience increase readability? And if there were
automatic translation rules and the user could specify which one he wanted for
editing, why would it make a difference which one was stored in the AST?
Either the AST would already contain the desired form or it would be
automatically converted to that form. The user wouldn't notice.
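
To illustrate what I mean, here is a minimal sketch in Python (not Ultra
syntax). The node kinds "while", "loop", "seq", "if", "not" and "break" and
both function names are assumptions made up for this example. It expands a
"while" shorthand into an endless loop with a break and collapses it again, so
either form could be the one stored in the AST:

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Node:
    kind: str               # hypothetical node kinds, e.g. "loop", "while", "break"
    children: Tuple = ()


def expand_while(node):
    """while(cond, body) -> loop(seq(if(not(cond), break), body))."""
    if node.kind != "while" or len(node.children) != 2:
        return node
    cond, body = node.children
    guard = Node("if", (Node("not", (cond,)), Node("break")))
    return Node("loop", (Node("seq", (guard, body)),))


def collapse_while(node):
    """Inverse direction: fold the guard pattern back into a while node."""
    if node.kind != "loop" or len(node.children) != 1:
        return node
    inner = node.children[0]
    if inner.kind != "seq" or len(inner.children) != 2:
        return node
    guard, body = inner.children
    if guard.kind == "if" and len(guard.children) == 2:
        test, action = guard.children
        if test.kind == "not" and len(test.children) == 1 and action == Node("break"):
            return Node("while", (test.children[0], body))
    return node  # any other endless loop is left untouched
```

Because collapse_while(expand_while(w)) gives back w for any such "while"
node, the editor could show whichever form the user asked for, regardless of
which one happens to be stored.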

> If you try to compulsorily collapse extensions you're left with
> extensions always being collapsed when you load them, which is
> probably not desired.  Plus the hassle of searching the tree for known
> extensions every load time.

This hassle would be the price to pay for the flexibility. But I think it
might not be as big as you expect, since such conversions would normally be
performed only when editing or compiling, or once at load time. If we want to
be able to use different editors / different syntaxes for the _same_ program,
how could it be done otherwise? Wouldn't you think that such a feature would
be a great thing? Take Jeremy Dunn's syntax: He likes it, and other people
with a similar background would like it, too, while yet other people would be
confused (or at least might prefer some other representation). Yet they could
exchange programs and work together. I don't know of any other language
(except some Lisp dialects, maybe, to some very limited degree) which would
allow this. It would be a real innovation, and a very useful one as well.

Because of the "hassle" (= a slower development environment), one would of
course like to limit the number of translations available at any given time.

> Also, the extensions will not have any effect on other views.
> Extensions are usually defined since they are semantically useful rather
> than syntactically useful, and this suggests they are language-wide
> rather than view-wide.

Certainly. If you present an extension to an environment which does not
understand it, it cannot do anything other than maybe display some default
representation (which may or may not be useful) and refuse to compile that
stuff. And if a compiled version were included in the loaded workspace, it
would likely produce a link error, since the extension would depend on some
class or primitive (MC) that is not present.
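
As a rough sketch of that behaviour (Python, purely illustrative; the tuple
encoding of the tree, the RENDERERS table and the names render,
check_compilable and UnknownExtension are all assumptions of mine): a view
falls back to a generic "kind(args)" display for node kinds it has no
definition for, while the compiler side refuses them.

```python
class UnknownExtension(Exception):
    """Raised when the tree contains a node kind this environment lacks."""


# Hypothetical per-view display rules for the node kinds this view knows.
RENDERERS = {
    "loop": lambda args: "loop { " + "; ".join(args) + " }",
    "break": lambda args: "break",
}


def render(tree):
    """Display a tree; unknown kinds get a default 'kind(arg, ...)' form."""
    if not isinstance(tree, tuple):
        return str(tree)                      # leaf, e.g. an identifier
    kind, *children = tree
    args = [render(child) for child in children]
    renderer = RENDERERS.get(kind)
    if renderer is not None:
        return renderer(args)
    return f"{kind}({', '.join(args)})"       # default representation


def check_compilable(tree, known_kinds):
    """Refuse to compile as soon as an unknown node kind is found."""
    if not isinstance(tree, tuple):
        return
    kind, *children = tree
    if kind not in known_kinds:
        raise UnknownExtension(kind)
    for child in children:
        check_compilable(child, known_kinds)
```

So a foreign node such as ("unless", "x", ("break",)) can still be shown and
edited in its default form, but check_compilable rejects it with an error that
names the missing extension.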

> > > Well whether the compiler will see shorthands is not important.  They
> > > could get expanded out in a preprocessing step, or we might leave some
> > > known ones there for efficiency reasons.
> > If they have an efficiency impact inside the generated code, can they
> > really be regarded as mere shorthands? Aren't they true extensions then,
> > rather? Otherwise the compiler could achieve the same by optimisation.
> > Where do you draw the border between language extension and shorthand?
>
> Well they could have an efficiency impact in two ways - optimising and
> code generation could be faster since they are smaller, but also because
> higher level structures can have more said about them, therefore
> optimisation information would be known earlier.

Agree. Here you mention that shorthands "could get expanded out in a
preprocessing step" which implies that there has to be a translation rule at
least in that direction. Maybe translation in the other direction would be
somewhat awkward, but would it be impossible?

> The second way is through better optimisation.  Specific structures
> might have known properties as mentioned above, but the general
> structures are too hard.  A normal optimiser could actually try to
> recognise these structures, which is essentially collapsing shorthands.
> So since it can do its own collapsing of this type of shorthand, it's
> unlikely to give this benefit in a good implementation.

OK, but I think we might be able to overcome this problem. I imagine a set of
(bidirectional) translation rules between AST representations A <-> B, each
with tags like "Use A->B if editing", "Use B->A for AST space efficiency",
"Use A-> to enable further optimisation" and so on. Translation rules would be
applied as long as a step in the desired direction could be made. In case of a
conflict between translation rules, a cost factor could help to resolve this
(and would also indicate the direction to take, by its sign).
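
A minimal sketch of such a rule set, in Python and only as an illustration:
the Rule class, the purpose tags "edit" and "codegen", the tuple encoding of
the tree and the sign convention (negative cost means "go A->B", positive
means "go B->A") are all assumptions of mine. For brevity the rules are tried
at the root of the tree only.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Rule:
    name: str
    a_to_b: Callable[[tuple], Optional[tuple]]   # returns None if it does not apply
    b_to_a: Callable[[tuple], Optional[tuple]]
    cost: Dict[str, float] = field(default_factory=dict)  # purpose tag -> signed cost


def step(rule, tree, purpose):
    """Try one step in the direction the sign of the cost asks for."""
    c = rule.cost.get(purpose, 0.0)
    if c < 0:
        return rule.a_to_b(tree)
    if c > 0:
        return rule.b_to_a(tree)
    return None                                  # rule is not tagged for this purpose


def translate(tree, rules, purpose, max_steps=100):
    """Apply rules while any step in the desired direction can be made."""
    for _ in range(max_steps):
        candidates = []
        for rule in rules:
            result = step(rule, tree, purpose)
            if result is not None:
                candidates.append((abs(rule.cost[purpose]), result))
        if not candidates:
            return tree
        # Conflict resolution: the rule with the largest cost magnitude wins.
        tree = max(candidates, key=lambda pair: pair[0])[1]
    raise RuntimeError("translation rules did not settle")


def while_to_loop(t):
    if t[0] == "while" and len(t) == 3:
        _, cond, body = t
        return ("loop", ("seq", ("if", ("not", cond), ("break",)), body))
    return None


def loop_to_while(t):
    if t[0] == "loop" and len(t) == 2 and t[1][0] == "seq" and len(t[1]) == 3:
        guard, body = t[1][1], t[1][2]
        if (guard[0] == "if" and len(guard) == 3
                and guard[1][0] == "not" and guard[2] == ("break",)):
            return ("while", guard[1][1], body)
    return None


# A = "while" shorthand, B = endless loop with break.
WHILE_RULE = Rule("while", while_to_loop, loop_to_while,
                  {"edit": 1.0, "codegen": -1.0})
```

With this rule, translate(tree, [WHILE_RULE], "codegen") expands a "while"
node and translate(tree, [WHILE_RULE], "edit") collapses it again. A real
rule set would also have to walk the whole tree and guard against rules that
undo each other.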

> I think whether they're "language extensions" or "shorthands" is a
> semantic quibble, it doesn't really matter what category you put them
> in.

I don't think so. I clearly see at least two categories: real extensions,
which add functionality and cannot (sensibly) be expressed otherwise, and
extensions for which another representation exists along with translation
rules. The latter I would call shorthands (more precisely, the shorter of the
two edit forms). A third category might be extensions for which a translation
in only one direction is given. An example of this category might be an
upgrade facility.

> > How would you define the term "view"?
>
> Probably something like: a window which has code associated with it to
> allow the reading or writing or both of the program and associated
> entities.  I would call a "cooperative view" a view which keeps up to
> date with changes made in other views, or by the compiler (error message
> annotations mainly).  Cooperative views can obviously be used in
> parallel with other views.
>
> Examples would include structural editors, semi-structured textual
> editors, class browsers, error message displayers, extension definers,
> etc.  They would access the program through a specific API which is
> designed for views to be linkable to the editor at runtime as desired.

I agree. We should try to integrate those tightly. I don't like my desktop to
be cluttered by lots of windows.
Three types of editors should be enough:
- Graphical editors to draw windows and print reports.
- Outline (structure) editors for structured stuff (source, compiler error
messages, debug stuff), ideally all included in one big outline that could
"spawn" restricted views on demand (restricted either by selecting a tree root
or by just displaying or skipping some node types / attributes) and could
itself be restricted by skipping some node types / attributes.
If everything that can be displayed in an outline is contained in the main
outline, it can't get "lost". The user interface is simplified since no
statically built-in buttons / menu choices are needed to bring up specialized
windows. Instead the user can design their toolbar / menu from standard
components, to spawn those few editor types parameterized according to the
user's specs or to perform other tasks. The simpler the structure of the
development environment is, the better it can be customized. And the easier it
is to create, BTW.
- Text editors as an auxiliary device to show external data (text files and the
like).
Hopefully not needed very often.

> >> Yes, a func/proc that takes a func/proc reference/object.
> > So no (actual) parameters are included? Then it's nothing special, since
> > I regard references as data.
>
> If you're referring to standard imperative routine parameters, then yes,
> basically the same, except that you can pass closures (routines that
> are the result of partial application).

I understand this to mean that some, all or none of the actual parameters may
be included in such an object. Hence it's a generalisation of a simple
func/proc reference like the ones found in imperative programming.
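
A small illustration of that reading, using Python's functools.partial as a
stand-in (the names scale and apply_to_all are made up for the example and say
nothing about how Ultra would spell this):

```python
from functools import partial


def scale(factor, x):
    return factor * x


def apply_to_all(routine, items):
    """A routine that takes another routine as a parameter."""
    return [routine(item) for item in items]


plain_reference = abs              # no actual parameters included
double = partial(scale, 2)         # one actual parameter already bound
ten = partial(scale, 2, 5)         # all actual parameters bound

print(apply_to_all(plain_reference, [-1, 2]))   # [1, 2]
print(apply_to_all(double, [1, 2, 3]))          # [2, 4, 6]
print(ten())                                    # 10
```

All three objects can be passed around as data; they differ only in how many
of the actual parameters they already carry.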

> Functional languages tend to have a lot of useful higher-order functions
> as standard.  Imperative languages with routine parameters tend to often
> lack them for some reason.  Jeremy touched on a few in his first
> message.

Yes, that potential is not supported well in most imperative languages.


BTW, maybe we should distinguish two "levels" (maybe you know a better word?)
that make up a program source.
First, there is a declarative level, starting from the root (module) and going
down to the single declaration.
Second, there is the imperative level, starting from a program body
(declarations not included) or other clause (e.g. initialisation expression
attached to a declaration), going down to the single token inside a program.

Declarative parts would probably not be included inside imperative parts, but
imperative parts definitely need to be included inside declarative parts.
Declarative parts are extremely well suited for outline editors (which are
also good at imperative parts, to be sure).

We should note in our discussion which level we are talking about. IMO this
was primarily the imperative level as far as shorthands were concerned. But
certainly extensions will be made to both of them as Ultra evolves; probably
less often to the declarative level since those tend to be more fundamental.


Regards

Hans-Dieter Dreier