Long - Re: Sleeping? Language Design.

Joseph E. Van Riper III fleeb@mindspring.com
8 Feb 98 13:35:06 -0500


Yes, I'm badly behind.  Kindly shoot me.  You could only do me a
favor.

On Tue, Jan 13, 1998 7:01 PM, David E. Manifold
<tril@bespin.ml.org> wrote:
|
|On Tue, 13 Jan 1998, Fare Rideau wrote:
|
|> Again, this is not gratuitous criticism. Anyone who's passionate
|> about language design should be able to do something original; the
|> problem is that something that's original with respect to a
|> primitive notion of computing is most likely to be a long-known
|> thing; if you want to be original with respect to modern, elaborate,
|> notions of computing, do learn them. We're in 1998, not in 1968.
|> Don't do 1968 computing!
|
|Ahh, but we aren't inventing any new concepts.  I see Tunes as simply
|a unification of many existing concepts.

Which is, in a subtle way, new.

Part of my background is music; music cannot be understood except
through its relationships with prior forms of music.  That is to say,
we tend to prefer the music we're familiar with, and build new tunes
(take the pun if you will) from snippets of old tunes we enjoy.

For proof, try listening to pure Indian music.  It sounds like it's
out of tune, because they use a different system of temperament than
Westerners (incidentally, to some people's ears, our music sounds out
of tune, and frankly, if you ever listened to a work performed in
mean-tone temperament, then performed again in equal-tone temperament,
you'd think the equal-tone sounded out of tune, despite the fact that
it's the standard we use today.. but I badly digress).

Similarly, the patterns and ideas we derive from computer system
design (whether soft- or hardware) are often derived from other
ideas; it does not demean the effort to combine old ideas in original
ways.  It's simply a way of being conservative.  Besides, changes in
hardware have prompted changes in software (eg: RISC architectures
have come into vogue, due to cheaper memory, leading to larger
programs because of the additional instructions required to
accomplish a task).

|Yes, I do intend on incorporating "notions" from both old and new
|languages into the Tunes metalanguage.  But you have found most of
|the concepts yourself and put them on your web page!
|
|My reasoning for not looking into new languages is that I'm not
|really designing a language.  I'm just describing the way we talk
|about languages.  Anyone can do that, but it requires a little
|abstract thinking (something I enjoy).

But if Tunes will not embrace a virtual machine, yet desires to handle
distributed computing ("Remote Method Invocation," to borrow a phrase
from Java), it must settle upon a standard language.  This language
will need to be sufficiently powerful to handle a broad range of
concepts, if it's to be Useful and Not Expedient.  Unless I miss my
guess, Faré makes his suggestion to study a variety of languages not
for the benefit of learning the languages, but to grok the concepts
behind them... that perhaps a new language combining all these
concepts might be entertained.

|The actual language features can be added after I initiate the
|framework (consisting of the most abstract concepts).

A tricky proposition.  C++ offers an example of what can happen when
one adds features to a language after the initial framework.  And how
may one entertain the most abstract concepts without a broad-based
grounding in several styles of computer languages?

Sure, "loops" are in nearly every language, but multiple inheritance
is not.  And suppose one wished to entertain the idea of visually
representing the program instead of literally representing it, =E1 la
Prograph (eg. iconic verses character representation of data
structures and communications)?
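
To make the contrast concrete, here's a minimal sketch of multiple
inheritance (Python syntax purely for illustration; the class names
are made up):

---
# One class drawing behavior from two parents at once -- a concept
# some languages have and many do not.
class Persistent:
    def save(self):
        print("saved")

class Drawable:
    def draw(self):
        print("drawn")

class Widget(Persistent, Drawable):
    pass

w = Widget()
w.save()   # inherited from Persistent
w.draw()   # inherited from Drawable
---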

We haven't even begun to discuss whether or not such concepts as
'multi-threading', 'synchronization,' or 'method-locking' are
sufficiently abstract to consider in the early design of a language
(or maybe we have and I'm just behind.. if so, my apologies.. I've
been working 12-hour days, so I'm lucky I even have a mind anymore).
Heck, there are some issues that have names I cannot remember right
now (eg. Amiga E allows a method to return more than one value, and
it also allows the use of a kind of 'two-unit' data structure
analogous to LISP's that handles garbage collection, although I have
yet to figure out how to use it to greatest advantage.. which speaks
to a weakness in my understanding of this kind of data structure).
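
For what it's worth, here's a rough sketch of what I mean by both
features (Python syntax purely for illustration; the names are made
up, not Amiga E's):

---
# The LISP-like 'two-unit' cell: each cell holds a value and a link
# to the next cell, and a list is just a chain of such cells.
class Cons:
    def __init__(self, head, tail):
        self.head = head   # the value in this cell
        self.tail = tail   # the next cell, or None at the end

lst = Cons(1, Cons(2, Cons(3, None)))   # the list (1 2 3)

# And a method returning more than one value:
def min_and_max(a, b):
    return min(a, b), max(a, b)

lo, hi = min_and_max(4, 2)
---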

|At this point, it is my opinion that there is plenty of work I can do
|on Tunes that doesn't require me to learn any new languages.  The
|features from your web pages should be enough.

That's probably pretty accurate; I suppose what I'm pointing out is
that a language for Tunes will likely need more time and effort than a
single person can carry out, simply because a single person doesn't
know all the kinds of languages out there.  It might be better to
assemble a core of people familiar with a variety of languages, and to
communicate how those ideas are best represented in a language, to see
if a common language can be derived from the collaboration.

I sorta see the possibility of laying down the language, then
modifying it to include all the interestingly useful benefits of the
various languages that currently exist, to create the final language
specification.  Perhaps this is exactly what you're saying, and I'm
just parsing it badly.

|My main concern is that Tunes proceed immediately.  Learning a
|language would seem to slow me down.  If learning a language will
|ever help Tunes speed up, I would be willing to do it.  But in
|general, I don't believe in duplication of work.  You have done fine
|work on your web pages for Tunes, and I see no reason to duplicate
|the research.  I'm sure you and the list can be a fine advisor for
|any questions I may have about languages and language features.

Jimmy Kerl wrote:

> o  The value of a language is based upon 2 things:
>         1. Extensibility of the language.  --  Being able to define
>              user defined types, functions and other identifiers.
>         2. The ability of the Language to interact with the
>              operating system.
>    A language must do BOTH of these well to be a good language.

Faré stated that this was too vague, although I think occasionally
a nicely vague way of expressing a concept can help a person to see
the forest while the details show the trees.  Which is to say, yes,
it's vague, but still somewhat useful.

The only major problem I have with Mr. Kerl's broad expression is that
it ignores the user.  Rather, I'd say:

The value of a language is based upon 2 things:
	1. Extensibility of the language -- Being able to define user defined
types, functions and other identifiers.
	2. The ability of the language to facilitate communications between
the machine and the user.

Assembly offers an excellent example of a language that is extremely
extensible (particularly where Macro assemblers come into play) and
interacts very well with the operating system, but leaves the user
screaming horrible curses in a variety of languages quite foreign to
the damned machine.

However, Smalltalk provides a wonderful example of a language that is
extremely extensible, interacts very well with the user (at least in
my opinion), but slogs on the machine since it's hampered pretty badly
with late-binding everything and virtual machines and blah blah blah.

I'd love to see a version of Smalltalk that could speak to the machine
better.  Or better yet, see a dual language that combines iconic
representation with literal flexibility, allowing one to painlessly
decide to late- or early-bind a method (à la MOPS for Macintosh,
except MOPS doesn't have as friendly a representation as Smalltalk,
and isn't iconic at all).
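
Just to pin down what I mean by that choice, a toy sketch (Python
syntax purely for illustration, not MOPS or Smalltalk; the names are
made up):

---
def area_circle(r):
    return 3.14159 * r * r

# Early bound: the target is fixed once, before any call is made.
area = area_circle
print(area(2.0))

# Late bound: the target is looked up by name at each call, so it
# can be replaced while the program runs -- flexible, but slower.
methods = {"area": area_circle}
print(methods["area"](2.0))
methods["area"] = lambda r: 3.0 * r * r   # rebound on the fly
print(methods["area"](2.0))
---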

Years ago, I had suggested something like the virtual machine idea,
and was slapped down by Faré.  Very recently (today), the same thing
has kind of happened, except this time someone put up an argument for
the virtual machine.  Now, I understand Faré's resistance to this
idea, and have to agree with it.  And I also understand that there was
a misunderstanding as to my original proposition.

Is it more important for a user to use whatever language they want,
and to be able to compile their creation on whatever platform Tunes
has been ported to, or is it more important to have an efficiently
running program and require the user to use a single language for the
operating system (or endure 'hacks' that convert from another language
to the common one)?  Are these two even mutually exclusive?

Is it possible to design a single language that embodies all the
variety of concepts that currently exist in the various programming
languages available today, make this language port across every
platform Tunes finds itself designed for (presuming Tunes intends to
find itself on a wide variety of platforms, to include PC, Mac, Amiga,
and god forbid, Atari ST), yet be easy enough to use that it doesn't
require reading a huge tome?  And what about foreign languages?  Would
it be possible to easily modify the language to provide international
support (eg. French, German, Japanese, Hebrew, etc)?  And how about
converting from one human-language over to another human-language,
while maintaining the same machine-language?  Then there's reflection;
if I create a method called "spiffyMethod", how do we translate its
name to something a non-English-speaking person would understand?  Are
these even issues to concern ourselves with?

Looking from the bottom up, I have to agree with Faré; using a
virtual machine is a bad idea because they're notoriously inefficient.
Looking from the top down, I can't completely agree with Faré; a
single language for use on an operating system imposes a major
restriction that alienates everyone unwilling to work with that
language.

However...

A possible compromise might be to design a tokenized language that is
scanned just as the text files of ordinary programs are scanned,
processed, and compiled into something the machine and operating
system can cope with.  The downside is that one might have to compile
the code twice; first from the human language to the tokenized
language, and then from the tokenized language to the machine language
(unless someone wants to write code in the tokenized language
directly.. but I'd presume this would look ugly).
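
In rough outline, the double compile would look something like this
(Python syntax purely for illustration; every name here is made up,
and both stages are trivially faked):

---
def compile_to_tokens(source_text):
    # Front end: reduce the human-readable language to a compact,
    # portable token stream (faked here as a simple word split).
    return source_text.split()

def compile_to_native(tokens, target_cpu):
    # Back end: turn the portable tokens into code for one machine
    # (faked here; a real back end would emit machine code).
    return "code(%s) for %s" % (" ".join(tokens), target_cpu)

tokens = compile_to_tokens("wait: thisCondition ; rest of code")
binary = compile_to_native(tokens, "m68k")
print(binary)
---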

However, this might not be too bad; one only needs the tokenized
language for distributed computing across platforms, so its use could
be reserved for exactly that.  Therefore, one could design a language
such that it allows certain methods to have this dual-compiled nature.
If the method is to be distributed across platforms, it gets both the
tokenized language and the machine language output.  When the method
is requested across a communication channel (disk, network, serial
port, whatever), the tokenized language is sent, compiled on the other
end into native code, and kept there unless/until the system wants to
be rid of it.. which in some cases might be never (eg. updates to the
operating system).
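
As a sketch of that request path (Python syntax purely for
illustration; the channel and every name here are made up):

---
class Channel:
    # Stand-in for a disk, network, serial port, whatever.
    def fetch(self, name):
        return ["tokens", "for", name]

def compile_to_native(tokens, target_cpu):
    return "native(%s) for %s" % (" ".join(tokens), target_cpu)

native_cache = {}

def request_method(name, channel):
    # First request: the tokenized form crosses the channel and is
    # compiled once on this end.  Later requests reuse the native
    # result unless/until the system evicts it.
    if name not in native_cache:
        native_cache[name] = compile_to_native(channel.fetch(name),
                                               "local")
    return native_cache[name]

print(request_method("spiffyMethod", Channel()))
---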

This was my original idea, as I wished I had stated it in 1995 when I
first joined this list, but I mistakenly said "virtual machine".  I've
been cleaning up my vocabulary a bit, trying to figure out a better
way to express this idea.

I see several benefits to this.

1. Developers may distribute their code via network/disk/cd/whatever
without fear of the platform being inappropriate.  Sort of like Java's
"write once, run anywhere," but without the overhead, and limited to
the Tunes operating system.  I'm not talking about a virtual machine.
Perhaps a virtual compiler.. I dunno.

2. One gets the benefits of a common language without the runtime
penalties (unless, of course, one codes poorly.. but more on that
below).

3. The developer is free to work with whatever language s/he desires,
provided the language supports conversion to the tokenized language.
Folks designing languages to work for Tunes would find themselves
supporting this tokenized language anyway, if they wanted to take
advantage of Tunes' benefits.

4. The paranoid developer would have 'uglified' code (provided the
tokenized language is sufficiently nasty to look at).  Heck, wrap some
nasty encryption scheme around it and you might even have secure ugly
code.  Frankly, I think code should be 'free to roam', but as was
stated, "As long as there's money..."

There's one big potential problem I can see...

If the language the developer uses converts to another language with
its own idiosyncrasies, how will s/he be able to ensure the code is
optimized to best work in the other language, from within his/her own
language?

That is, suppose you have (in a made-up language, of course):

A:
---
wait_state:
   unless: thisCondition goto: wait_state;
[rest of code]
---
versus B:
---
wait: thisCondition;
[rest of code]
---

After everything is converted, B works more efficiently than A, since
it doesn't waste clock cycles.  But how would the programmer know? 
Maybe A works more efficiently because it takes advantage of some
optimization feature in the tokenized language that B avoids?

I'm somewhat inclined to believe that this problem, in one form or
another, would exist no matter what, but I suspect it might be made
more difficult to deal with given the extra language layer.

I say again, I'm not suggesting a virtual machine.  I am not thinking
there would be some daemon reading a stream thrown at it and either
spewing machine-code that disappears after a garbage collector eats
it, or interpreting the thing byte by byte (or even word by word),
carrying out instructions.  I'm thinking of code (perhaps a daemon, if
it's appropriate) that reads a stream thrown at it, compiles it into
machine-code, and saves it to wherever it should be stored.. if stored
in memory, it disappears and would need to be recompiled later, ergo,
'virtual' in nature.. if stored on hard disk or the like, it may be
used just like a regular program.  The stream it reads would be just
like any old language, except in this case the language is tokenized,
so the stream could be read (if marginally) faster than a standard
text file.
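
The contrast, as a sketch (Python syntax purely for illustration; all
the names are made up):

---
# What I am NOT proposing -- an interpreter loop, where every run
# pays the per-token cost again:
def interpret(tokens):
    for t in tokens:
        print("executing", t)

# What I AM proposing -- a compile-once step whose output is kept.
# On disk, it runs later like any regular program; in memory only,
# it may need recompiling after it's evicted.
compiled = {}

def install(name, tokens):
    compiled[name] = "native(%s)" % " ".join(tokens)
    return compiled[name]
---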

As I read over some of the things in the e-mail discussions, I'm
beginning to believe this was the plan all along, and I'm just
rewording things so I can grasp it better.  That would be the
journalist in me, I suppose (yes, I've had a bizarre past.. I'm sure
my future won't be much different).

If this is indeed the case, I suppose that starting on the language
early on won't be much of a problem; a tokenized language can be
extended more easily than a non-tokenized language, since the
tokenized language isn't intended to be read by humans.  That is, you
may define a token to represent that you're dealing with a data
structure, or another token to represent a procedure of some kind,
then modify each of these tokens to further describe the nature of
the data structure or procedure, while other tokens describe fairly
simplistic ideas (loops, labels, conditionals, blocks of information,
etc).  As a new concept needs to be incorporated into the tokenized
language, an appropriate new token is created, and the compiler
modified to handle the new token (although optimization becomes an
interesting bugaboo).
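
A sketch of why that extension is cheap (Python syntax purely for
illustration; the token names are made up): the compiler is little
more than a table from token to handler, so a new concept costs one
new token and one new handler.

---
def emit_loop(args):   return "loop code for %s" % args
def emit_struct(args): return "layout for %s" % args

handlers = {
    "LOOP":   emit_loop,
    "STRUCT": emit_struct,
}

def compile_token(token, args):
    return handlers[token](args)

# Later, a new concept arrives: add a token and its handler.
def emit_inherit(args): return "inheritance glue for %s" % args
handlers["INHERIT"] = emit_inherit

print(compile_token("INHERIT", "Widget"))
---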

- Trey