Language Syntax Suggestion

Jeremy Dunn jeremydunn@ibm.net
Sun, 07 Mar 1999 19:14:12 -0800


Matthew Tuck wrote:

>Actually the term "collection" seems to be becoming the standard for
>discussing this sort of structure - set is reserved for unordered
>collections without duplicates, as with maths.  Lists are ordered
>collections with duplicates.  Arrays are often looked upon as some sort
>of collection, but I consider them to be functions from indexes to
>elements.  That is to say, you could take a set of indexes (domain), OR
>a set of elements (range), but it is not a set itself.

Fine, a COLLECTION then. I don't care what we call it just so long as we
all agree on what we are talking about. When I say SET I mean a
collection of items that may or may not be ordered and which may or may
not be of the same type and which may or may not have duplicates.

>Basically you're looking to be able to define literals for both sorts of
>collections.  I'm not entirely convinced there's a reason to do this
>though.  If you create a list literal, the conversion to any other type
>of collection is rather straightforward.  The only disadvantage is one of
>typing.  If assigning to a set you might want to ensure no dupes were
>entered, since it might be a bug somewhere.

Defining literals for both ordered and unordered collections can have
its advantages in certain cases. Let us say I write the expression
{A.3.4} (where A is the subtraction operator), this represents both the
expressions {A,3,4} and {A,4,3}. The result of the expression {A.3.4}
would then be a set of all possible results obtained by permuting the
arguments, or a set {,-1,1} in this case. There are certain functions
(string pattern matching comes to mind) where this is useful for
defining a range of inputs rather than explicitly typing them all out.
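As a rough illustration of the unordered case, here is a minimal Python sketch. The names unordered_apply and subtract are hypothetical and not part of the proposed syntax; the sketch only assumes that an unordered argument list means "try every ordering and collect the distinct results".

```python
from itertools import permutations

def unordered_apply(func, *args):
    """Apply func to every ordering of args and collect the distinct results."""
    return {func(*ordering) for ordering in permutations(args)}

def subtract(x, y):
    return x - y

# {A.3.4} stands for both {A,3,4} and {A,4,3}
results = unordered_apply(subtract, 3, 4)
print(sorted(results))  # [-1, 1]
```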

>This is basically what LISP does.  Everything is done with lists,
>including
>function applications.

This is true to a point. LISP does allow you to create lists like '(a b
c) but it does not treat what I call a FUNCTION SET as a list, i.e. you
cannot write something like (nth (+ 2 3 4) 1) to extract one of
the arguments to the expression. Being able to do this would eliminate
the need for many math functions that are necessary in other languages.
I will come back to this point later. LISP also does not allow you to
collect functions or arguments in the manner that I was proposing using
the apostrophe and caret. This feature shortens expressions and reduces
the amount of bracketing that one has to do. I tried this out on my CRC
math tables book with large lists of equations to experiment with
notational choices and found this to work far better than other choices.
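To make the idea of treating a function set as a list concrete, here is a small Python sketch. It assumes the expression is stored as a tuple whose first element is the function, and that index 0 refers to the function itself; both choices, and the names nth and evaluate, are illustrative assumptions only.

```python
import operator

def nth(expr, index):
    """Return element `index` of a function set; index 0 is the function."""
    return expr[index]

def evaluate(expr):
    """Apply the function to its arguments, working left to right."""
    func, *args = expr
    result = args[0]
    for arg in args[1:]:
        result = func(result, arg)
    return result

expr = (operator.add, 2, 3, 4)   # stands in for (+ 2 3 4)
print(nth(expr, 1))    # 2
print(evaluate(expr))  # 9
```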

>Just a question - why do you allow two different ways of writing
>strings,
>yet not two different ways of writing expressions?

I assume you mean why do I allow both (boat) and (,b,o,a,t)? In a holor
set like {,1,2,3} we have to write it this way because we have to ensure
that we are not confusing it with {123} which would simply be the
integer 123 rather than the separate integers 1, 2 and 3. In a string
set it is fairly safe and convenient to make the assumption that
whatever is in the set is an ordered set. We want to do this so that we
do not have to double our typing by putting in all those commas. We do
have to add some extra punctuation if we have any unordered elements
however. For instance, if we write (,book.k.e,eper) we have to add the
two commas to indicate that the two parts of the string on either side
of the "ke" are ordered while only "ke" is unordered. We can use the
convention that the ordering remains the same until changed by a new
ordering. Doing this we can eliminate one more period and write
(,book.ke,eper), which would be equivalent to both the string
"bookkeeper" and "bookekeper". These types of conveniences only work for
strings and would be very ambiguous for numbers, which is why I made the
special case for strings.
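The expansion of such a string can be sketched in Python as follows. This does not parse the (,book.ke,eper) notation itself; it assumes the pattern has already been split into (text, ordered) pairs, and the function name expand is hypothetical.

```python
from itertools import permutations, product

def expand(segments):
    """Return every string that a list of (text, ordered) segments stands for."""
    choices = []
    for text, ordered in segments:
        if ordered:
            choices.append([text])
        else:
            # An unordered segment stands for every distinct arrangement
            # of its characters.
            choices.append(sorted({"".join(p) for p in permutations(text)}))
    return ["".join(parts) for parts in product(*choices)]

# (,book.ke,eper): "book" and "eper" are ordered, "ke" is unordered
pattern = [("book", True), ("ke", False), ("eper", True)]
print(expand(pattern))  # ['bookekeper', 'bookkeeper']
```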

>I thought that was a set?  Does that mean I can assign a complex number
>to
>a set of numbers?

This refers to a complex number being denoted as {,a,b}. Yes, a complex
number is simply a specific type of Holor set (or matrix). Any holor can
be part of another holor. You could write {{,a,b}{,c,d}} if you wanted
to. The addition, subtraction, division and multiplication operators all
become polymorphic and perform matrix-matrix, matrix-vector,
matrix-scalar, vector-scalar and scalar-scalar forms of the operation.
The arithmetic operators determine the kind and order of the input holor
and perform the holor algebra necessary with no need for separate
functions to do all of these tasks.
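A minimal Python sketch of one such polymorphic operator, addition only, might look like this. It assumes holors are represented as numbers or nested lists and that a scalar is added to every element of a list; that broadcasting rule and the name add are my assumptions for illustration, not a full holor algebra.

```python
def add(x, y):
    """Add two holors given as numbers or nested lists, broadcasting scalars."""
    x_is_list, y_is_list = isinstance(x, list), isinstance(y, list)
    if not x_is_list and not y_is_list:
        return x + y                      # scalar-scalar
    if x_is_list and y_is_list:
        if len(x) != len(y):
            raise ValueError("holors must have matching shapes")
        return [add(p, q) for p, q in zip(x, y)]
    if x_is_list:
        return [add(p, y) for p in x]     # holor-scalar
    return [add(x, q) for q in y]         # scalar-holor

print(add(1, 2))                      # 3
print(add([1, 2], [3, 4]))            # [4, 6]
print(add([[1, 2], [3, 4]], 10))      # [[11, 12], [13, 14]]
```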

>I can't think of a functional language equivalent to this but there
>might
>be.  Where might you use this?

This refers to what I called the "collection of functions". This
situation admittedly does not occur as often as nesting or the other two
but does pop up now and then in certain situations, but I can't think of
a simple example that would be likely to crop up in the normal run of
things. Consider it a rounding out of the possible combinations of
function and argument collection.

>APL did too, but it hasn't been copied very often.  It's generally
>considered to be not very readable.

Not being familiar with APL I can't really comment on that but I can say
something about readability. I think readability is largely dependent on
what language the user is most experienced at using. I have used
AutoLISP extensively in my CAD programming and I find languages like
PERL or Visual Basic very cluttered looking. I can glance at a LISP
program and see the structure just by the grouping of the parentheses. I
have seen other people complain about being lost in parentheses but I
find them enormously comforting in that they clearly delineate what a
function is operating upon. I had this same feeling using Mathematica. I
guess it's a matter of taste. Readability also depends on the function
set available in the language. If the language has a large set
of well chosen functions then one can write very concise code; if the
language has a small function set the user is forced to write all kinds
of nonsense that he shouldn't have to bother with. I feel a good
consistent syntax coupled with a good function set is the key to
readability.

>As you've probably read, we're looking at some sort of
>syntax-independent
>language.  Hence you could write your own syntax like this and be
>largely
>independent of those who don't make the same syntax tradeoffs as you.
>There's a fair amount of work to be done in this area though.  I imagine
>you'll be interested in designing special syntactic support for the
>underlying mathematical functions to support what you've detailed in
>this
>message.

If you can write a syntax-independent language that the user can modify
to his own personal syntax, that would definitely be the solution to
syntax disputes. I can't think of anything better than allowing each
user to write in the format that is most comfortable for them.

>Have you looked at functional languages before?  If not, definitely do
>so, there's a lot you'd be interested in in them

I have looked at PROLOG and like quite a bit what I have seen there.
There are so many languages out there that I have tried to pick those
languages which best represent each class of approach so that I can gain
some feel for what general methods have been tried. I certainly make no
claim to being totally up to date on what is going on out there. I think
my outlook is like that of most people: each language seems to have at
least one thing that it does better than the others and at least one
thing that it doesn't.

At this point I would like to throw out some more comments on basic
arithmetic operators and some unique things that we might do with them.
For the discussion I use the following function names:

			a,A addition/subtraction
			b,B multiplication/division
			c,C power/root
			d,D log/antilog

The function names are short because they are common operations. Upper
case function names help designate operations that are inverses of the
lower case ones and the letter progression is coincident with the
hierarchy of how one operation is an expansion of the preceding one. Any
of these operations may have multiple inputs, i.e. {c,2,3,4} is the same
as {c,{c,2,3},4} and so on. A notion that I would like to introduce
is the use of default arguments to these functions if certain arguments
are missing. For instance:

Addition: If the addition function has one argument it is assumed that
we are adding 1 to it. {a,x} is the same as x+1.

Subtraction: If the subtraction function has one argument it is assumed
that we are decrementing the value by 1. {A,x} is the same as x-1.

Multiplication: If the function has a single argument it is assumed that
you are multiplying it by -1, and if the 2nd argument is missing it is
assumed that you are multiplying by 2. Thus {b,x} = -x  and  {b,x,} = 2x.
Thus the changing of sign and the often occurring doubling of something
are taken care of.

Division: If the function has a single argument it is assumed that you
are taking the reciprocal of it, and if the 2nd argument is missing it
is assumed that you are dividing by 2. Thus {B,x} = 1/x  and  {B,x,} =
x/2. Very very handy. One also notes that if x is a matrix then {B,x}
will return the matrix inverse.

Power: If the function is missing the first argument it assumes it is
the base of Naperian logarithms (e), and if there is a single argument
the exponent is assumed to be 2. Thus {c,,x} = Exp(x)  and {c,x} = x^2.
No need for an Exp function and squaring is easily indicated.

Root: If there is only one argument to the function then a square root
is assumed. Thus {C,x} = x^(1/2). I could think of no particular choice
as a default for the number being taken a root of, perhaps pi?

Logarithm: If there is only one argument then it is assumed that you are
taking the natural log of the argument; if the first argument is blank
it is assumed to be the base 10 or common log. Thus {d,x} = ln x and
{d,,x} = log10(x).

Antilog: Same defaults as the Log function.
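The defaults listed above can be sketched in Python roughly as follows. This is only an illustration for scalars: MISSING is a hypothetical marker for a blank argument slot such as the one in {b,x,}, the root function handles only the one-argument form (I have not fixed the argument order for the explicit form here), and the matrix inverse case of {B,x} is not handled.

```python
import math
import operator
from functools import reduce

MISSING = None  # hypothetical marker for a blank slot, as in {b,x,}

def a(*args):
    """Addition: {a,x} = x + 1, otherwise the sum of the arguments."""
    return args[0] + 1 if len(args) == 1 else reduce(operator.add, args)

def A(*args):
    """Subtraction: {A,x} = x - 1, otherwise subtract left to right."""
    return args[0] - 1 if len(args) == 1 else reduce(operator.sub, args)

def b(*args):
    """Multiplication: {b,x} = -x and {b,x,} = 2x."""
    if len(args) == 1:
        return -args[0]
    first, *rest = args
    return reduce(operator.mul, [2 if v is MISSING else v for v in rest], first)

def B(*args):
    """Division: {B,x} = 1/x and {B,x,} = x/2."""
    if len(args) == 1:
        return 1 / args[0]
    first, *rest = args
    return reduce(operator.truediv, [2 if v is MISSING else v for v in rest], first)

def c(*args):
    """Power: {c,x} = x^2 and {c,,x} = e^x."""
    if len(args) == 1:
        return args[0] ** 2
    base, *exponents = args
    return reduce(operator.pow, exponents, math.e if base is MISSING else base)

def C(x):
    """Root: {C,x} is the square root of x."""
    return x ** 0.5

def d(*args):
    """Logarithm: {d,x} = ln x, {d,,x} = log10 x, {d,base,x} otherwise."""
    if len(args) == 1:
        return math.log(args[0])
    base, x = args
    return math.log10(x) if base is MISSING else math.log(x, base)

def D(*args):
    """Antilog: {D,x} = e^x, {D,,x} = 10^x, {D,base,x} otherwise."""
    if len(args) == 1:
        return math.exp(args[0])
    base, x = args
    return (10 if base is MISSING else base) ** x

print(a(5), A(5), b(3), b(3, MISSING))  # 6 4 -3 6
print(B(4), B(4, MISSING))              # 0.25 2.0
print(c(3), c(2, 3, 4))                 # 9 4096
```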

It should be clear that these changes are very simple to make and exist
in no language that I can think of. We eliminate the need for several
functions and make many very common arithmetical procedures very simple.

Matthew questions the utility of writing a real number in the form
{r,x,y}. Is there any advantage? Perhaps only in having a completely
rigid format. Doing it this way means that any operation which can
operate upon a set can operate upon a real number too if we wish.
Another reason would be that we are using the period to denote
unordered elements in collections and wish to avoid using the same
symbol for another purpose. Ideally, one function name or symbol should
be used only once. By writing all operations in this way the user is
never bothered with learning rules of precedence or of context.
Precedence is only necessary in languages where the language creator
uses function names to separate arguments rather than using brackets and
commas. Why have precedence rules when you don't need them?
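To show why no precedence rules are needed, here is a small Python sketch of an evaluator for the bracketed form. It is deliberately limited: it assumes single-letter function names, non-negative integer literals, ordered (comma) arguments only, and none of the default-argument rules; the names OPS, evaluate and _parse are hypothetical.

```python
import operator
from functools import reduce

# Only the four basic operations, using the letters proposed above.
OPS = {"a": operator.add, "A": operator.sub,
       "b": operator.mul, "B": operator.truediv}

def evaluate(text):
    """Evaluate an expression such as '{a,{b,2,3},4}'."""
    value, pos = _parse(text, 0)
    if pos != len(text):
        raise ValueError("unexpected trailing text")
    return value

def _parse(text, pos):
    """Parse one expression starting at pos; return (value, next position)."""
    if text[pos] == "{":
        name = text[pos + 1]
        pos += 2
        args = []
        while text[pos] == ",":
            arg, pos = _parse(text, pos + 1)
            args.append(arg)
        if text[pos] != "}":
            raise ValueError("expected '}'")
        # The brackets alone decide what is applied to what.
        return reduce(OPS[name], args), pos + 1
    end = pos
    while end < len(text) and text[end].isdigit():
        end += 1
    if end == pos:
        raise ValueError("expected a number")
    return int(text[pos:end]), end

print(evaluate("{a,{b,2,3},4}"))   # 10
print(evaluate("{A,10,{a,2,3}}"))  # 5
```

The parser never has to compare operators against each other; each closing bracket says exactly where a function's arguments end.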

Well, that's all for now. I have some things to say about looping
structures and if-then structures but I'll get to that later.

Jeremy Dunn