Language Syntax Suggestion

Jeremy Dunn jeremydunn@ibm.net
Mon, 01 Feb 1999 10:22:58 -0800


I've come in late in the discussion but I have worked my way through
past messages on the project and my head feels like it's going to blow
up with all the commentary to absorb. Fortunately it appears that the
conversation hasn't yet covered some of the things I wish to bring to
the table - which is good.

My interest in computer languages revolves around the symbolism that one
actually has to type in to write the desired code. Every language I have
run across suffers from syntax inconsistency. What do I mean? A language
like C has many functions in the form func(arg1,arg2...argN) but then
will allow you to write 2+3+.. , an expression of the form arg1 func
arg2 func...argN. To this I say MAKE UP YOUR MIND! Language writers seem
to be pretty loose about creating mutually contradictory ways of writing
things. It is my belief that we must pick a function syntax and stick
with it without any exception in the language; this makes for simpler
compilers and reduces the mental overhead that the programmer must keep
track of to write code. Rather than go on and on about how bad every
language in the universe is I will simply throw out my proposed general
syntax and give some examples of how I think it is superior to what is
currently being done in most languages.

Programmers use the terms set, array and list interchangeably but I will
use the term SET in my discussion out of respect for the science of
mathematics, which predates computers. I denote a general set of items
of any type as [,a,b,...,n] where the square brackets indicate a general
set. We note that the elements of the set are separated from each other
by commas; we also note that the first element has a preceding comma.
What is that all about? A comma preceding an element of a set denotes
that the element which follows it is an ORDERED element, i.e. it must be
read in the left-to-right order in which it appears. If we precede a set
element with a period as in [.a.b] then we are indicating that the
elements are UNORDERED. So the set [.a.b] is equivalent to either of the
sets [,a,b] or [,b,a]. It is clear that a set with unordered elements
must have AT LEAST 2 unordered elements; it makes no sense to write
[.a].
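
To make the ordered/unordered distinction concrete, here is a minimal
Python sketch. The helper names (ordered, unordered, matches) are
hypothetical and not part of the proposal, and modelling an unordered
set as a multiset is an assumption, since the text above does not say
how repeated elements should behave.

```python
from collections import Counter

def ordered(*items):
    """Model of [,a,b,...]: position matters, so a tuple is used."""
    return tuple(items)

def unordered(*items):
    """Model of [.a.b...]: arrangement is ignored (assumed: a multiset)."""
    if len(items) < 2:
        # The proposal says [.a] makes no sense.
        raise ValueError("an unordered set needs at least 2 elements")
    return Counter(items)

def matches(unordered_set, ordered_set):
    """True if ordered_set is one arrangement of unordered_set."""
    return unordered_set == Counter(ordered_set)

# [.a.b] stands for either [,a,b] or [,b,a] ...
print(matches(unordered("a", "b"), ordered("a", "b")))
print(matches(unordered("a", "b"), ordered("b", "a")))
# ... while the two ordered sets remain distinct from each other.
print(ordered("a", "b") == ordered("b", "a"))
```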

A FUNCTION is a set [funcname,arg1,arg2...,argN] in which the first
element (the function name) does not have a preceding comma. The lack of
comma enables our parser to find the function name if one exists. We
have two other kinds of brackets to denote two other special kinds of
sets. We use the left and right parentheses to bound a STRING SET i.e.
the string "house" would be written as (house). We may also write this
as (,h,o,u,s,e) if we desire. If there are no commas in the string set
then they are assumed by the program. Strings are a whole separate issue
and I do not propose to get into their syntax in detail in this first
email except to indicate that they are a set with elements of a
particular type and are treated syntactically like any other set.

The final type of set I call a HOLOR SET. Holor is a term introduced by
Parry Moon and Domina Spencer in their book "Theory Of Holors".
Basically a holor can be thought of as an nth-order matrix. Integers,
real numbers, vectors, complex numbers and matrices are all holors. A
holor set is bounded by the curly brackets { and }. Integers can be
written normally as in 243 or as {243} if we wish to indicate their
general holor nature. A real number like 2.34e24 would take the form
{r,234,25} where "r" indicates a real number function that takes the
integer 234, attaches a decimal point to the front to get .234 and then
multiplies it by ten to the 25th power. The number 0.234 would simply be
written as {r,234} where the power takes the value of 0 if it is
omitted. The brackets of a function set are of the type of the arguments
which it takes, and all the arguments that the function acts upon must
be of the same type. If the arguments are not strings or numbers or are
of more than one type then square brackets must be used. A complex
number a+bi is written as {,a,b}. A 2x2 matrix would be written as
{{,a,b}{,c,d}}. We note that we do not need to put a comma between
arguments that are bracketed unless we wish to indicate that they are
unordered; this saves us from having to write {,{,a,b},{,c,d}}.
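
The "r" function described above can be sketched in Python as follows.
This is only an illustration under stated assumptions: the function is
modelled with exact fractions, the digits argument is assumed to be a
non-negative integer, and the power defaults to 0 when omitted, as the
text says.

```python
from fractions import Fraction

def r(digits, power=0):
    """Model of {r,digits,power}.

    Put a decimal point in front of `digits` (234 becomes .234), then
    multiply by ten to the given power. Assumes digits >= 0.
    """
    mantissa = Fraction(digits, 10 ** len(str(digits)))
    return mantissa * Fraction(10) ** power

# {r,234,25} is .234 * 10^25, i.e. 2.34e24
print(r(234, 25) == 234 * 10**22)
# {r,234} is 0.234 because the omitted power is taken as 0
print(r(234) == Fraction(234, 1000))
```

One consequence of this reading is that a value such as 0.0234 has to be
written with a negative power, e.g. {r,234,-1}, since the integer 234
cannot carry a leading zero.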

In order to further describe function syntax I must introduce the basic
arithmetic function symbology so that I can write examples. The basic
arithmetic notation is:

                x+y     is written as   {a,x,y}
                x-y     is written as   {A,x,y}
                xy      is written as   {b,x,y}
                x/y     is written as   {B,x,y}

REVERSE ITERATION

An expression such as s-t-u-v can be written as {A,s,t,u,v} where this
is the same as writing (((s-t)-u)-v). This successive application of the
function from left to right is the standard form of iteration that
occurs. We may also write the previous expression as {A;s;t;u;v} where
the semicolon indicates that we are performing REVERSE iteration on the
arguments. The expression {A;s;t;u;v} is the same as (s-(t-(u-v))).
Using reverse iteration allows us to easily make backward constructs
such as continued fractions without any difficulty.
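
The two iteration orders correspond to a left fold and a right fold. A
minimal Python sketch follows; the names iterate and reverse_iterate are
hypothetical, and the continued-fraction step function at the end is an
illustrative assumption rather than notation from this proposal.

```python
from fractions import Fraction
from functools import reduce
import operator

def iterate(func, args):
    """{A,s,t,u,v}: apply func left to right, (((s-t)-u)-v)."""
    return reduce(func, args)

def reverse_iterate(func, args):
    """{A;s;t;u;v}: apply func right to left, (s-(t-(u-v)))."""
    args = list(args)
    # Start from the last argument and fold the rest in from the right.
    return reduce(lambda acc, x: func(x, acc), reversed(args[:-1]), args[-1])

print(iterate(operator.sub, [10, 4, 3, 2]))          # ((10-4)-3)-2 = 1
print(reverse_iterate(operator.sub, [10, 4, 3, 2]))  # 10-(4-(3-2)) = 7

# A continued fraction 1 + 1/(2 + 1/(2 + 1/2)) built by reverse iteration.
def step(x, y):
    return x + Fraction(1) / y

print(reverse_iterate(step, [1, 2, 2, 2]))           # 17/12
```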

FUNCTION NESTING

Suppose we have an expression of the form [Z,[Y,[X,p]]] where three
functions X, Y and Z are being applied successively to the argument p.
We are allowed to rewrite this as [Z'Y'X,p] where the apostrophe indicates
the nesting of the functions. This provides a clearer picture of what is
going on.
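
Function nesting in this sense is ordinary function composition. As a
rough Python sketch (nest and the three sample functions X, Y, Z are
hypothetical stand-ins, chosen only to show the order of application):

```python
from functools import reduce

def nest(*funcs):
    """[Z'Y'X,p]: apply the rightmost function first, then work leftward."""
    def nested(p):
        return reduce(lambda value, f: f(value), reversed(funcs), p)
    return nested

# Sample functions, assumed purely for illustration.
def X(p): return p + 1
def Y(p): return p * 2
def Z(p): return p - 3

# [Z'Y'X,5] gives the same value as [Z,[Y,[X,5]]]
print(nest(Z, Y, X)(5) == Z(Y(X(5))))
```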

COLLECTION OF ARGUMENTS

If we have an expression of the form [b,[x,p][x,q][x,r]] we note that
the same function "x" is being applied to three different arguments p,q
and r. We can rewrite the expression as [b,[x,p'q'r]] where the
apostrophe is now indicating that "x" is to be applied successively to
p,q and r.

COLLECTION OF FUNCTIONS

If we have an expression of the form [b,[x,p][y,p][z,p]] we now have a
situation where the argument is the same but the functions are
different. We may write this as [b,[x^y^z,p]] where the ^ indicates the
collecting of functions upon an argument.
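
The two collection shorthands can be pictured as expanding into several
inner calls before the outer function is applied. The Python sketch
below is an assumption about the intended meaning, with hypothetical
helper names; the outer function b is taken to accept any number of
arguments.

```python
def collect_arguments(b, x, *args):
    """[b,[x,p'q'r]] expands to [b,[x,p][x,q][x,r]]."""
    return b(*(x(arg) for arg in args))

def collect_functions(b, funcs, p):
    """[b,[x^y^z,p]] expands to [b,[x,p][y,p][z,p]]."""
    return b(*(f(p) for f in funcs))

# Sample outer and inner functions, assumed purely for illustration.
def total(*values):
    return sum(values)

def square(p): return p * p
def double(p): return p * 2
def succ(p): return p + 1

# One function over three arguments: 1*1 + 2*2 + 3*3
print(collect_arguments(total, square, 1, 2, 3))
# Three functions over one argument: 4*4 + 4*2 + (4+1)
print(collect_functions(total, [square, double, succ], 4))
```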

These three notational shorthands result in a vastly reduced need for
bracketing and increase the readability of the expression. One notes
that we may have 2nd level and 3rd level collecting of terms by
repeating the symbols to indicate the order of compression: a ^^
takes priority over a ^, and a '' takes priority over a '. An example of
this would be the expression [a,[b,x^y^z''X^Y^Z,p]] which would
represent the full expression
[a,[b,[x,p][y,p][z,p]][b,[X,p][Y,p][Z,p]]].

I am sure that at first glance this notation must seem pretty alien
but it has the virtue of being consistent in its methodology and results
in extremely compact expressions when one gets the hang of it. It seems
logical to me to treat everything as a set of elements; this has the
advantage that any function that we devise to operate on sets will
operate upon a set of ANY type of item. For instance, suppose we denote
the union of set X and set Y as Jn[,X,Y]. The expression
Jn[(house)(boat)] would perform string concatenation. We could also
write Jn[,354,68] and get 35468 as the result; we get GENERAL
concatenation rather than a specific form of concatenation. This kind of
generality eliminates the need for the programmer to remember several
functions which are really performing the same fundamental operation.
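
A small Python sketch of this kind of general concatenation is given
below. The function name jn mirrors the Jn above, but the behaviour is
an assumption: both arguments must be of the same type, integers are
assumed non-negative and are joined digit by digit, and any other
sequence type is joined with its own concatenation.

```python
def jn(x, y):
    """General concatenation of two values of the same type."""
    if type(x) is not type(y):
        raise TypeError("both arguments must be of the same type")
    if isinstance(x, int):
        # Treat an integer as its sequence of digits (assumes x, y >= 0).
        return int(str(x) + str(y))
    # Strings, lists and tuples already concatenate with +.
    return x + y

print(jn("house", "boat"))  # houseboat
print(jn(354, 68))          # 35468
print(jn([1, 2], [3]))      # [1, 2, 3]
```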

I only briefly touched upon the arithmetic operators, but suffice to say
that these operations should be overloaded to be operations upon general
matrices so that one has the complete complement of complex addition,
vector addition etc. all subsumed within the function. Our programmer
should not have to be a mathematical wizard and create a matrix multiply
so that he can do something that should have been provided for him. Many
languages have terribly inadequate math functions.
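
As one possible picture of a single addition function that covers
numbers, vectors and matrices, consider the Python sketch below. It is
an assumption for illustration only: holors are modelled as nested
lists, a complex number {,a,b} is modelled as the pair [a, b], and
holor_add is a hypothetical name for the "a" function.

```python
def holor_add(x, y):
    """Add two holors of the same shape, element by element."""
    x_is_seq = isinstance(x, (list, tuple))
    y_is_seq = isinstance(y, (list, tuple))
    if x_is_seq and y_is_seq:
        if len(x) != len(y):
            raise ValueError("holors must have the same shape")
        return [holor_add(a, b) for a, b in zip(x, y)]
    if x_is_seq or y_is_seq:
        raise ValueError("holors must have the same shape")
    return x + y

print(holor_add(2, 3))                                    # plain numbers
print(holor_add([1, 2], [3, 4]))                          # vectors or complex pairs
print(holor_add([[1, 2], [3, 4]], [[10, 20], [30, 40]]))  # 2x2 matrices
```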

I think I will cut off at this point; I don't wish to go into details on
individual functions that I desire until I get some kind of sense back
from the group as to whether they like or despise any of this. To my
mind the only point of writing another language is really to develop a
better syntax or function set because it is primarily those areas which
are the point of aggravation to most people.

Yours,

Jeremy Dunn