Lists, Tables, Psets (Was: Joy, Dolphin, ...)

iepos@tunes.org iepos@tunes.org
Tue, 28 Mar 2000 10:01:03 -0800 (PST)


> > The use of a stack is
> > quite fundamental to our approach; it is the way in which parameters
> > are both passed and returned... there are some primitives that directly
> > manipulate things on the stack, such as
> > 
> >   swap: swaps the top two items on the stack
> >   dup : makes a copy of the top item, leaving two identical
> >         things on top of the stack, one beneath the other
> >   pop : destroys the top thing on the stack
> 
> Is that all that is needed?  

No. However, if you have "dup" and "pop" in addition to these
three more primitives below, then you have all the combinators you need:

  i:    takes a program off the stack and executes it
  dip:  takes a program off the stack and executes it, but first
        saves the next stack item, so that the program cannot touch it,
        which is restored after the program completes.
  cons: takes a program off the stack, and a stack item beneath it,
        and yields a new program that is like the original except
        that when it is executed the stack item will be pushed
        first.

There are various other ways of getting completeness using fewer
primitives; Billy and I discussed this a bit a while ago
(see my post "Combinatory Completeness in Joy" and his follow-ups).

One thing that I didn't mention earlier is the technique of pushing
programs onto the stack; in Joy, if "foo" is a program, then "[foo]"
is a program that pushes the program "foo" onto the stack.
Programs are often pushed on the stack for control purposes;
for instance, to run the program "foo" over and over again in a Joy-like
system, one might do

  [foo] loop

although actually this can be done in a weird way without using "loop":

  [[foo] dip dup i] dup i

With this in mind, "loop" itself could be constructed as:

  [dip dup i] cons dup i

Quotations are also often used for purposes of branching. For instance,
one might do

  3 4 <

which is a program that pushes a proposition onto the stack, indicating
whether 3 is less than four; then, if one wanted to branch on this
condition, one could do

  3 4 < ["it is false" put] ["it is true" put] ifte

where "ifte" is a program that takes two alternatives off the stack,
and a proposition beneath them, and executes one of the alternatives,
depending on what the proposition is (and pops the other alternative off).
Exactly how this might be done might vary among Joy-like systems,
but this is the basic idea...

> I would have imagined the need for
> swap(i,j) that swaps the ith and jth element on the stack.  

All such swapping combinators could be constructed from the five
primitives above. Two other series of combinators of significance are
the "dig" and "bury" series of combinators; "dig" is a series of
combinators each of which digs up an item from a certain depth,
and moves it to the top of the stack. "bury" is the opposite of "dig",
and takes an item from the top of the stack, and buries it so many
items deep. Using these "dig" and "bury", your "swap(i,j)" could be
constructed. 

Anyhow, all "dig" and "bury" combinators can be constructed from
"cons" and "dip":

  dig1 == [] cons dip
  dig2 == [] cons cons dip
  dig3 == [] cons cons cons dip
  dig4 == [] cons cons cons cons dip
  ...

  bury1 == [] cons dip
  bury2 == [[] cons dip] cons dip
  bury3 == [[[] cons dip] cons dip] cons dip
  bury4 == [[[[] cons dip] cons dip] cons dip] cons dip
  ...

> Given an
> interface for a function (the values, and order. expected on the stack)
> not just anything could be computed.
> 
> E.G.	f(x,y,z)=x*y+z
> 	push x
> 	push y
> 	push z
> 	f()
> 	
> Notice that f() can not perform this simple operation with the values on
> the stack in that order.  

Whoa... sure it can; such an "f" can be constructed as

  [*] dip +

The "[*] dip" part multiplies the "x" and "y" which are buried under the
"z", and then the "+" parts adds the result with "z". But, the fact
that "dip" has to be used indicates that you probably keeping the
parameters in an unnatural order. The function that might be written as

  f(x,y,z) = z*y + x

would be easier to construct, and could be constructed as just

  * +

In general, it is most natural for parameters to be used in the
reverse order that they were pushed.

> Parameter order is connected to the operation of the function. 

Indeed, in the kind of system I'm thinking of, parameter order
is very important for all kinds of purposes.

> This problem could be solved by inlining almost
> everything, leaving the computer to rearrange the parameters. 

Incidentally, I like the idea of inlining (almost) everything,
but mainly for the purpose that it eliminates a need for
POINTERS, which I very much do not like.

> Alternatively the parameters can be put into an object, and just the
> object reference is the stack, but that is a pset.

That technique doesn't really appeal to me, I guess...
It just sounds complicated.

> I have the same problem with stacks that I do with lists. 

Yes, lists are very similar to stacks...

> Intent can be lost, or artificial intent could be added.  

I guess I just don't see this problem. 
I still like the idea of information being conveyed in the order,
so that names like "Age" and "Spouse" do not have to be used all
the time.

> > You clearly like the technique of using parameterized sets; but,
> > may I point out a pair of drawbacks of the approach:
> > 
> >  1) it necessitates the use of words like "Name", "Age", and "Spouse"
> >     which were not necessary in the original approach (or in the
> >     stack-based approach).
> > 
> >  2) it usually necessitates larger code; for instance, compare the
> >     size of
> > 
> >        a b c MyFunction
> > 
> >     with the size of
> > 
> >        MyFunction(Name=a, Age=b, Spouse=c)
> > 
> >     The parameterized-set approach requires larger code. There may be
> >     exceptions to this when your default values come into play,
> >     though.
>
> The former method is shorter than the latter, but only in human text. 
> Your stack-based system would do the following
> 
> 	push a
> 	push b
> 	push c
> 	MyFunction

Whoa... there is no need for "push". Just do "a b c MyFunction".
This is not just human text; this how one would actually do
it in a programming system. See the Joy system for an example.

> Where psets would be used to populate the function parameters.
> 	
> 	Name=a
> 	Age=b
> 	Spouse=c
> 	MyFunction
> 
> Considering every token as an INT, and appropriate data structures to
> hold each line, we can see that they are the same length for a machine
> to represent.  

They are not the same length. The psets approach requires significantly
larger code, at least for the programmer externally, and probably
also for the computer internally. From the syntax you just gave,
it looks like you'd have to use four lines for the example while
it could be all written in one line in a Joy-like system as
"a b c MyFunction". But, I expect you could use a semicolon or
some other seperator in your system to write it all on one line,
but it would still be significantly larger.

> Anyway, the size of code should be on no concern.   

Whoa... the size of code is of great concern to me.
The smaller the better, provided that other important things are not
sacrificed.

But, perhaps there are things you gain with the psets that compensate
for the larger code.

> > With Joy, the example is
> > reduced to its pure stuff, with fluff like "file", "=", "(", ")",
> > and ";" discarded.
> 
> On one hand you speak as if the source code models implementation, and
> on the other you speak of fluff.  I would like to propose that ALL source
> code is fluff.  

I don't think I buy that philosophy...
Source code seems quite an important part of a system to me.

> Source code should be a clear description of intent, to
> be interpreted and compiled by the machine for optimum performance. 
> What is optimum for a stack machine is not optimum for a fixed register
> processor.

This I can accept, I guess. But, remember that in saying that the
Joy-like approach uses a stack, I do not mean that it need only
run well on stack machines; it could be made to run well on machines
with fixed registers also. Perhaps it would help to think of the stack
as an abstract thing, as you would psets in your system, rather than
as a low level thing, modelling the processor's low-level stack.

> > Hmm... Right now, I don't really share that enthusiasm about psets
> > (I'm not sure exactly what is meant by "table") .
> 
> A table as defined in standard DB theory.  It also goes by the slightly
> misleading name "relation".

Hmm... I guess I know little of DB theory... 

> > Things like them could probably be constructed in a Joy-like system
> > (as non-primitives), but I don't really see the need or use in taking
> > them as primitive.
> 
> List and Stack and table/pset paradigms are equivalent in terms of
> computational ability.  It is a matter of preference.

This is probably true. It's a matter of preference.

One reason one might prefer your pset approach is the independence
of order, so that programmers do not have to remember the order
parameters are taken in.

But, a couple reasons why one might prefer the stack-based approach:

  1) It is closer to the hardware that most machines use, which
     makes it easier to compile efficiently (although the pset
     approach could also be made to run efficiently, if nothing
     else by translating the problem into a stack-based one)

  2) Code size (externally and internally) tends to be smaller.

The second one is most significant to me, although the first is
a bonus. I like the way in which a stack-based system allows one
to write very concise code.

> My main point is that the use of ordered structures to encode meaning
> seems wrong to me. 

Interesting... but, I'd still have to say that it does not seem wrong
to me. Actually, it seems very right to me, as it allows programmers
to express things in smaller ways than they would be able to
otherwise.

Hmm... and what about the case of arithmetic operators like +, -, *, and / ?
It seems like you'd have to make an exception for them and allow
the ordering to be significant, or else the code will be really
bloated... for instance, in your kind of system how would you
write "(2+3) * (4+5)" ? In a Joy-like system, it would be written

  2 3 + 4 5 + *

> ----------------------------------------------------------------------
> Kyle Lahnakoski                                  Arcavia Software Ltd.
> (416) 892-7784                                 http://www.arcavia.com
> 

Thanks for your comments...

- "iepos" (Brent Kerby)