Joy, Dolphin, Memory Management (was: Joy, FORTH, ...)
Mon, 27 Mar 2000 12:42:09 -0800 (PST)

> >Hmm... maybe a quote from the manual will help
> >(from "Atomic Programs of Joy"):
> >  A value of the list type is just a special case of a quotation in
> >  which the elements are themselves literals.
> I see what you mean.  Yes, this is an oddity of Joy.  (And it causes me some
> confusion -- do you know what Joy does when you type "[+ 3] uncons"?)  

Hmm... well, there's one way to find out...

A test shows that you'll wind up with two things on the stack,
"[3]" and "+"; this is confusing... It doesn't make sense that "+"
would be on the stack; there doesn't seem to be any other way to get
it there; I would have guessed that it would have
executed "+", expecting two more numbers on the stack.

Anyhow, I try to avoid "uncons"; to me, it seems similar to "undo".

> The fact that quotations can be manipulated as lists isn't too hard
> to handle; the problem is that lists are sometimes actually
> quotations (as above), NOT true lists.

Indeed, "[+ 3]" does not fall into the "list type" as the author of Joy
defined it; this is because the elements of the quotation are not all
literals; specifically, "+" is not a literal. To be a literal, it
would have to be a primitive program that has the sole effect of
pushing one thing onto the stack, but "+" is not this way...

> I'm currently compiling directly to the dictionary; I never write to the
> disk.  It looks like the result will go into Dolphin as part of the kernal
> utilities (and possibly as part of the kernel itself, the VM), so I'll never
> write code deliberately to the disk (although the user may do this if they
> wish).

Ahh... looking at Dolphin, I guess you're writing this in C and
maybe some assembly. By the way, in C, if you know you're dealing with an
x86 machine, is it safe to do something like this? :


where "genprogram" is some program that allocates and fills an array
of bytes with x86 code and returns it (cast to a void function pointer)...
I guess if it isn't safe, then you could just use some assembly to
interface with your code generator...

> Yes, it's x86, which makes the full register optimization a lot of pain for
> very little gain.  Fortunately, the guy who's doing the optimizer claims
> that it's very easy -- he's produced three versions in four days.


> I plan to implement quotations as arrays of pointers to word-structures (the
> structures contain the name, code, and optimization info).  Anything that's
> not quoted gets compiled; compilation inlines certain primitives and always
> generates a "CALL" instruction for everything else.  The stack is cached in
> as many registers as possible (either that, or I'll cache TOS only in AX and
> not cache anything else -- I have to compare).  Tail calls are converted to
> jumps whenever possible.

Interesting... One question... how are you planning on implementing lists?
I'm guessing you're going to supply special primitives for lists,
and distinguish them from quoted programs (unlike Joy)... But then,
the question is: are you going to keep the lists all in one piece,
or are you going to split them up, and link the pieces with pointers?
This seems like quite an important thing, as lists I'm guessing
will play quite an important role in the system, and their efficiency
will largely determine the efficiency of the whole system...

To me, it seems like a bad idea to me to always implement lists as
"linked lists", whereby one wastes at least half the memory and
makes random access very slow. In particular, imagine writing a
graphics engine for a game in such a system...

Yet the opposing approach, always keeping lists in one piece, also
seems to have its problems. In particular, inserting or deleting things in
such a list would be very expensive, as it would require a large memmove
one way or the other. I guess there are other options for lists, too...
for instance, one could use a binary tree, breaking the lists
into several pieces; with that method, you might be able to get
decent times for random access (especially if the leaf size is large)
and also decent times for insertion and deletion (especially if the
leaf size is small)... 

Another question is: will lists be kept directly on the stack, or
will only pointers to them (off in the heap somewhere) be kept?
The more I think about it, the more I like the idea of just keeping
everything directly on the stack; of course, a problem that arises
if one does this literally is that it becomes very expensive to
shuffle things around on the stack... I'm not sure how serious this
might be.

One solution to this might be to use some addressing scheme whereby
one could move things around without really moving them; one would
have a table, probably in the form of a binary tree, that maps
"virtual" addresses to physical ones. One would probably often cache the
base addresses in registers, such as "ebx" or "ebp", and take advantage
of the processor's features for memory indexing... Paging might be
used for this, although I'm guessing that pages would turn
out to be far too coarse-grained for most purposes... Segments might
be used, though, although there is a bad overhead in modifying
the base of a segment register due to the strange nature of x86
in which segment registers do not directly contain the base but
instead have to be indirectly specified in the
"Global Descriptor Table"... What really blows me away is that the
base has to be given in three different pieces in the descriptor; I
have no idea why they did it that way...

Also, it is interesting that you think of using "CALL"... I'm inclined
to always inline everything, but I guess this might not be practical
in some cases... but, recursion would at least would not be a special
problem, as it can always be replaced by non-recursive things...
If there are some large programs that are used so many times that
for storage (and caching) purposes one does not want to inline them,
then this seems like a special case of a more general problem
of things being copied so much that one wants to use copying-by-reference
rather than physical copying... if possible, it should be handled
by the same mechanism...

Anyhow, these are all rather raw thoughts... Probably the best way
to figure out the best way is to experiment...

There is one optimization that I've been thinking of that I think
might be quite important... Often before using a procedure, one
makes copies of the parameters if one still needs them, so that they
are not permanently gobbled up by the procedure... For instance,
if one has a big buffer of pixels on the stack (in the form of a list),
and one is getting ready to flush them to the screen using "refresh",
but one still wants the buffer of pixels, then one might do:

  [refresh] sip

or perhaps

  dup refresh

which is equivalent in this case, since "refresh" only uses one parameter.
Now, the point I wanted to get at is that it would be silly if the
system implemented this by physically making a copy of the pixels;
this is because "refresh" probably does not really need to write over
the pixels; it probably just reads them, incrementing the stack pointer
as it goes along. The pixels could probably be restored at the end
just by backing up the stack pointer. However, some care might need
to be taken to make this work; "refresh" might naturally use some of
the stack as a scratch space, effectively destroying previous pixels...
To make the technique work, one would have to make "refresh" use
somewhere else as scratch space (probably it would just use the original
top of the stack, behind the pixel data)...

A similar technique might be used to efficiently implement something
like this:

  [foo dup i] dup i

This should run "foo" over and over again... Yet, implemented in a
naive way, it will waste time making copies of "[foo dup i]", when
it could just hold onto it; this is because "i" is probably naturally
non-destructive; sort of like "refresh", it doesn't really need
to destroy the program it executes. Hopefully, in a real system,
the above program would be implemented using a backward jump, and
not any copying.

Anyhow, precisely how this kind of optimization might be carried out
is not quite clear to me...

> Except for the stack optimization and quotation, this is straight out of the
> book for Machine Forth.  (  The quotation
> unmistakably fits with the spirit of machine Forth; the optimization would
> fit only if it were a user word and not a default action (I have to think
> about that).

One other thing... it seems like you are thinking of keeping quotations
in uncompiled form at runtime (with a compiled form attatched to it,
perhaps)... It seems possible to me that one could always keep quotations
in compiled form, if one is willing to give up "uncons" and the other
introspecting primitives... The essential primitive then for forming
new programs at runtime is "cons", which takes a base program "p", and
another thing off the stack "x", and yields a modified program that is
like the base program except that when run, the "x" is pushed before
"p" is run... It seems conceivable that "cons" could be implemented
dealing only with compiled programs, although if one was going to
use such a constructed program many many times, then it would pay
off to keep an uncompiled version available so one could optimize it;
however, if one was going to use it, say, one time, then it would
pay off not to waste time optimizing it.

> >but there is something lame about doing it that way... Anyway,
> >I just realized that I'm talking about these variable-stack words
> >now, which you really dislike, so I'd better stop, I guess :-)
> No, all the words you implemented here are constant stack effect.

Oh, that's true... But, if I gone one step further and mentioned a
"dign" or "buryn" that took an integer off the stack telling how
deep to go, _then_ I would have been talking about words with
variable stack effects :-)

> >- "iepos" (Brent Kerby)
> -Billy

- "iepos" (Brent Kerby)