types and operators

Nathan Hawkins utsl@one.net
Sat, 27 Apr 1996 08:08:26 -0400 (EDT)


On Fri, 26 Apr 1996, Francois-Rene Rideau wrote:

> Let's sum up things from my last post:
> 
>    We agree that the GC should primarily know about
> * one kind of pointer, perhaps used to encode small integers, too.
> * two kind of memory blocks, those made of pointers,
>  and those made of raw data, the each block having a known size.
> 
> We can add a lot of refinements later, but these are the most important.
> Strings, buffers, long integers, whatever, will be raw data to the GC.
>    This may mean that mixing pointers and raw data require an additional
> indirection, unless we have mixed blocks, but then, it means additional
> GC code (not too important, but superfluous as for an initial release),
> and an additional size tag (much more inconvenient) for these blocks;
> we could have several kinds of blocks, too.
> 
>    I think we agree on these points. Where there are choices left,
> our rule has always been "implementor chooses". So if you want your
> choice against someone else, you'd better implement it first :)

Ok. Well, I've been fishing for ideas to familiarize myself with how to 
implement this stuff. I'm slowly getting some idea how I'd like to do it. 

> >> The reason I make the above restriction is as follows.  If words are
> >> to use a single stack as C does they can be implemented as allocating
> >> frames upon the stack.  The frame holding the funcions arguments, it's
> >> return locate, the address of the last words frame, the local
> >> variables, a buffer to receive results from functions, and most
> >> important a tag that tells what everthing is in the frame.
> 
> What you may gain at doing that, is that the implicit tagging of
> frames with the return address would automatically give the size
> of the frame. It might be an interesting thing to try, but to me,
> much too complicated to be worth implementing initially,
> and I'm not sure it is worth the complication afterwards.
> Could not even be used to GC C code, as it works for the arguments,
> but not for the local variables.

Would someone define stack frame for me?
I think of stack frames as what C does to the stack. You only need that if 
you have a single stack. But I was thinking we'd have two stacks per 
thread. The GC will only need to scan through these stacks for pointers. 
It will need to run on a special stack of its own I'd imagine, but that's 
no problem. (That's assuming we can do tell integers and pointers apart 
on the stack.)

> > Well, in Forth, we don't use local variables, anyway.
> ANS FORTH has at least two standard ways to do local variables,
> and non-ANS FORTH's have many more. Surely they are not overused, though.

Sorry. I was caught in slight time-warp there. I tend to think Forth-83 
more than ANS, since I started getting up to date on Forth only a couple 
months ago. But I've always tended to use all kinds of variables
sparingly in Forth, preferring to do things on the stack. :)
But you're right, they do have local variable packages now, but I have no 
idea how much people use them.

> >> Things only need to be orderly to present to the garbage collector
> >> when a function calls allocates memory from the heap or calls some
> >> function that might.  So in general at function call time.
> >
> > Maybe not even that often. We could probably get by with having the GC
> > run only at programmed points, which would allow us to simply make sure
> > that nothing significant is on the stack at those points. This is
> > consistant with the cooperative multitasking model Fare has spoken of.
> Yes. Any function that allocates a predictable amount of stack space
> could have this amount preallocated by the calling function,
> so that calling functions could be required to do so
> (remember: we're talking about LLL stuff; most non-optimized HLL stuff
> would indeed be completely framed to allow GC).
> Anyway, if we require the stack to be made of proper GC-able objects,
> framing is just a matter of "closing" the stack with its current length,
> and can be done automatically by the allocation routine that detects the
> need for a GC.

"closing" meaning storing the address and current length, then switching 
to a safe stack? That's what I was assuming we'd do. GC-able objects
means it can detect the difference between a pointer and an integer, right?
(The terminology you use confuses me sometimes.)

> >> I hope this is a framework you can live with, as better eludes me.
> > I think I'd rather put up with type munging. :( :( :(
> I think we *must* give types when there are,
> but not reject code when there aint a simple one.
> After all, that's the LLL,
> and there will be more than enough types in HLL.
>    Surely, if we can have both universally and existentially quantified types
> that depend on integers, and accept cases,
> all FORTH primitives can eventually be well-typed. But that's not a priority.
>    Now for utsl, remember that any frame that might be GC'ed must be clearly
> tagged to let the GC know things (notably size, etc). Because frames are
> already tagged with their execution address (well, at least the frame that
> pointed to them, and from which the GC reached them had such information),
> it might be an "optimization" to derive size information from execution
> address which this also means having a way to distinguish frames from
> normal blocks; but then, there is no absolute requirement from the derived
> size to depend *only* on the execution address; the derived size code could
> very well pick size info from the frame. Again, this really seems
> overly complicated. I'm all for having execution frames mostly standard
> frames.
>    Please also remember that this complication is not valid for
> code between GC, whose results are transient, and that will not last
> long enough for a safepoint to be required by our conventions.
> Only calls to "PAUSE" should be specially
> framed/framable, but then, this preclude growing the frame down afterwards,
> unless we copy the frame to the end of heap before,
> which can be all the more interesting for continuations etc.
> [Remember that I'd like all non-interrupt-driven tasks to share the same
> stack/heap/whatever space].

I think I understood that last paragraph, could you restate the other 
parts, defining some of your terms? I'm particularly confused by what you 
mean by a stack frame.

Re: sharing stack/heap/whatever space between all [...] tasks. Having a 
common address space is no problem if that's what you mean. Neither is 
sharing a heap, assuming that allocations are not interruptable. But how 
can you share a stack? This is impossible, as I understand it. Please 
enlighten me. (Unless you mean something other than using the native 
PUSH/POP instructions, and do something more complicated.)

>    As a side note, we still haven't decided how "return addresses"
> would be encoded. Will we require block-aligned pointers, and have
> "base+offset" encoding ? Will we allow "infix" code, and go through
> the hassle of identifying the beginning of a block each time ?
> The first may be the simplest to implement, and though it is inefficient
> at run-time, we don't care for the initial implementation. Let's do it
> and wrap it with enough macros so that the user doesn't see it.

Well, I would answer that with the following question: how hard would it 
be to indentify the beginning of a block each time?  Especially since the 
allocation routines could be coded keep track of blocks. For example, we 
might set up an ordered array of pointers, which can then be searched for 
the largest pointer less than the one we're looking for. This should 
identify the block, and could be done with a bsearch.

Infix pointers would be somewhat more trouble for the GC, but drasticly 
more efficient and convenient to use. Particularly when you consider that 
we'd have them on the return stack, and would have to convert the IP into 
a base+offset.

>    For a longer-term implementation, I propose the latter trick
> (allow infix pointers), with some page-based hashing:
> Objects on a same "page" will be required to be of same "GC-type",
> so by doing simple stuff with the high-bits of a pointer
> we can determine the action to take to see the beginning of a block
> by checking a page directory.

Hmm. Might as well call that a multiple heap implementation.
We could also just put at the beginning of the object, and have the type 
and size of the object stored there.

> The exact implementation should priviledge usual cases,
> testing them first, and add levels of indirection for unusual cases.
>    Objects of large size (several pages) would have their own entry.
> Small objects would be grouped on pages where all objects have the same size.

Buddy system? Never really like that one much...

> Page boundaries would be respected as much as can be to avoid multi-page
> entries. Objects of uncommon sizes would be found by following the
> linked or binary list of objects on the page (linked might be better
> if there are very few objects).
>    Most source and/or destination regions could require exact pointers,
> however, which would simplify things. Code for infix pointers would be
> called only when we're not sure. This would greatly speed up GC.

A better method would be to say that for each infix pointer, once we find 
the object referred to, we can determine the type of object from a flag 
stored at the beginning of the object itself. This would be fairly 
trivial to do, if we can do a decent algorithm for quickly discovering 
the block from an infix pointer. How fast can we do this?

> > You see, I'd like
> > to remain as close to Forth in design as is possible, but with the
> > needed extentions. (As well as dumping some minor stupidity in the Forth 
> > I/O words.) But I do want it be close enough that most ANS Forth would 
> > run, provided it obeys the rules of ANS Forth.
> Well, we should surely have ANS compatibility packages eventually,
> but direct compatibility is not an issue.

Well, I was hoping to keep the general nature of the language, for 
instance the RPN notation, dual stacks, : style definitions, some kind of 
threading, etc. I'd like to keep the basic operations as much as 
possible, particularly the stack operators, and stack behaviour.

So ANS Forth compatibility could just be a small vocabulary, mapping ANS 
Forth words into LLL words.

The real issue is the language model: I like the Forth virtual machine, 
especially the data/return stack part. This allows drasticly more freedom
for variable arguments and return values. Functions like printf, which 
are a real pain in C, become trivial, and we can even do similar things 
with return values. So the equivalent of scanf might be able to return 
multiple values, instead of needing to have pointers passed to it.

Stack frames break this model, AFAIK.

> > In general, I'd like to stay away from stack frames.
> Unhappily, they are just *required*, not everywhere, but at checkpoints.
> Maybe the easiest implementation is to require all data on stack to be
> of valid "pointer/integer" type at checkpoints, so that framing is only
> knowing the stack length. But even if we do that,
> I prefer we insert code to allow more constraining implementations,
> if we want to try that later.

Please explain. I was assuming that all that would be needed to work on a 
stack would be to have that annoying type distinction and a pointer/size
to a frozen (i.e. not in use) stack.

What do you mean by inserting code for more constraining implementations?
Could you give me an example of what would be inserted where?

> > Those are just my preferences, though.
> > We'll have to see what other people think.
> My idea seems to satisfy everyone,
> but still impose hard constraints on the LLL programmer.

The main constraint I see is the integer/pointer distinction being a 
nuisense. This is mainly a constraint on the primitives, and a major 
headache for the programmer who writes them.

> BTW, what grain should we require for PAUSEs ?

I would leave that an open question, or variable, since we don't know how 
often we will need it. Also, it might be nice to be able to let the HLL 
decide this for it's code, and let LLL programmers hand code this in. :-)

> We agreed that LLL code should PAUSE at least every 2^-n second,
> Where n should preferrably be one more than for the hardware timer.
> What should be n ? I think 11 to 16 is good.
> What can the PC RTC do already ?

Well, I can't answer the rest, but as for the PC RTC, it can generate 2^16 
interrupts per second, at ruinous cost on a protected mode system.

I'd like to have a model that uses a combination of preemptive and 
cooperative tasking, in multiple heaps. So threads running in heap 1 can 
be preemptively multitasked with threads in heap 2, but cooperatively 
with respect to each other. Then, let GC be done only locally to a given 
heap. This would allow GC to be done with considerably less urgency, 
since we don't halt all threads when we do it.

The only headache this imposes is with cross-heap pointers. The best 
solution to this is to have those be indirected through a pointer array, 
and require that these be used safely, in such a way that the GC may move 
objects and update the indirected cross-heap pointer, without changing 
the other heap.

In C, this would be something like this:

int **foo;

foo is a cross-heap pointer to some int
to move the int that foo points to, we must change *foo. but foo itself, 
doesn't get changed. We only must require that the owner of foo may not 
keep a copy of *foo on the stack, or in a register, during this process.


Please excuse the redundancy and rambling in this message. This whole 
topic is getting very confusing to me.

*utsl*