types and operators
Nathan Hawkins
utsl@one.net
Sat, 15 Jun 1996 10:19:43 -0400 (EDT)
On 14 Jun 1996, Eric W. Biederman wrote:
[lots of stuff deleted]
> > > > 2. Lisp-style cells, a pointer w/tag, and a 32-bit data element
> > > > Pros:
> > > > * hl things would be easy to implement
> > > > * doesn't lose any bits from pointers and ints :-)
> > > > * don't need to worry much about fragmenting the heap
> > > > * the stack could be implemented as a list on the heap
> > > > Cons:
> > > > * costs twice as much memory for each cell
> > > > * probably quite inefficient
> > > >
> > >
> > > <commentary>
> > > I think those extra 31 unused bits would get very tempting to use for
> > > something. I can imagine all kinds of hacks we could regret later.
> > > Also I don't quite see how that would bye us much but one bit at the
> > > expense of 31 for implemtation purposes.
> >
> > I certainly didn't intend to waste them. :-) Specifically, I was thinking
> > in terms of lists, implemented almost identically to Lisp. Keeping the type
> > tag in the cdr part of the cell would allow us to have 32 bit pointers
> > and integers on the car side.
>
> That's the problem. You will use those. The code we write, or the
> techniques we chose will not resemble those of Forth. If our guide
> for the LLL is to be Forth it should be essentially Forth. Under
> Option 1 we can write a GCing ANS Forth as fast as any other. Under
> this plan who knows what we will write but it won't be Forth. Further
> the compiled code will be implemented quite differently.
>
> My point is we'll get lost in this implementation and never get
> anywhere as Fare was in M4 and assembly.
You're probably right there... I still think it would a very interesting
idea, but I'll agree to go with option 1 for the initial implementation,
as you describe it. Sounds good to me.
> >
> > This would simplify lots of things, since we could implement words as
> > lists of pointers to code. Stacks could also be implemented as lists, at
> > some cost on the x86, but such stacks could be grown within the heap without
> > problems. Also, on Risc machines, this wouldn't be much overhead, because
> > most of them don't seem to have stack instructions...
>
> This would hardly simplify anything, and trash pointer arithmetic
> completely. Furtherd on a Risc the worst crime you can commit is not
> keeping your data in registers.
That's true, but complex data structures can hardly be handled well in
Forth anyway. Maybe a better idea would be to implement a more normal
system, write the words to implement lists, and then see how well the
concept works in practice.
> > The point is, the process of garbage collecting such a heap is fairly
> > well documented. While it is arguably inefficient, it seems possible that
> > we may neatly avoid a lot of headaches this way. For instance, linked lists
> > are not seriously affected by being fragmented...
>
> Fragmentation is a non-issue. We have garbage collection don't worry about it.
> Also reducing Fragmentation as much as possible is a major issue to
> improve cache performance.
> > > 5. Native code compilation with a type smart compiler.
> > > Pros:
> > > * Very efficient.
> > > * Runs fast.
> > > Cons:
> > > * hard to implement
> > > * compiles slow ( likely faster then C though)
> > >
> > > <commentary>
> > > We really need to get to some version of 5 in the future.
> > >
> > > What would be ideal would be to save enough information in word
> > > headers so we can native code compile words after we have Forth
> > > compiled them.
> >
> > That would be the plan. Presumably we implement threaded code of some
> > sort first, then write the compiler on that, or in HLL later. (I would
> > prefer it be written in LLL.)
> >
> > > </commentary>
> > >
> > > ------------------------------
> > > The real challenge to get right will be the implementation of the
> > > data space dictionary manipulating words so they work alright in a
> > > garbage collected setting. The fact from ANS forth that dictionaries
> > > can be broken into many pieces data, code, etc should help a lot. The
> > > challenge is to tell where garbage collected blocks start and end.
> >
> > That's one of the reasons I'm favouring using cons cells. That problem
> > neatly goes away. The garbage collector would only need go through the
> > heap, and would not need to be aware of the dictionary.
>
> The Forth heap, is the Forth dictionary, but it's not as general.
> Let's keep it that way.
>
> As far as , C, ALLOT et al. the dictionary data space words. They
> simply allow incremental allocation of space of the Forth dictionary.
> Since pointer arithmetic is allowed per ANS for these incrementall
> allocated regions we need to take a little care and make sure we set a
> proper size. This is not hard, it simply need some carefull
> attention.
>
> > > One bullet that shouldn't be too hard to bite is pointers into the
> > > middle of objects. I would simply keep a lookaside table with the
> > > length and location of objects, sorted by location (for binary
> > > search), and stack allocate the things.
> >
> > Again, this also would be irrelevant where cells are individually equivalent.
> > :-) Only "strings" would need this sort of thing. Probably we'd partition
> > the heap into two sections, one for variable length objects without
> > pointers, and the other for cons cells. No lookaside table needed.
> >
>
> Let's say only vectors need this sort of thing. And Vectors turn up
> frequently in normal Forth. Anyhow I meant it's easy, so we shouldn't
> worry about it.
>
> -----------------------------
> Finally, a New Thought
>
> I think that the connection between LLL and HLL should be that they
> share a common internal native code compiler Language. And can call
> each others routines Natively. The LLL should be for exposing the
> machine as much as possible while making it easy to implement. The
> HLLs should be for writing cool aps etc.
Sure, but if I want to write /my/ cool apps in LLL, that should be fine
too. Except for that one caveat, I entirely agree with you.
> Eric
>
> If we can tack a GC onto Forth, will remaining ANS compliant (At least
> after we load the ANS compatibilty wordsets) we will have acheived
> something I think no other Language has done, and show how truly
> reflexive forth really is.
We could do that. The question that remains is, what do we really want to
do? I lean a little more towards Forth-83 than ANS, for example. ;-) Some of
the HLL people might like the Lispish Forth better. (Big surprise that'd be.)
I was just trying to summarize the options that I had seen or could think
of, (although I obviously forgot native compiling) so that a consensus
could be reached. I think this is really the only issue that must be
settled in order to write some code.
*utsl*