types and operators

Nathan Hawkins utsl@one.net
Sat, 15 Jun 1996 10:19:43 -0400 (EDT)


On 14 Jun 1996, Eric W. Biederman wrote:

[lots of stuff deleted]

> > > > 2. Lisp-style cells, a pointer w/tag, and a 32-bit data element
> > > >    Pros:
> > > >      * hl things would be easy to implement
> > > >      * doesn't lose any bits from pointers and ints :-)
> > > >      * don't need to worry much about fragmenting the heap
> > > >      * the stack could be implemented as a list on the heap
> > > >    Cons:
> > > >      * costs twice as much memory for each cell
> > > >      * probably quite inefficient
> > > > 
> > > 
> > > <commentary>
> > > I think those extra 31 unused bits would get very tempting to use for
> > > something.  I can imagine all kinds of hacks we could regret later.
> > > Also I don't quite see how that would bye us much but one bit at the
> > > expense of 31 for implemtation purposes.
> > 
> > I certainly didn't intend to waste them. :-) Specifically, I was thinking 
> > in terms of lists, implemented almost identically to Lisp. Keeping the type
> > tag in the cdr part of the cell would allow us to have 32 bit pointers 
> > and integers on the car side.
> 
> That's the problem.  You will use those.  The code we write, or the
> techniques we chose will not resemble those of Forth.  If our guide
> for the LLL is to be Forth it should be essentially Forth.  Under
> Option 1 we can write a GCing ANS Forth as fast as any other.  Under
> this plan who knows what we will write but it won't be Forth.  Further
> the compiled code will be implemented quite differently.
> 
> My point is we'll get lost in this implementation and never get
> anywhere as Fare was in M4 and assembly.

You're probably right there... I still think it would a very interesting 
idea, but I'll agree to go with option 1 for the initial implementation, 
as you describe it. Sounds good to me.

> > 
> > This would simplify lots of things, since we could implement words as 
> > lists of pointers to code. Stacks could also be implemented as lists, at 
> > some cost on the x86, but such stacks could be grown within the heap without
> > problems. Also, on Risc machines, this wouldn't be much overhead, because 
> > most of them don't seem to have stack instructions...
> 
> This would hardly simplify anything, and trash pointer arithmetic
> completely.  Furtherd on a Risc the worst crime you can commit is not
> keeping your data in registers.

That's true, but complex data structures can hardly be handled well in 
Forth anyway. Maybe a better idea would be to implement a more normal 
system, write the words to implement lists, and then see how well the 
concept works in practice.

> > The point is, the process of garbage collecting such a heap is fairly 
> > well documented. While it is arguably inefficient, it seems possible that 
> > we may neatly avoid a lot of headaches this way. For instance, linked lists 
> > are not seriously affected by being fragmented...
> 
> Fragmentation is a non-issue.  We have garbage collection don't worry about it.
> Also reducing Fragmentation as much as possible is a major issue to
> improve cache performance.

> > > 5. Native code compilation with a type smart compiler.
> > >    Pros:
> > >      * Very efficient.
> > >      * Runs fast.
> > >    Cons:
> > >      * hard to implement
> > >      * compiles slow ( likely faster then C though)
> > > 
> > > <commentary>
> > > We really need to get to some version of 5 in the future.
> > > 
> > > What would be ideal would be to save enough information in word
> > > headers so we can native code compile words after we have Forth
> > > compiled them.
> > 
> > That would be the plan. Presumably we implement threaded code of some 
> > sort first, then write the compiler on that, or in HLL later. (I would 
> > prefer it be written in LLL.)
> > 
> > > </commentary>
> > > 
> > > ------------------------------
> > > The real challenge to get right will be the implementation of the
> > > data space dictionary manipulating words so they work alright in a
> > > garbage collected setting.  The fact from ANS forth that dictionaries
> > > can be broken into many pieces data, code, etc should help a lot.  The
> > > challenge is to tell where garbage collected blocks start and end.
> > 
> > That's one of the reasons I'm favouring using cons cells. That problem 
> > neatly goes away. The garbage collector would only need go through the 
> > heap, and would not need to be aware of the dictionary.
> 
> The Forth heap, is the Forth dictionary, but it's not as general.
> Let's keep it that way.
> 
> As far as , C, ALLOT et al. the dictionary data space words.  They
> simply allow incremental allocation of space of the Forth dictionary.
> Since pointer arithmetic is allowed per ANS for these incrementall
> allocated regions we need to take a little care and make sure we set a
> proper size.  This is not hard, it simply need some carefull
> attention.
> 
> > > One bullet that shouldn't be too hard to bite is pointers into the
> > > middle of objects.  I would simply keep a lookaside table with the
> > > length and location of objects, sorted by location (for binary
> > > search), and stack allocate the things.
> > 
> > Again, this also would be irrelevant where cells are individually equivalent.
> > :-) Only "strings" would need this sort of thing. Probably we'd partition 
> > the heap into two sections, one for variable length objects without 
> > pointers, and the other for cons cells. No lookaside table needed.
> > 
> 
> Let's say only vectors need this sort of thing.  And Vectors turn up
> frequently in normal Forth.  Anyhow I meant it's easy, so we shouldn't
> worry about it.
> 
> -----------------------------
> Finally, a New Thought
> 
> I think that the connection between LLL and HLL should be that they
> share a common internal native code compiler Language.  And can call
> each others routines Natively.  The LLL should be for exposing the
> machine as much as possible while making it easy to implement.  The
> HLLs should be for writing cool aps etc. 

Sure, but if I want to write /my/ cool apps in LLL, that should be fine 
too. Except for that one caveat, I entirely agree with you.

> Eric
> 
> If we can tack a GC onto Forth,  will remaining ANS compliant (At least
> after we load the ANS compatibilty wordsets) we will have acheived
> something I think no other Language has done, and show how truly
> reflexive forth really is.

We could do that. The question that remains is, what do we really want to 
do? I lean a little more towards Forth-83 than ANS, for example. ;-) Some of 
the HLL people might like the Lispish Forth better. (Big surprise that'd be.) 
I was just trying to summarize the options that I had seen or could think 
of, (although I obviously forgot native compiling) so that a consensus 
could be reached. I think this is really the only issue that must be 
settled in order to write some code.

*utsl*