GC

Nathan Hawkins utsl@one.net
Thu, 4 Apr 1996 17:21:02 -0500 (EST)



On Thu, 4 Apr 1996, Eric W. Biederman wrote:
> I am so overloaded right now that I really shouldn't volunteer for
> anything, but I find garbage collection intrinsically fun.  So I herby
> volunteer to help with the garbage collector for the Forth/LLL thing
> you are working on now when a base of code is written.

How much code base do you want to see? I'm working on adapting some 
partially written Forth, which should be done by later this evening or 
maybe tomorrow. (The adapting part.) But it doesn't have much in it, just 
the basic arithmatic, logic, etc.  Not even the compiling words.
My trouble is, where to go from here? Do I try and implement a normal 
heap, so I can get the command line and compiler up, or do I try to 
figure out how to do a garbage collector with hand compiled words?

> I while back I began a port of RPL (I may get back to it too) and did
> a lot of research into garbage collection. So while I'm not an expert
> I know what I'm talking about. 

You're ahead of me there. Got any good references? URL's, code to read, etc.?

> The only requirement I wish to impose upon code is that pointer in
> garbage collected structures be word aligned (this may be relaxed),
> and the structures always be typesafe. None of the use pointer as
> interger tangle if you please.  Type tagged structures are alright but
> not required. 

Dword aligning would be normal. I'll have to do something about strings...
Still working on ideas for integers. That part doesn't come easy for me.
"Type tagged"???

> Please note that relocating and migrating structures is just as
> migrating code and possibly more so.  My intention is to use a
> possibly beefed up form of relocation/migration information to allow
> the garbage collector to be able to do it's work.  

In a Forth-like system (LLL) it will make sense to treat code and data as 
being more or less the same sorts of things. It looks like there will be 
relocation information in the compiled words. Probably using an indirect 
pointer arrangement, rather than trying to patch code. (At least, that's 
how I'm going to try to do it, we'll see what actually works.)

> Regarding return value of functions in threaded code, that is to be
> garbabe collected.  Threaded code is very much a special case for a
> garbage collector, and should be treated as such.  There is the
> interesting case that threaded code could be partially collected.
> That is if nothing references the top of the piece of code there is no
> reason to preserve it.  The other option is for a traversal/collection
> order that garantees that if there is a pointer to the top of a
> threaded code chunk it will be found before the chunk is moved.  A
> third alternative is to split up and recombind threaded in possibly
> different ways as it travels through.

> I would propose the same techniques for compiled code but that has
> the obvious difficulty that I couldn't get my relocation information.
> That one bears a little more thought.

Well, since this won't be like C compiler output, and I/we'll have 
control over what comes out, maybe we can rig it to output usable 
relocation records for each chunk of compiled code. Hopefully, there'll 
be a specific structure to a compiled word, with a short header, followed 
by code, and then reloc info, with an offset in the header to the 
relocations.

> Note except for reducing memory fragmentation there is not a good
> reason for a garbage collector to have any knowledge of the code or
> the type of structures implemented, or have a need to move objects.
> While this is theoretically true and the basis of all C garbage
> collectors, I want to steer away from this if possible in tunes.

Best if the structures are written to cooperate with the GC. We'll need 
to try and get best results for both the structures and the GC's 
speed/efficiency.

> With regards to using sp as the code pointer.  It actually sounds like
> a decent idea (assuming you're not using the same stack for
> interrupts)  All that would be required would be a special threaded
> return instruction that would read the real return address off of
> you're real stack.

It's interesting. I'm not sold on it yet. And I'm not sure Fare was 
meant having a "real" stack. At least, not as we know it, eh?

*utsl*