Memory model for the LLL

Francois-Rene Rideau rideau@clipper
Fri, 9 Dec 94 21:35:46 MET


>> Will we allow infix pointers (not pointing to the beginning of an
>> object) ?
> Care to explain for those of us non-pointer-masters?

Well remember we'll have to support garbage collection and object migration.
The only way to allow garbage collection with objects of any size and not
ending with swiss cheese memory that won't allow allocation even though
total free memory is big enough (which means the premature death of the system
-- very bad for a persistent system like ours), is to allow moving of objects
inside memory, which also means updating all the pointers to the moved object.

* The easiest solution is to have only handles, not pointers, and every access
would go through a handle. Ugh. that hurts performance a lot.
* then, you can build double-directed pointers: each pointer has a reverse
pointer, so it can be updated ! This at least quadruples pointer size
(because you need maintain pointer lists or trees to update them). This may
be particularly interesting to allow several kind of pointer strength:
a pointer hooks an object to reality, but if it is not strong enough, it may
break, and the object be lost, when some external strength pulls it to hard:
this way, lazy de-evaluation can take place...
* you can algorithmically compute needed object and update pointers when
need using well known garbage collection methods: when the heap gets past
some threshold, you launch a garbage collection cycle. Starting from the
"root" (some pointer that recursively points to all needed/active objects),
you algorithmically walk all the structure of pointers, updating them at
the same time as you move objects. Note that you still need some big
enough stack outside the heap (no problem with virtual memory).

The last method is particularly useful if GC the use ratio is small (few
allocated objects are useful), that is, if much time has ellapsed since
previous GC, as measured with allocation/deallocation time as unit, which
is true when using some pure programming style (with as few side-effects as
possible).

So infix pointers mean you can point anywhere, even inside an object.
Imagine some {a,b,c} structure, or some char array. Without infix pointers,
you would have to always point to the beginning of the structure, and use
deconstructors to access elements. With infix pointers, you could point
directly to the nth element. The advantage of infix pointer is that
sub-elements are directly manipulated, which may be particularly useful
inside code, that would point directly to a routine from another module
without going through multi-indirection levels (but code and data could be
isolated so that different GC handling may be used ?). Only infix pointers
allow pointer arithmetics; they may be emulated by usual pointers, using
a pointer+offset pair (that's how they could build a "C" compiler for a LISP
machine target). The disadvantage of infix pointers is that the GC algorithm
is made much much (tens, hundreds, thousands of times) more complicated: before
you can update a pointer, you must retrieve the object it comes from, which
is trivial when the pointer is at the beginning of the object, but not anymore
if the pointer can be anywhere inside the object (not to talk about pointers
after or before the object :).


>> What will be our integer and pointer size(s) ?
>
> Well, it would seem desirable to have it relate to the underlying 
> hardware.  This might create inconsistancies, though.  How about we have 
> a minimum size for each, but anything bigger is ok?
Nope. remember we have to ensure compatibility up to migration. When you
migrate an object, you must be sure that it will fit the target's size; hence
the (useful) size of an integer be defined independently of the architecture
(but different integers *may* have different size, as long as you have some
means to differentiate them).




>> Implementation ideas:
>> * Any implementation would use native pointers as values,
>> with lower or higher bit(s) to indicate the value being pointer
>> or integer, thus losing one bit. For ;ore portability, integers
>> would come with size 8, 16, 32, or 64, even though the architecture
>> be 20 bit.
> 
> Could I ask why we are differentiating between pointers and integers?  If 
> we're doing this, why not make floating point types, and maybe a bunch of 
> others?  Isn't this complicating the issue?
> 
> Also, how/why is this 20 bit?  Sounds like a strange choice....
Know the MuP21 ? Some great MISC (minimal instruction set) chip, that
produces 100 native bogomips at 20 MHz, and produces some ridiculous amount
of heat, 'cause it's using standard VLSI techniques...


>> * GC would happen only during cooperative multithreading, while
>> preemptive multithreading is allowed for real-time behavior on
>> preallocated (outside global GC) zones.
> 
> Repeat after me, "Cooperative multitasking is evil.  Cooperative 
> multitasking is evil...."  =)  Seriously, though, cooperative 
> multitasking really kills my new reliability goal.  Agents should not be 
> allowed to eat the whole CPU.
Not a problem if we have programs *proven* to work well, and/or generated
automatically by a compiler that produces correct code. Only proven code is
accepted in native mode. Emulation boxes may use any kind of unreliable code,
as long as they are in isolated, preemptible, boxes.


>> * object type is determined by which page it is located on and/or
>> which tag precedes it.
> 
> Type?  Tag?  Care to explain why we have to have a type system, and what 
> types would be involved?  I feel the extendability goal perhaps getting 
> treaded on?
"Type" here means just relative to GC and migration behavior, which is the
minimum we need for the system to work. Of course, we *may* *allow* more
information to be associated to objects at system level. I dunno. This should
be studied. I think this could be implemented independently from the garbage
collection system, but perhaps putting it at the same place could be handy...

--    ,        	                                ,           _ v    ~  ^  --
-- Fare -- rideau@clipper.ens.fr -- Francois-Rene Rideau -- +)ang-Vu Ban --
--                                      '                   / .          --
MOOSE project member. OSL developper.                     |   |   /
Dreams about The Universal (Distributed) Database.       --- --- //
Snail mail: 6, rue Augustin Thierry 75019 PARIS FRANCE   /|\ /|\ //
Phone: 033 1 42026735                                    /|\ /|\ /