[gclist] reference counting

David S. Wise dswise@cs.indiana.edu
Tue, 5 Sep 2000 18:12:29 -0500 (EST)


    From: "Boehm, Hans" <hans_boehm@hp.com>
    To: "'Greg Morrisett'" <jgm@cs.cornell.edu>,
	    "'gclist@iecc.com'" <gclist@iecc.com>
    Subject: RE: [gclist] reference counting
    Date: Tue, 5 Sep 2000 12:40:12 -0700 

    Jones and Lins suggest avoiding this problem by using an algorithm due to
    Weizenbaum, which recursively decrements reference counts based on pointers
    in an inaccessible object only when that object is reallocated.  As far as I
    can tell from the original paper that technique was only intended for
    fixed-size objects.  (The paper uses list cells with back pointers.)

No problem for variably sized objects either,
though you may not accept the security risk.
We delay the decrement even further---past reallocation to the point
where a pointer field is overwritten, but that can leave
"old" data structure accessible from "new" nodes when your code is sloppy.

So let's be ideal for a moment, and presume that every allocation
is provably followed by a write into *each* field before there
can be a read from *that* particular field.
(No need to complete all the writes before any read, although that does
happen a lot simply because constructors tend to reinitialize all the fields.)
If you accept the proof, then this system becomes secure.

OK.  Now use the ordinary pointer-assignment protocol [ refCt++; if (--refCT)... ; ]
on every single write of a pointer to memory, specifically including
these "delayed" initializations.
Now there are exactly as many increments and decrements as there are
writes of a pointer to memory.  And the size of an object becomes irrelevant.

(In fact, this protocol yet allows a field to go unwritten----full
of archaic bits--- as long as it is never read.  Thus, a possible space leak,
so if you want to have security *and* good space use, you want to add
the constraint that each field of a new node *is* overwritten really soon
after reallocation.  Only good sense---unless you are into lazy evaluation! :-)

So mixed-size objects don't matter where the deallocation
is delayed all the way to synch with the overwrite of each field.
And *that* is how we moved reference counting *entirely* off
any "processor" to live in smart memory running at memory speed.
With *no* increase in address space for the counters.
Allocation still requires a memory cycle, but think of it as a "prefetch".

As long as space is logically available (and there are no counted cycles),
it does not matter whether it is on the top or nested deeply
in the available pool.  Barring cycles, if space is available
then there is at least one node at the top of the pool.

====
David S. Wise +1(812)855-4866;  fax: +1(812)855-4829   dswise@cs.indiana.edu
Computer Science Dept., Indiana University, Bloomington, IN  47405-7104, USA
                    http://www.cs.indiana.edu/~dswise/