[gclist] reference counting

Boehm, Hans hans_boehm@hp.com
Tue, 5 Sep 2000 16:59:21 -0700

That's an interesting technique.  But I'm not sure how it solves the problem
I had in mind.  Or maybe I didn't express the problem very clearly.

Assume X points to a list consisting of a cons cell, which in turn points to
a list of 1000 1MB objects.  Assume there is no memory left to be allocated
at this point.

I then assign nil to X, and try to allocate a 1MB object.  At that point I
know that the cons cell is available.  But it's too small.  The 1MB objects
still appear referenced, since I haven't yet gotten around to decrementing
their count.  Where do I find a 1MB object to allocate?  Or do I have to
grow the heap?  (You can argue the 1MB objects are still logically
referenced and thus not allocatable, but that doesn't strike me as a natural
model.  But I'm probably missing something else here.)

If I might have to grow the heap in a case like this, I would like to
somehow ensure that if my heap contains N bytes of live data, I never need
more than CN bytes memory, where ideally C is a constant.  (In light of
fragmentation, I'll settle for C = clogN in the worst case, where c is a
constant.  Failing that, I'll settle for any intuition about how much space
I might need.)


> -----Original Message-----
> From: David S. Wise [mailto:dswise@cs.indiana.edu]
> Sent: Tuesday, September 05, 2000 4:12 PM
> To: gclist@iecc.com; hans_boehm@hp.com; jgm@cs.cornell.edu
> Subject: RE: [gclist] reference counting
>     From: "Boehm, Hans" <hans_boehm@hp.com>
>     To: "'Greg Morrisett'" <jgm@cs.cornell.edu>,
> 	    "'gclist@iecc.com'" <gclist@iecc.com>
>     Subject: RE: [gclist] reference counting
>     Date: Tue, 5 Sep 2000 12:40:12 -0700 
>     Jones and Lins suggest avoiding this problem by using an 
> algorithm due to
>     Weizenbaum, which recursively decrements reference counts 
> based on pointers
>     in an inaccessible object only when that object is 
> reallocated.  As far as I
>     can tell from the original paper that technique was only 
> intended for
>     fixed-size objects.  (The paper uses list cells with back 
> pointers.)
> No problem for variably sized objects either,
> though you may not accept the security risk.
> We delay the decrement even further---past reallocation to the point
> where a pointer field is overwritten, but that can leave
> "old" data structure accessible from "new" nodes when your 
> code is sloppy.
> So let's be ideal for a moment, and presume that every allocation
> is provably followed by a write into *each* field before there
> can be a read from *that* particular field.
> (No need to complete all the writes before any read, although 
> that does
> happen a lot simply because constructors tend to reinitialize 
> all the fields.)
> If you accept the proof, then this system becomes secure.
> OK.  Now use the ordinary pointer-assignment protocol [ 
> refCt++; if (--refCT)... ; ]
> on every single write of a pointer to memory, specifically including
> these "delayed" initializations.
> Now there are exactly as many increments and decrements as there are
> writes of a pointer to memory.  And the size of an object 
> becomes irrelevant.
> (In fact, this protocol yet allows a field to go unwritten----full
> of archaic bits--- as long as it is never read.  Thus, a 
> possible space leak,
> so if you want to have security *and* good space use, you want to add
> the constraint that each field of a new node *is* overwritten 
> really soon
> after reallocation.  Only good sense---unless you are into 
> lazy evaluation! :-)
> So mixed-size objects don't matter where the deallocation
> is delayed all the way to synch with the overwrite of each field.
> And *that* is how we moved reference counting *entirely* off
> any "processor" to live in smart memory running at memory speed.
> With *no* increase in address space for the counters.
> Allocation still requires a memory cycle, but think of it as 
> a "prefetch".
> As long as space is logically available (and there are no 
> counted cycles),
> it does not matter whether it is on the top or nested deeply
> in the available pool.  Barring cycles, if space is available
> then there is at least one node at the top of the pool.
> ====
> David S. Wise +1(812)855-4866;  fax: +1(812)855-4829   
> dswise@cs.indiana.edu
> Computer Science Dept., Indiana University, Bloomington, IN  
> 47405-7104, USA
>                     http://www.cs.indiana.edu/~dswise/