[gclist] ref-counting performance cost

Jerrold Leichter jerrold.leichter@smarts.com
Tue, 5 Sep 2000 11:11:14 -0400 (EDT)


| | But this allocation cost would also be paid in a non-compacting 
| | or conservative mark-and-sweep, not just reference counting
| | (i.e., it is not a unique problem to RC unlike the problems
| | of space, time and cycles).
| 
| Indeed, that is true.   But it's a problem that cannot be fixed for RC,
| whereas it is relatively easy to adapt the mark/sweep story to do
| compaction.   (Mind you, compaction involves changing pointers, which
| is hard in a concurrent setting.)

I'm not sure that's necessarily true.  It is true that the combination of RC
with compaction seems never to have been studied much.  Perhaps some tricks
are available.

For example, suppose we use the reference count word to store one of two
things:

	1.  If the reference count is 1, it is a back-pointer to the single
		reference to this object;
	2.  If the reference count is >1, it is the actual count.

Since addresses on all modern machines are even - in fact, even the bottom
couple of bits are all known to be 0 - one can distinguish these two cases
"for free":  To store a count of k, put 2k+1 in the count word.  In practice,
this just means initializing to 1 and incrementing or decrementing by 2.  (If
an object that once had multiple pointers to it and now has only one, there's
obviously no way to reconstruct the back-pointer - so it is possible to have
an actual count of 1 - word contains 3.  Note that compiler assistance here
would be important:  At the moment an object is linked into a list, there are
certainly two pointers to it, one from some local variable, one from the new
list.  But the local variable pointer is probably dead - or can be made dead
by suitable hoisting of initialization code for the object above the point
of insertion into the list.  Or, if all else fails, at the point where the
local goes out of scope, we can insert code to re-insert the backpointer if
still appropriate.  There is room for yet more tricks:  E.g., if you know all
pointers are 0 mod 8, you could store counters up to 6 or so in the bottom
three bits, preserving the address in the remaining bits.  Removing the pointer
that matches the backpointer clears those preserved bits, unless it sets the
count back to 1 - at which point the backpointer can be treated as live
again.  In situations where references are created and destroyed in stack
order - probably common if you create an object, call various initialization
routines on it, then finally insert it into a data structure - the end result
is a count of 1, with a backpointer.)

Now it's really easy to relocate those objects that are only pointed to from
one place.  (You still have to find them all, but there are many approaches
to that - e.g., you may know how to find all allocated memory blocks and have
some way to determine type information.)  Further, the move can be done using
only local information (pointer and pointee) so is easy to do incrementally.

If most objects are only pointed to from a single place - highly likely; and
you might even be able to make it more likely by, for example, handling
pointers from the stack specially - then you might find that you can do a
pretty good job of compaction even with the occasionally "very busy object"
plopped down in the wrong place.  And I suppose you could do a collection for
those objects once in a while.
							-- Jerry