[gclist] ref-counting performance cost

Richard A. O'Keefe ok@atlas.otago.ac.nz
Tue, 5 Sep 2000 15:29:05 +1200 (NZST)


There are three problems with naive reference counting.

1.  Inability to clean up cycles.

2.  The time taken to do the reference counting operations.
    A simple
	x <- y
    turns into
	ref(y)++
	if (--ref(x) == 0) bury(x)
	x = y
    where ref(x) is the count stored in the object x currently
    points to (a fleshed-out C sketch follows this list).

3.  The space required for the reference counts.  In a Lisp-like
    system, a cons cell bloats from 2 pointers to 3.
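
For concreteness, here is what points 2 and 3 look like in C: the
barrier that every pointer assignment turns into, plus the extra
count word that fattens each cell.  This is only a sketch; the
names (Cell, rc_assign, bury) are mine, not from any real system.
Note that incrementing ref(y) *before* decrementing ref(x) is what
keeps the self-assignment x <- x from freeing the object out from
under itself.

    #include <stdlib.h>

    typedef struct Cell {
        unsigned long ref;          /* the third word of point 3 */
        struct Cell  *car, *cdr;    /* the original two pointers */
    } Cell;

    static void bury(Cell *c);

    /* The barrier of point 2: every "x <- y" becomes this call. */
    static void rc_assign(Cell **x, Cell *y)
    {
        if (y != NULL)
            y->ref++;                       /* ref(y)++           */
        Cell *old = *x;
        if (old != NULL && --old->ref == 0) /* if (--ref(x) == 0) */
            bury(old);                      /*     bury(x)        */
        *x = y;                             /* x = y              */
    }

    /* Burying one cell may drop its children to zero in turn.
     * Note that the members of a cycle hold each other's counts
     * above zero forever, which is exactly problem 1. */
    static void bury(Cell *c)
    {
        rc_assign(&c->car, NULL);
        rc_assign(&c->cdr, NULL);
        free(c);
    }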

The Bobrow &c scheme used for Interlisp-D tackles problems 2 and 3 by

a.  Not counting references from the stack(s) at all, so the
    counts in memory reflect only references from within the heap.
    This meant that the stack(s) had to be scanned occasionally
    to tell which zero-count objects were really free, but the
    stacks were constrained to be small (16-bit word index for
    SP in Interlisp-D, IIRC).

b.  Limiting the counts.  My memory fuzzes out here; I think the
    possible states are "0, 1, many", with 1 being the norm, and
    "many" being sticky: once a count saturates at "many" it is
    never decremented again, so the counts alone can never reclaim
    such objects.  Things with a 0 count went into a giant hash
    table called the Zero Count Table.
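
If my reconstruction of (b) is right, the combination looks
roughly like this in C.  This is a speculative sketch: every name
in it is invented, the real ZCT was a proper hash table rather
than this toy array, and the stack scan is stubbed out.

    #include <stdio.h>

    enum { RC_ZERO, RC_ONE, RC_MANY };  /* 2 bits; "many" is sticky */

    typedef struct Obj {
        unsigned rc;        /* counts references from the heap ONLY */
        int in_zct;
        struct Obj *car, *cdr;
    } Obj;

    /* Toy Zero Count Table.  Filling the ZCT is a real failure
     * mode, as noted at the end of this message; here we simply
     * stop recording (and so leak) when it fills. */
    #define ZCT_MAX 1024
    static Obj *zct[ZCT_MAX];
    static int  zct_len;

    static void zct_add(Obj *o)
    {
        if (!o->in_zct && zct_len < ZCT_MAX) {
            zct[zct_len++] = o;
            o->in_zct = 1;
        }
    }

    /* Barrier for HEAP stores only (point a); pushing or popping
     * a stack slot is a plain move with no counting at all. */
    static void heap_assign(Obj **slot, Obj *y)
    {
        if (y != NULL) {
            if (y->rc == RC_ZERO)
                y->in_zct = 0;          /* resurrected: leave the ZCT */
            if (y->rc < RC_MANY)
                y->rc++;                /* saturate; never overflow   */
        }
        Obj *old = *slot;
        if (old != NULL && old->rc != RC_MANY   /* "many" never drops */
            && --old->rc == RC_ZERO)
            zct_add(old);   /* dead UNLESS the stack still reaches it */
        *slot = y;
    }

    static int stack_reaches(Obj *o)    /* stand-in for the stack scan */
    {
        (void)o;
        return 0;
    }

    /* The occasional collection: scan the (small) stack, then any
     * zero-count ZCT entry it does not reach really is garbage.
     * Objects stuck at "many" need some backup trace to reclaim. */
    static void zct_sweep(void)
    {
        int kept = 0;
        for (int i = 0; i < zct_len; i++) {
            Obj *o = zct[i];
            if (!o->in_zct)
                continue;               /* resurrected; drop the entry */
            if (o->rc == RC_ZERO && !stack_reaches(o)) {
                o->in_zct = 0;
                printf("reclaim %p\n", (void *)o);  /* free it for real */
            } else {
                zct[kept++] = o;        /* still pending */
            }
        }
        zct_len = kept;
    }

The payoff is that stack slots, which churn far faster than heap
cells, pay nothing per assignment; the cost moves into the
occasional stack scan, which stays cheap precisely because the
stacks were constrained to be small.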

For what it's worth, Xerox Quintus Prolog, which used a fairly
simple-minded mark-and-compact collector, could run rings around
Interlisp-D on the same hardware (1108, 1109, 1185, 1186).  But
that is probably because Interlisp-D used CDR-coding and we didn't.

We managed to fill the ZCT up a couple of times too, without meaning to.