[gclist] ref-counting performance cost
Richard A. O'Keefe
ok@atlas.otago.ac.nz
Tue, 5 Sep 2000 15:29:05 +1200 (NZST)
There are three problems with naive reference counting.
1. Inability to clean up cycles.
2. The time taken to do the reference counting operations.
A simple
    x <- y
turns into
    ref(y)++
    if (--ref(x) == 0) bury(x)
    x = y
3. The space required for the reference counts. In a Lisp-like
   system, a cons cell bloats from 2 pointers to 3. (A C sketch
   of problems 2 and 3 follows this list.)
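
For concreteness, here is that write barrier in C. This is a
minimal sketch of my own devising (obj_t, assign, and the cell
layout are illustrative, not from any real system); the count word
at the front of the cell is exactly the bloat of problem 3:

    #include <stdlib.h>

    typedef struct obj {
        long refcount;              /* the extra word of problem 3 */
        struct obj *car, *cdr;
    } obj_t;

    static void assign(obj_t **slot, obj_t *y);

    static void bury(obj_t *x)      /* reclaim x, dropping its children */
    {
        assign(&x->car, NULL);      /* may recurse through dead structure */
        assign(&x->cdr, NULL);
        free(x);
    }

    /* x <- y with the bookkeeping of problem 2 spelled out.  y's count
       must rise before x's falls, or x <- x could free a live cell. */
    static void assign(obj_t **slot, obj_t *y)
    {
        obj_t *x = *slot;
        if (y) y->refcount++;
        if (x && --x->refcount == 0) bury(x);
        *slot = y;
    }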
The scheme of Bobrow et al. used for Interlisp-D tackles problems
2 and 3 by
a. Not counting references from the stack(s) at all, so the
   counts in memory reflect only references from within the heap.
This meant that the stack(s) had to be scanned occasionally
to tell which zero-count objects were really free, but the
stacks were constrained to be small (16-bit word index for
SP in Interlisp-D, IIRC).
b. Limiting the counts. My memory fuzzes out here; I think the
   possible states were "0, 1, many", with 1 being the norm and
   "many" being sticky; things with a count of 0 went into a giant
   hash table called the Zero Count Table (ZCT).
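
Putting (a) and (b) together, the whole scheme looks roughly like
the C sketch below. This is only my reconstruction from a fuzzy
memory: every name is mine, the 2-bit count field and the
flat-array ZCT are simplifications (the real ZCT was a hash
table), and the stack scan is stubbed out:

    #include <stddef.h>
    #include <stdlib.h>

    enum rc { RC_ZERO, RC_ONE, RC_MANY };       /* "many" is sticky */

    typedef struct obj {
        unsigned rc : 2;                        /* limited count: point b */
        unsigned in_zct : 1;
        struct obj *car, *cdr;
    } obj_t;

    /* Zero Count Table: heap count 0, but possibly still referenced
       from a stack.  A flat array stands in for the real hash table. */
    #define ZCT_SIZE 4096
    static obj_t *zct[ZCT_SIZE];
    static size_t zct_n;

    static void zct_add(obj_t *x)
    {
        if (!x->in_zct && zct_n < ZCT_SIZE) {   /* a full ZCT forces a scan */
            zct[zct_n++] = x;
            x->in_zct = 1;
        }
    }

    /* Called only for stores into the HEAP; stack pushes and pops do
       not touch the counts at all (point a). */
    static void heap_inc(obj_t *x)
    {
        x->rc = (x->rc == RC_ZERO) ? RC_ONE : RC_MANY;
    }

    static void heap_dec(obj_t *x)
    {
        if (x->rc == RC_ONE) {
            x->rc = RC_ZERO;
            zct_add(x);
        }
        /* RC_MANY stays RC_MANY: such cells are never reclaimed here. */
    }

    /* Stand-in for the occasional scan of the (small) stacks. */
    static int referenced_from_stack(obj_t *x) { (void)x; return 0; }

    /* Occasionally: scan the stacks, then reclaim every ZCT entry whose
       heap count is still zero and which no stack slot points to. */
    static void process_zct(void)
    {
        size_t i = 0;
        while (i < zct_n) {
            obj_t *x = zct[i];
            if (x->rc != RC_ZERO) {             /* regained a heap ref */
                x->in_zct = 0;
                zct[i] = zct[--zct_n];
            } else if (referenced_from_stack(x)) {
                i++;                            /* still live: keep it */
            } else {
                zct[i] = zct[--zct_n];          /* remove, then free */
                x->in_zct = 0;
                if (x->car) heap_dec(x->car);   /* may grow the table */
                if (x->cdr) heap_dec(x->cdr);
                free(x);
            }
        }
    }

Note what the stickiness costs: a cell whose count ever reaches
"many" can never be reclaimed by this mechanism alone.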
For what it's worth, Xerox Quintus Prolog, which used a fairly
simple-minded mark-and-compact collector, could run rings around
Interlisp-D on the same hardware (1108, 1109, 1185, 1186). But
that is probably because Interlisp-D used CDR-coding and we didn't.
We managed to fill the ZCT up a couple of times too, without meaning to.