[gclist] Finalizers & Reference counting.

Boehm, Hans hans_boehm@hp.com
Mon, 26 Aug 2002 18:15:28 -0700

On the architectures I've tried recently, atomic instructions (that hit in the cache) seem to take on the order of 20 cycles, which I wouldn't call cheap.  On a Pentium 4, unlike a Pentium II/III, the number seems to be on the order of 100.  The cost may be somewhat lower on other architectures, but I don't think it's ever negligible.  For a pointer assignment that used to be a register move, you're talking about a slowdown of something like a factor of 50 or more for the reference-counted version.
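
To make the comparison concrete, here is a minimal sketch in C11 of a plain pointer assignment next to a reference-counted one.  The type and function names (rc_obj, rc_retain, rc_release, assign_plain, assign_counted) are hypothetical and not taken from any particular collector or runtime.  The sketch also assumes the slot being assigned is owned by a single thread, so only the counts themselves need atomic updates.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical reference-counted object; all names here are illustrative. */
typedef struct rc_obj {
    atomic_long refcount;
    int payload;
} rc_obj;

static rc_obj *rc_new(int payload) {
    rc_obj *o = malloc(sizeof *o);
    assert(o != NULL);
    atomic_init(&o->refcount, 1);   /* the creator holds the first reference */
    o->payload = payload;
    return o;
}

/* Atomic increment: one atomic read-modify-write per new reference. */
static void rc_retain(rc_obj *o) {
    if (o) atomic_fetch_add(&o->refcount, 1);
}

/* Atomic decrement that also tests for zero.  atomic_fetch_sub returns the
 * value held before the subtraction, so a result of 1 means this call
 * dropped the last reference.  Returns 1 if the object was freed. */
static int rc_release(rc_obj *o) {
    if (o && atomic_fetch_sub(&o->refcount, 1) == 1) {
        free(o);
        return 1;
    }
    return 0;
}

/* Uncounted version: a single store, often just a register move. */
static void assign_plain(rc_obj **slot, rc_obj *value) {
    *slot = value;
}

/* Counted version: the same store plus two atomic read-modify-writes.
 * Retaining before releasing keeps self-assignment safe.
 * Assumption: *slot is not written concurrently by another thread. */
static int assign_counted(rc_obj **slot, rc_obj *value) {
    rc_obj *old = *slot;
    rc_retain(value);
    *slot = value;
    return rc_release(old);
}
```

On x86, compilers typically implement the two atomic calls with LOCK-prefixed instructions, so each counted assignment pays the atomic cost described above twice, once for the increment and once for the decrement.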


> -----Original Message-----
> From: David Chase [mailto:chase@world.std.com]
> Sent: Wednesday, August 21, 2002 4:06 PM
> To: gclist@iecc.com
> Subject: Re: [gclist] Finalizers & Reference counting.
>
> At 01:26 PM 8/21/2002 -0400, Jerrold Leichter wrote:
> >(Almost) all modern architectures provide cheap ways to increment or
> >decrement a count atomically, and for the decrement case determine
> >whether the result was 0.  The methods vary from interlocked
> >increment/decrement operations that set condition codes based on the
> >result, to combining fetch-and-add operations, to general load
> >linked/store conditionals.
>
> Is "modern architectures" code language for "not x86"?
> LOCK-prefixed instructions there are not "cheap" in the
> usual sense of the word, at least according to my understanding
> and measurements (I generally care about locked CMPXCHG).
> On a uniprocessor it is atomic if unlocked, but not on a
> multiprocessor.
> David Chase