[gclist] Finalizers & Reference counting.
Mon, 26 Aug 2002 18:15:28 -0700
On the architectures I've tried recently, atomic instructions (which hit the cache) seem to take on the order of 20 cycles, which I wouldn't call cheap. On a Pentium 4, unlike a Pentium II/III, the number seems to be on the order of 100. It may be slightly less for other architectures, but I don't think it's ever negligible. For a pointer assignment that used to be a register move, you're talking a slowdown of something like a factor of 50 or more for the reference-counted version.
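To make the comparison concrete, here is a minimal sketch in C11 of the two kinds of pointer assignment being compared. The names (obj, obj_retain, obj_release, assign_plain, assign_counted) are hypothetical, invented for illustration, and the sketch assumes the slot itself is only written by one thread at a time; only the counts are updated atomically.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical reference-counted object; all names here are illustrative. */
typedef struct obj {
    atomic_long refcount;
    int payload;
} obj;

/* Allocate an object that starts with one reference (held by the caller). */
static inline obj *obj_new(int payload) {
    obj *o = malloc(sizeof *o);
    if (o == NULL) return NULL;
    atomic_init(&o->refcount, 1);
    o->payload = payload;
    return o;
}

/* Atomic increment: one interlocked read-modify-write. */
static inline void obj_retain(obj *o) {
    if (o != NULL) atomic_fetch_add(&o->refcount, 1);
}

/* Atomic decrement plus test for zero. Returns 1 if the object was freed. */
static inline int obj_release(obj *o) {
    if (o == NULL) return 0;
    long old = atomic_fetch_sub(&o->refcount, 1); /* yields the previous value */
    assert(old > 0);                              /* catch over-release */
    if (old == 1) {
        free(o);
        return 1;
    }
    return 0;
}

/* Uncounted assignment: a single store, often just a register move. */
static inline void assign_plain(obj **slot, obj *value) {
    *slot = value;
}

/* Counted assignment: the same store plus two atomic operations.
   Assumes *slot is not written concurrently by another thread. */
static inline int assign_counted(obj **slot, obj *value) {
    obj_retain(value);       /* retain first, so self-assignment is safe */
    obj *old = *slot;
    *slot = value;
    return obj_release(old); /* may free the previously referenced object */
}
```

The counted version performs one atomic increment and one atomic decrement per assignment, which is where the extra cycles come from.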
> -----Original Message-----
> From: David Chase [mailto:firstname.lastname@example.org]
> Sent: Wednesday, August 21, 2002 4:06 PM
> To: email@example.com
> Subject: Re: [gclist] Finalizers & Reference counting.
> At 01:26 PM 8/21/2002 -0400, Jerrold Leichter wrote:
> >(Almost) all modern architectures provide cheap ways to increment or
> >decrement a count atomically, and for the decrement case determine whether
> >the result was 0. The methods vary from interlocked operations that set
> >condition codes based on the result to combining fetch-and-add operations
> >to general load linked/store conditionals.
> Is "modern architectures" code language for "not x86"?
> LOCK-prefixed instructions there are not "cheap" in the
> usual sense of the word, at least according to my understanding
> and measurements (I generally care about locked CMPXCHG).
> On a uniprocessor it is atomic if unlocked, but not on a
> David Chase
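For reference, a decrement-and-test built on compare-and-swap rather than fetch-and-add could be sketched as below in C11. The function name dec_and_test_cas is hypothetical. On x86, compilers typically implement the compare-exchange with a LOCK-prefixed CMPXCHG, and on load linked/store conditional machines with an LL/SC retry loop, but that mapping is up to the compiler and is assumed here, not guaranteed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Decrement *count atomically using a compare-and-swap retry loop.
   Returns true when this call took the count from 1 to 0. */
static inline bool dec_and_test_cas(atomic_long *count) {
    long old = atomic_load(count);
    assert(old > 0); /* decrementing a zero count would be a caller bug */
    /* On failure the call stores the current value into old, so the
       next attempt recomputes old - 1 from fresh data. */
    while (!atomic_compare_exchange_weak(count, &old, old - 1)) {
        /* retry */
    }
    return old == 1;
}
```

Each retry is another interlocked operation, so under contention this form can cost more than a single fetch-and-add.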