[gclist] synchronization cost (was: Garbage collection and XML)

Emery Berger emery@cs.utexas.edu
Fri, 9 Mar 2001 14:09:41 -0600


> -----Original Message-----
> From: Boehm, Hans [mailto:hans_boehm@hp.com]
> Sent: Friday, March 09, 2001 12:07 PM
> To: 'Emery Berger'; Boehm, Hans; 'David Chase'; gclist@iecc.com
> Cc: icis-developers@bbn.com
> Subject: RE: [gclist] synchronization cost (was: Garbage collection and
> XML)
>
>
> Does anyone know if this is documented somewhere?
>

http://developer.intel.com/design/pentium4/manuals/24547203.pdf

See Chapter 7.1: "For the P6 family processors, locked operations serialize
all outstanding load and store operations (that is, wait for them to
complete). This rule is also true for the Pentium 4 processor, with one
exception: load operations that reference weakly ordered memory types (such
as the WC memory type) may not be serialized."
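
For anyone who wants to repeat the measurement, here is a minimal sketch in
C11, not the code anyone in this thread actually ran. It assumes a compiler
that lowers atomic_compare_exchange_strong to "lock; cmpxchg" on x86 (GCC and
Clang typically do). The names now_ns, cas_increments, plain_increments, and
report are made up for illustration. It compares a loop of locked
compare-and-swap increments against a loop of plain increments.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: wall-clock time in nanoseconds (C11 timespec_get).
 * This is not a cycle counter, so results are only a rough estimate. */
static uint64_t now_ns(void) {
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Increment *counter n times, each time with a compare-and-swap.
 * Returns the counter value after the loop. */
long cas_increments(atomic_long *counter, long n) {
    for (long i = 0; i < n; i++) {
        long expected = atomic_load_explicit(counter, memory_order_relaxed);
        /* On failure, expected is refreshed with the current value,
         * so the retry uses up-to-date data. */
        while (!atomic_compare_exchange_strong(counter, &expected,
                                               expected + 1)) {
        }
    }
    return atomic_load(counter);
}

/* Baseline: the same number of increments with no locked instruction.
 * volatile keeps the compiler from collapsing the loop. */
long plain_increments(volatile long *counter, long n) {
    for (long i = 0; i < n; i++) {
        (*counter)++;
    }
    return *counter;
}

/* Hypothetical driver: time both loops and print nanoseconds per operation. */
void report(long n) {
    atomic_long shared = 0;
    volatile long local = 0;

    uint64_t t0 = now_ns();
    long a = cas_increments(&shared, n);
    uint64_t t1 = now_ns();
    long b = plain_increments(&local, n);
    uint64_t t2 = now_ns();

    assert(a == n && b == n);
    printf("CAS:   %.2f ns/op\n", (double)(t1 - t0) / (double)n);
    printf("plain: %.2f ns/op\n", (double)(t2 - t1) / (double)n);
}
```

Calling report(10000000) from main prints both figures. A tight loop like
this only measures back-to-back locked operations; it says nothing about how
much a single locked instruction delays the unrelated instructions around it,
so the numbers are a lower bound on the cost in real code.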

-- Emery


> Since these are out-of-order machines, the other question is whether
> subsequent instructions that don't depend on later memory references will
> continue to execute during the wait.  If so, this might explain some of
> the differences in measurements.
>
> Hans
>
> > -----Original Message-----
> > From: Emery Berger [mailto:emery@cs.utexas.edu]
> > Sent: Thursday, March 08, 2001 6:51 PM
> > To: Boehm, Hans; 'David Chase'; gclist@iecc.com
> > Cc: icis-developers@bbn.com
> > Subject: RE: [gclist] Garbage collection and XML
> >
> >
> > > Is that an X86 machine?  I just timed a Pentium III/500/100 machine at
> > > something near 25 cycles per "lock; cmpxchgl".  I'm interested because
> > > I've sometimes heard the claim that X86 is particularly bad at this,
> > > but that hasn't really been consistent with my experience.  Is this
> > > chipset dependent, perhaps?
> >
> > Timing just the "lock; cmpxchgl" doesn't give you the whole picture. The
> > problem is that the Pentium flushes the pipeline when it encounters a
> > locked instruction. The performance penalty is pretty spectacular. I'm
> > told the P4 has a 24-stage pipeline, so locked instructions will become
> > effectively even more expensive.
> >
> > Regards,
> > -- Emery
> >
> > --
> > Emery Berger
> > emery@cs.utexas.edu
> > http://www.cs.utexas.edu/users/emery
> >
>