[gclist] why malloc/free instead of GC?

Arlie Davis arlie@sublinear.org
Wed, 19 Feb 2003 17:18:53 -0500

Also, note that most apps that use malloc/free for typical "class"
objects (small-to-medium size, with significant pointer density) perform
some sort of class initialization.  It may be a field-by-field
initialization of pointers, or (more often) it is a bulk zero fill
(modulo vtable setup).  The time to do this, and the cache misses, will
not show up in traces of malloc/free cost, but do show up in GC
allocation cost, since GC_malloc initializes the objects it returns.

So, there is yet another reason that direct, API-level comparisons of GC
vs. malloc are inaccurate, or at least incomplete.  A better (though
still incomplete) comparison would be between the total time spent in,
say, C++ new/delete and the total time spent in GC alloc / GC collect.
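
As a rough illustration of the point about initialization cost, here is
a minimal sketch that times malloc/free with and without a bulk zero
fill of each object.  The helper name time_alloc_ns and its parameters
are made up for this example, and it measures only the malloc side; a
fair comparison would run the same loop against the collector's
allocator.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>

// Hypothetical helper (not from any library): allocates `count` objects
// of `size` bytes with malloc, optionally zero-fills each one the way a
// constructor might, then frees them all.
// Returns the elapsed wall-clock time in nanoseconds.
long long time_alloc_ns(std::size_t count, std::size_t size, bool zero_fill) {
    // The pointer table is set up before the clock starts.
    std::vector<void*> ptrs(count, nullptr);

    auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < count; ++i) {
        void* p = std::malloc(size);
        if (p == nullptr) break;                  // stop on allocation failure
        if (zero_fill) std::memset(p, 0, size);   // the cost a bare malloc trace omits
        ptrs[i] = p;
    }
    for (void* p : ptrs) std::free(p);            // free(nullptr) is a no-op
    auto stop = std::chrono::steady_clock::now();

    assert(stop >= start);                        // steady_clock never goes backwards
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Calling time_alloc_ns(n, size, false) and time_alloc_ns(n, size, true)
for the same n and size gives a crude estimate of how much time the
zero fill adds.  An optimizing compiler may fold malloc plus memset
into calloc, so treat the numbers as indicative only.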

Also, in environments that mix reference counting with unmanaged heaps,
such as COM development on Win32, you must also account for the time
spent in AddRef and Release.  Most thread-safe implementations use
interlocked integer primitives, which are quite costly on SMP machines.
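
To make the AddRef/Release cost concrete, here is a minimal sketch of a
thread-safe reference count.  It is not the real COM IUnknown
interface: the class names RefCounted and Widget are invented for this
example, and std::atomic stands in for the Win32 InterlockedIncrement /
InterlockedDecrement calls a COM object would typically use.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical base class, loosely modelled on COM-style counting.
class RefCounted {
public:
    // Atomic increment; returns the new count.
    long AddRef() {
        return count_.fetch_add(1, std::memory_order_relaxed) + 1;
    }

    // Atomic decrement; destroys the object when the count reaches zero.
    // Returns the new count.
    long Release() {
        long remaining = count_.fetch_sub(1, std::memory_order_acq_rel) - 1;
        assert(remaining >= 0);       // catches an unbalanced Release
        if (remaining == 0) delete this;
        return remaining;
    }

protected:
    virtual ~RefCounted() = default;  // destruction only via Release

private:
    std::atomic<long> count_{1};      // the creator holds the first reference
};

// Hypothetical client object that records its own destruction.
struct Widget : RefCounted {
    explicit Widget(bool* destroyed) : destroyed_(destroyed) {}
    ~Widget() override { *destroyed_ = true; }
    bool* destroyed_;
};
```

Every AddRef and Release here is an atomic read-modify-write on a
shared location, which is the operation that gets expensive when
several processors touch the same object.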

I've done a fair amount of profiling of real-world server apps on Win32,
and in many implementations, SMP scalability is severely hindered by the
very high frequency of interlocked operations.  In services that make
heavy use of COM interfaces, reference counting is often one of the
biggest users of interlocked access.

All of this must be taken into account when considering the behavior of
real-world, complex applications & services, and how they use memory.

-- arlie

-----Original Message-----
From: owner-gclist@lists.iecc.com [mailto:owner-gclist@lists.iecc.com]
On Behalf Of Boehm, Hans
Sent: Tuesday, February 18, 2003 6:53 PM
Cc: gclist@iecc.com
Subject: Re: [gclist] why malloc/free instead of GC?

It looks to me like much of this difference can still be explained by
the fact that GC_malloc initializes the resulting objects, and hence
takes the cache misses that a real client would otherwise take later.
To make the measurements more comparable, you should initialize the
objects after you allocate them.  (I still wouldn't expect GC_malloc to
win.  I've normally seen that only for cons-cell sized or slightly
larger objects.)

Are there also differences in the amount of thread support that's
included in the measurements?  E.g. the system malloc usually tests a
global to determine at runtime whether it needs to lock.


> -----Original Message-----
> From: Basile STARYNKEVITCH [mailto:basile@starynkevitch.net]
> Sent: Tuesday, February 18, 2003 2:14 PM
> To: gclist@iecc.com
> Subject: Re: [gclist] why malloc/free instead of GC?
>
> For completeness, I changed my tiny test a bit to allocate smaller 
> objects, to take into account Hans Boehm's remark on typical object 
> size
> ...