[gclist] GC performance comparison for various JVMs

Boehm, Hans hboehm@exch.hpl.hp.com
Mon, 18 Oct 1999 13:27:49 -0700


I just posted the following to a thread in comp.lang.java.advocacy.  The
discussion there was comparing the HotSpot collector to our (i.e.
Boehm-Demers-Weiser) collector.  The post includes some flaky measurements of
GC performance of some JVMs.  Does anybody have better ones they want to
share?

Hans

--------------------------------------------

In article <7tj5hc$o2q@dfw-ixnews3.ix.netcom.com>,
  tsikes@netcom.com (Terry Sikes) wrote:
> I've not done any measurements, but are you seriously suggesting that
> the Boehm collector is competitive without the generational and fast
> object allocation improvements in HotSpot?
> 
Our collector has been generational (and
incremental) for a long time, though it can't
physically separate new and old objects.

I don't have a real comparison either, but maybe
these qualify as tangerine-to-orange comparisons. 
These are results for my modification of the
Ellis/Kovac GC benchmark.  This is a toy benchmark
that I believe has been used by a number of JVM
implementation groups.  It is not a real
application, but an attempt was made to make its
storage allocation behavior similar to real
applications.  YMMV.  The benchmark does very
little other than allocation, and tends to spend
the vast majority of its time in the garbage
collector, even with a fast collector.  Average
applications are less dependent on GC performance.
Send me mail at hboehm@hpl.<rem0ve me>hp.com if
you want a copy of the benchmark.
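
For readers without a copy, the general shape of such an
allocation-bound workload can be sketched as below.  This is only
an illustration, not the Ellis/Kovac benchmark itself: the Node
class, the function names, the tree depths and the iteration count
are all made up here, and Python is used purely for brevity.

```python
import time


class Node:
    """Binary tree node; the only thing this workload allocates."""
    __slots__ = ("left", "right")

    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right


def make_tree(depth):
    """Allocate a complete binary tree with 2**(depth + 1) - 1 nodes."""
    if depth == 0:
        return Node()
    return Node(make_tree(depth - 1), make_tree(depth - 1))


def count_nodes(tree):
    """Walk the tree and return the number of nodes in it."""
    if tree is None:
        return 0
    return 1 + count_nodes(tree.left) + count_nodes(tree.right)


def run(long_lived_depth=12, short_lived_depth=6, iterations=200):
    """Keep one tree alive while churning through many short-lived ones."""
    start = time.perf_counter()
    long_lived = make_tree(long_lived_depth)   # stays reachable for the whole run
    churned = 0
    for _ in range(iterations):
        temp = make_tree(short_lived_depth)    # becomes garbage on the next iteration
        churned += count_nodes(temp)
    elapsed = time.perf_counter() - start
    return count_nodes(long_lived), churned, elapsed


if __name__ == "__main__":
    live, churned, elapsed = run()
    print(f"live nodes: {live}, churned nodes: {churned}, {elapsed:.2f} secs")
```

Nearly all the work in run() is allocating nodes and dropping them
again, which is why a program of this kind mostly measures the
allocator and collector rather than anything else.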

The Sun VM numbers are from a 200MHz PPro machine
(HP Vectra) running NT.  The other numbers are
from a 300MHz (faster bus?) PII (HP Kayak) running
Linux.  The second set of times was naively
multiplied by 1.5 to compensate for hardware
differences.

The numbers I get for overall time:
HotSpot		15 secs (though sometimes appreciably higher for the initial run. Why?)
Sun Classic	55 secs

Gcj (standard distr.)	23.4 * 1.5 = 35.1
Blackdown JDK 1.1.7	87.1 * 1.5 = 130.7
Kaffe (as distr with RH6) 93.2 * 1.5 = 139.8

(IBM's Linux JDK is a bit faster than gcj, by my
measurements.)
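
The 1.5 factor applied to the Linux numbers above is simply the
ratio of the two clock speeds (300MHz / 200MHz).  A minimal sketch
of that normalization follows.  The function and constant names are
invented for illustration, and it assumes that run time scales
linearly with clock speed, which ignores bus, cache and OS
differences.

```python
FAST_MHZ = 300   # PII (HP Kayak) running Linux
SLOW_MHZ = 200   # PPro (HP Vectra) running NT


def scale_to_slow_machine(seconds, fast_mhz=FAST_MHZ, slow_mhz=SLOW_MHZ):
    """Estimate the time on the slower machine from the clock ratio alone."""
    return seconds * fast_mhz / slow_mhz


# Measured times in seconds on the 300MHz machine, copied from the list above.
measured = [("Gcj", 23.4), ("Blackdown JDK 1.1.7", 87.1), ("Kaffe", 93.2)]

for name, secs in measured:
    print(f"{name}: {secs} * 1.5 = {scale_to_slow_machine(secs):.2f}")
```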

The remaining measurements are based on a C++
version of the benchmark, again adapted by me so
that it can optionally use our garbage collector.
These were again run under Linux on the faster
machine.  They use a not-yet-released version of
our collector, which is noticeably faster (20%?)
in incremental mode than the released one:

New/delete	18.4*1.5 = 27.6
GC, stop world	19.3*1.5 = 29.0
GC, incr	14.1*1.5 = 21.2
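
Since these three runs were all made on the same machine, they can
also be compared directly, without the 1.5 factor.  The sketch below
expresses each time as a fraction of the new/delete time.  The
dictionary and the function name are made up for illustration.

```python
# Unscaled times in seconds on the 300MHz machine, copied from the list above.
times = {"New/delete": 18.4, "GC, stop world": 19.3, "GC, incr": 14.1}


def relative_to(baseline, measurements):
    """Express each time as a fraction of the baseline's time."""
    base = measurements[baseline]
    return {name: secs / base for name, secs in measurements.items()}


for name, ratio in relative_to("New/delete", times).items():
    print(f"{name}: {ratio:.2f} of the new/delete time")
```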

If we give the collector a tiny bit of type
information (as gcj should, but may not), the last
two times both go down to between 12.2 and 12.5,
or about 18.5 secs after scaling to the 200 MHz
processor.

Memory use by our collector and by HotSpot was
similar.

If you look at the detailed results, it becomes
clear that HotSpot is faster for very short-lived
objects, and that our collector is faster for
longer-lived objects.  It is easy to construct
applications for which either one wins.  My guess
is that for typical applications, and in the
absence of static object lifetime analysis in gcj,
the HotSpot collector will usually win, but rarely
by a large amount.  

These are my opinions, not HP's.
If you don't like the results, please post yours.