[gclist] My copying collector or Boehm's?
Ji-Yong D. Chung
Fri, 16 Mar 2001 03:07:08 -0500
This question concerns a design decision:
whether to use Boehm's collector or my own copying collector.
I have written a Scheme interpreter in C++,
solely to manipulate XML parsing. I have not implemented
the XML parser yet, but I am planning to write one
as a C++ extension to the Scheme interpreter.
The extension's memory will be managed by the interpreter's
own garbage collector.
To get ready to write this extension, I have just
replaced my super simple collector
(which uses Cheney's algorithm)
with Boehm's collector. After a number of test runs
comparing both collectors, I cannot decide whether to
keep my original collector or use Boehm's.
With Boehm's collector, my interpreter runs about 2.5
times slower than before. This is not a knock on Boehm's
collector. The drop-off in performance was expected,
because (1) my original collector is turned on and off
at precise points in my C++ code to minimize
collection, (2) my collector uses type information all the time,
(3) it uses no locks for allocation, because it has a
separate heap for each thread, and (4) heap residency
was low for the test cases, which favors a copying
collector over mark-sweep.
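Point (3) above can be illustrated with a minimal sketch of a
per-thread bump allocator. This is not the poster's code: the
ThreadHeap class, current_heap() function, and the 1 MB heap size
are hypothetical names and values chosen for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-thread heap; the names are illustrative, not from the post.
class ThreadHeap {
public:
    explicit ThreadHeap(std::size_t bytes) : space_(bytes), top_(0) {}

    // Bump-pointer allocation. No lock is taken because only the owning
    // thread ever touches this heap.
    void* allocate(std::size_t bytes) {
        assert(bytes > 0);
        // First check also keeps the rounding below from overflowing.
        if (bytes > space_.size() - top_) return nullptr;
        std::size_t aligned = (bytes + kAlign - 1) & ~(kAlign - 1);
        if (aligned > space_.size() - top_) return nullptr;  // a real collector would collect here
        void* p = space_.data() + top_;
        top_ += aligned;
        return p;
    }

    std::size_t used() const { return top_; }

private:
    static constexpr std::size_t kAlign = alignof(std::max_align_t);
    std::vector<unsigned char> space_;
    std::size_t top_;
};

// One heap per thread: thread_local gives each thread its own instance,
// so allocation needs no synchronization.
inline ThreadHeap& current_heap() {
    thread_local ThreadHeap heap(1 << 20);  // assumed size: 1 MB
    return heap;
}
```

An allocator shared by all threads has to synchronize somewhere on
the allocation path, which is one plausible source of the gap
described above.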
Of course, Boehm's collector offers things
other than raw performance. First of all, it is written
for C/C++, so if I write XML extensions in C++,
Boehm's collector fits in nicely with my C++ implementations.
My collector, on the other hand, is a copying collector,
and it is not generalized for C/C++. If I write any Scheme
extensions in C++, I must "protect" local C++
variables that might otherwise lose their pointers to Scheme objects
(again, implemented in C++) when the garbage collector moves them.
This makes writing C++ code
more painful than it would be if I were using
Boehm's collector.
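The "protect" burden can be sketched as a scoped guard that
registers a local variable with the collector. Everything here is
an assumption made for illustration: the Object type, root_stack(),
the Protect class, and simulate_move() (a stand-in for a real
collection, with no forwarding pointers) are not from the post.

```cpp
#include <cassert>
#include <deque>
#include <vector>

struct Object { int value; };  // hypothetical Scheme object

// Hypothetical per-thread root set: addresses of local pointer variables
// that the collector must update when it moves objects.
inline std::vector<Object**>& root_stack() {
    thread_local std::vector<Object**> roots;
    return roots;
}

// Scoped guard: registers a local pointer on construction and
// unregisters it on scope exit. Guards must be destroyed in LIFO order.
class Protect {
public:
    explicit Protect(Object*& slot) : slot_(&slot) { root_stack().push_back(slot_); }
    ~Protect() {
        assert(root_stack().back() == slot_);
        root_stack().pop_back();
    }
    Protect(const Protect&) = delete;
    Protect& operator=(const Protect&) = delete;

private:
    Object** slot_;
};

// Stand-in for a collection: copies every rooted object to a new address
// and patches the registered local variable to point at the copy.
// std::deque is used because push_back keeps earlier elements in place.
inline void simulate_move(std::deque<Object>& to_space) {
    for (Object** slot : root_stack()) {
        to_space.push_back(**slot);
        *slot = &to_space.back();
    }
}
```

Every local that holds a Scheme object across a possible collection
needs such a guard. A conservative collector like Boehm's scans the
C++ stack itself, so no guards are needed.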
Secondly, for each class, I must implement
a static "Move" function (for copying one
object from FromSpace to ToSpace). Writing
support structures to dispatch this at high speed makes
my code much more complex (and ugly).
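One common way to dispatch a per-class "Move" quickly is a table of
function pointers indexed by a type tag stored in each object. The
sketch below assumes that design; the Tag enum, Header, Fixnum,
Pair, kMoveTable, and MoveObject are hypothetical and do not
describe the poster's actual classes.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical type tags, one per heap-allocated class.
enum Tag : unsigned { kFixnum = 0, kPair = 1, kTagCount };

struct Header { Tag tag; };  // first member of every heap object

struct Fixnum {
    Header h;
    long value;
    // Copies the object into ToSpace and reports how many bytes it used.
    static std::size_t Move(const Header* from, void* to) {
        new (to) Fixnum(*reinterpret_cast<const Fixnum*>(from));
        return sizeof(Fixnum);
    }
};

struct Pair {
    Header h;
    Header* car;
    Header* cdr;
    // Shallow copy; a Cheney scan would later forward car and cdr.
    static std::size_t Move(const Header* from, void* to) {
        new (to) Pair(*reinterpret_cast<const Pair*>(from));
        return sizeof(Pair);
    }
};

using MoveFn = std::size_t (*)(const Header*, void*);

// Dispatch table indexed by tag: one array lookup and one indirect call.
constexpr MoveFn kMoveTable[kTagCount] = { &Fixnum::Move, &Pair::Move };

inline std::size_t MoveObject(const Header* from, void* to) {
    assert(from->tag < kTagCount);
    return kMoveTable[from->tag](from, to);
}
```

The cost is that every new class needs a tag, a Move function, and
a table entry, which is the extra complexity described above.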
I am having a difficult time deciding whether to
use Boehm's collector because (1) on one
hand, I have read that I am generally supposed to
sacrifice performance for design improvement, but
(2) on the other hand, with Boehm's collector,
my interpreter runs 2.5 times as slow, which
may be too much to sacrifice.
In summary, using Boehm's collector
would simplify C++ code writing, but it would make
my code regrettably slower.
Which one should I use? I have been banging
my head against walls for a few
days. I would appreciate
any comments or insights
that would alleviate my headache.
P.S. In case anyone asks what my
"user requirements" are: because I am writing a "new"
type of application, I am not sure at this point that there are
any. There is some chance that performance
will become an issue, so I have always tried not
to be ad hoc in my design decisions.