[gclist] My copying collector or Boehm's?

Ji-Yong D. Chung virtualcyber@erols.com
Fri, 16 Mar 2001 03:07:08 -0500


    Hi,

    This question is regarding a design decision,
whether to use Boehm's collector or my copying
collector.

    I have written a Scheme interpreter in C++,
solely to manipulate XML parsing.  I have not implemented
the XML parser yet, but I am planning to write one,
as a C++ extension to the Scheme interpreter.
Its memory management will be done by the garbage collector for 
the interpreter itself.

    To get ready to write this extension, I have just
replaced my super simple collector 
(which uses Cheney's algorithm)
with Boehm's collector.  After a number of test runs 
comparing both collectors, I cannot decide whether to 
keep my original collector or use Boehm's.
    
    With Boehm's collector, my interpreter runs about 2.5
times slower than before.  This is not a knock on Boehm's 
collector.  The drop-off in performance was expected,
because (1) my original collector is turned on and off
at precise points in my C++ code to minimize
collection, (2) my collector uses type information all the time,
(3) it uses no locks for allocation, because it has a
separate heap for each thread, and (4) heap residency
was low for the test cases, which favors a copying
collector over mark-sweep.
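
    For readers who want points (3) and (4) made concrete, here is
a minimal sketch of a Cheney-style semispace heap.  It is an
illustration under stated assumptions, not the collector described
in this post: the Pair/Slot object model, SemiSpaceHeap, make_num,
make_ref and thread_heap are all hypothetical names.  Allocation is
a pointer bump with no lock, and collection touches only live
objects, which is why low heap residency favors copying.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical object model: every heap object is a pair of slots, and a
// slot holds either a pointer to another pair or a plain integer.
struct Pair;
struct Slot {
    Pair* ptr;  // non-null: reference to a heap pair
    long  num;  // payload when ptr is null
};
struct Pair {
    Slot  car, cdr;
    Pair* forward;  // non-null once this pair has been copied
};

inline Slot make_num(long n)  { Slot s; s.ptr = nullptr; s.num = n; return s; }
inline Slot make_ref(Pair* p) { Slot s; s.ptr = p;       s.num = 0; return s; }

class SemiSpaceHeap {
public:
    explicit SemiSpaceHeap(std::size_t pairs)
        : from_(pairs), to_(pairs), top_(0) {}

    // Bump allocation: no free list and no lock. Returns nullptr when the
    // space is full; the caller is expected to collect and retry.
    Pair* alloc(Slot car, Slot cdr) {
        if (top_ == from_.size()) return nullptr;
        Pair* p = &from_[top_++];
        p->car = car; p->cdr = cdr; p->forward = nullptr;
        return p;
    }

    // Cheney's algorithm: copy the roots, then scan the copies in order,
    // copying whatever they reference. Dead objects are never visited.
    void collect(const std::vector<Pair**>& roots) {
        std::size_t scan = 0, next = 0;
        for (Pair** r : roots) *r = copy(*r, next);
        while (scan < next) {
            Pair& p = to_[scan++];
            p.car.ptr = copy(p.car.ptr, next);
            p.cdr.ptr = copy(p.cdr.ptr, next);
        }
        from_.swap(to_);  // swapping keeps element addresses valid
        top_ = next;
    }

    std::size_t used() const { return top_; }

private:
    // Copy one object into to-space, or return its existing copy.
    Pair* copy(Pair* p, std::size_t& next) {
        if (!p) return nullptr;
        if (!p->forward) {
            assert(next < to_.size());
            Pair* q = &to_[next++];
            q->car = p->car; q->cdr = p->cdr; q->forward = nullptr;
            p->forward = q;
        }
        return p->forward;
    }

    std::vector<Pair> from_, to_;
    std::size_t top_;
};

// One heap per thread (assumed size), so alloc() needs no synchronization.
inline SemiSpaceHeap& thread_heap() {
    thread_local SemiSpaceHeap heap(1024);
    return heap;
}
```

    After a collection, used() reports only the pairs reachable
from the roots, and every root has been rewritten to the new
address of its object.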

    Of course, Boehm's collector offers things
other than raw performance.  First of all, it is written
for C/C++, so that if I write XML extensions in C++,
Boehm's collector fits in nicely with my C++ implementation
of Scheme.

    My collector, on the other hand, is a copying collector,
and it is not generalized for C/C++.  If I write any Scheme
extensions in C++, I must "protect" local C++
variables that might otherwise lose their pointers to Scheme objects
(again, implemented in C++) when the garbage collector moves them.
This makes writing C++ code
more painful than it would be if I were using
Boehm's collector.
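
    To illustrate what "protecting" a local involves, here is a
rough sketch using an RAII guard and a per-thread shadow stack.
The names Object, root_stack, Protect and simulate_move are
hypothetical and not taken from the interpreter; simulate_move
merely stands in for a moving collector.

```cpp
#include <cassert>
#include <vector>

struct Object { int value; };  // stands in for any Scheme object

// Hypothetical shadow stack: addresses of C++ locals that hold Scheme
// pointers, so a moving collector can find and update them.
inline std::vector<Object**>& root_stack() {
    thread_local std::vector<Object**> roots;
    return roots;
}

// RAII guard: registers a local on construction, unregisters on scope exit.
// Guards must be destroyed in reverse order of construction, which C++
// scoping guarantees for automatic variables.
class Protect {
public:
    explicit Protect(Object*& local) { root_stack().push_back(&local); }
    ~Protect() {
        assert(!root_stack().empty());
        root_stack().pop_back();
    }
    Protect(const Protect&) = delete;
    Protect& operator=(const Protect&) = delete;
};

// Stand-in for the collector moving one object: every registered root
// that pointed at old_addr is redirected to new_addr.
inline void simulate_move(Object* old_addr, Object* new_addr) {
    for (Object** slot : root_stack())
        if (*slot == old_addr) *slot = new_addr;
}
```

    A guarded local follows the object to its new address; an
unguarded one is left dangling at the old address, which is the
bug that every extension function has to avoid by hand.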

    Secondly, for each class, I must implement
a static "Move" function (for copying one
object from FromSpace to ToSpace).  Writing
support structures to dispatch this at high speed makes
my code much more complex (and ugly).
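
    One way such a dispatch could look is a table of function
pointers indexed by a type tag stored in each object's header.
This is only a sketch under assumptions: the tag layout, the two
example classes, kMoveTable and MoveObject are hypothetical, and
the real Move functions would also have to deal with forwarding
pointers and variable-sized objects.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical type tags; kTagCount must stay last.
enum TypeTag { kFixnum = 0, kString = 1, kTagCount };

// Every object starts with a header carrying its tag.
struct Header { TypeTag tag; };

struct Fixnum {
    Header h;
    long   value;
    // Copies the object from FromSpace to ToSpace; returns bytes copied.
    static std::size_t Move(const void* from, void* to) {
        std::memcpy(to, from, sizeof(Fixnum));
        return sizeof(Fixnum);
    }
};

struct String {
    Header      h;
    std::size_t length;
    char        chars[16];  // fixed capacity keeps the sketch simple
    static std::size_t Move(const void* from, void* to) {
        std::memcpy(to, from, sizeof(String));
        return sizeof(String);
    }
};

// Dispatch table: one entry per tag, in tag order.
typedef std::size_t (*MoveFn)(const void*, void*);
static const MoveFn kMoveTable[kTagCount] = { &Fixnum::Move, &String::Move };

// Reads the tag from the header and calls the matching Move function.
inline std::size_t MoveObject(const void* from, void* to) {
    TypeTag tag = static_cast<const Header*>(from)->tag;
    assert(tag >= 0 && tag < kTagCount);
    return kMoveTable[tag](from, to);
}
```

    The table lookup is a single indexed load, but every new class
needs a tag, a Move function and a table entry kept in the same
order, which is the kind of bookkeeping that adds to the
complexity.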

    I have a difficult time deciding whether to
use Boehm's collector or not because (1) on one
hand, I have read that I am generally supposed to
sacrifice performance for design improvement;
(2) on the other hand, with Boehm's collector,
my interpreter runs 2.5 times as slow, which
may be too much to sacrifice.

    In summary, using Boehm's collector
would simplify C++ code writing, but it would make
my code regrettably slower.

    Which one should I use?  I have been banging
my head against walls for a few days.  I would
appreciate any comments or insights that would
alleviate my headache.


P.S.    In case anyone asks what my
"user requirements" are:  because I am writing a "new"
type of application, I am not sure at this point whether there are
any.  There is some chance that performance
will become an issue, so I have been trying not
to be ad hoc in my design decisions.