[gclist] Advice requested

David Chase chase@world.std.com
Tue, 27 Apr 1999 14:52:05 -0400


I meant to reply earlier, but Hans beat me to it.  This is another
view on the use of a conservative collector.

At 02:47 PM 4/27/99 +0100, Stephen Biggs wrote:
>o Our product works on several platforms (Windows plus Unix flavours,
>largely common-coded) 

>o COBOL allows object pointers to be unaligned

??? Do you mean unaligned pointers (the fields that hold them), or
unaligned objects (the pointer values themselves)?

>o We have to support multi-threading AND:

>o Existing procedural programs that have some OO extensions

>o Existing OO programs that have no knowledge of garbage collection

>o Programs that communicate with other languages either within process or
>(using COM or CORBA) outside the (easily) garbage collected environment.

-----------------------------------

>Is a Mark-Sweep garbage collector, given the restrictions above, even
>possible? Can anyone point me to a successful language or product that does
>such a thing?

Possible, though this is about as tough as anything
I've ever heard of.  I would try the Boehm-Weiser collector
if I were you.

>How expensive, generally, is true "conservative" garbage collection under
>these conditions?

It's not that bad.  The unaligned pointers are definitely a problem,
though: the B-W collector copes, but they make everything worse, since
a candidate pointer can then start at any byte rather than at any word.
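
To make that cost concrete, here is a minimal sketch of a conservative
scan over a block of memory.  It is not B-W's actual code: the names
heap_range and count_candidates, and the idea of a single contiguous
heap range, are assumptions made up for illustration.  The only thing
that changes between the aligned and unaligned cases is the stride.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical heap bounds; a real collector tracks its own heap sections. */
typedef struct {
    uintptr_t lo;   /* first heap address */
    uintptr_t hi;   /* one past the last heap address */
} heap_range;

/* Count the words in [buf, buf+len) that look like heap pointers,
   testing one candidate every `stride` bytes. */
size_t count_candidates(const unsigned char *buf, size_t len,
                        size_t stride, heap_range heap)
{
    size_t found = 0;

    assert(stride > 0);
    for (size_t off = 0; off + sizeof(uintptr_t) <= len; off += stride) {
        uintptr_t word;
        memcpy(&word, buf + off, sizeof word);  /* safe unaligned load */
        if (word >= heap.lo && word < heap.hi)
            found++;
    }
    return found;
}
```

With a stride of sizeof(uintptr_t), a pointer stored at an odd offset
is missed entirely.  With a stride of 1 it is found, at the price of
sizeof(uintptr_t) times as many tests and correspondingly more chances
for a non-pointer to be mistaken for one.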

>I'm also aware that garbage collection algorithms exist for some that have
>nearly all the restrictions. C/C++ is the nearest analogous language I can
>see for OO COBOL, but the algorithms that I've seen for that don't include
>binding to other languages

The advantage of a conservative collector is that it doesn't
really care what the "language" is -- bits is bits.

>or distributed garbage collection. Do any exist?

Ouch.

>Also, the only C++ garbage collectors I've seen don't cope well with a
>multi-threaded environment and aren't guaranteed to be 100% correct. I'm
>told by our Unix programmers here that thread suspension, for a start, is
>out (so for Unix the option would be to ask the programmer to call garbage
>collection methods manually, somehow!).

Untrue.  There are solutions to this problem, though they are
semi-nasty.

>My Own Opinion:
>Is that, if garbage collection is required, reference counting is the best
>bet, given our restrictions (and the fact that we have control over our
>compiler!). Second would be to not implement any collector at all. Third
>would be some form of Mark-Sweep collector, which would imply severe
>restrictions on the use of OO COBOL. However, there are some here that are
>very keen on Mark-Sweep.

>Can people out there generally recommend any hybrid approaches that I can
>take a look at?

It seems to me that you must have some control of your object layout,
if you have the option of using reference-counting.  Therefore, I would
consider modifying Boehm-Weiser in the following way:

  Stick something in your object header that records the location of
  the pointers.  Use this information in the collector to improve
  your odds against unaligned pointers and/or unaligned objects.
  (This won't make your collection any faster, at least in the aligned
  case; it's been tried.)
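
A minimal sketch of that idea, under stated assumptions rather than as
a patch to B-W: the header layout (obj_header, a count plus a table of
byte offsets), the MAX_PTR_FIELDS limit, and the function
object_pointers are all hypothetical.  The scanner reads only the
fields the header names, and uses memcpy so that neither the object nor
its pointer fields need to be aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PTR_FIELDS 8   /* arbitrary limit for this sketch */

/* Hypothetical header the compiler would emit at the start of each object. */
typedef struct {
    uint16_t n_ptrs;                      /* number of pointer fields */
    uint16_t ptr_offset[MAX_PTR_FIELDS];  /* byte offsets from the object base */
} obj_header;

/* Copy the non-null pointer fields of the object at `obj` into `out`
   (at most `out_cap` of them) and return how many were stored. */
size_t object_pointers(const unsigned char *obj,
                       uintptr_t *out, size_t out_cap)
{
    obj_header hdr;
    size_t n = 0;

    memcpy(&hdr, obj, sizeof hdr);        /* the object may be unaligned */
    assert(hdr.n_ptrs <= MAX_PTR_FIELDS);

    for (size_t i = 0; i < hdr.n_ptrs && n < out_cap; i++) {
        uintptr_t p;
        memcpy(&p, obj + hdr.ptr_offset[i], sizeof p);  /* so may the field */
        if (p != 0)
            out[n++] = p;   /* a collector would mark the target here */
    }
    return n;
}
```

The collector would still have to find object bases conservatively
(from stacks, registers, and foreign data), but once it has a base it
no longer needs to guess which bytes inside the object are pointers.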

You may also need some header information for distributed GC.  I've
not done this (I HAVE ported B-W GC, including to a machine with
threads) so I am not 100% sure what is entailed, and I am not sure
what you do in a distributed uncooperative system (look for another
job, perhaps).

Note, too, that the B-W GC allows you to manually reclaim storage,
which is a big win in some environments.  If you felt like it, you
could augment it with reference-counting, though I doubt that would
be a performance win.  Still, you have the option: you can try the
experiment, and keep it if it helps.  The memory allocator embedded
within that collector is first-rate; there is no reason not to use
it, even without the GC.
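
For anyone who wants to try that experiment, here is a minimal sketch
of reference counts layered over explicit deallocation.  The rc_*
names and the header layout are hypothetical.  RC_ALLOC and RC_FREE are
placeholders mapped to malloc and free so that the sketch builds on its
own; with the collector linked in they would be GC_MALLOC and GC_FREE
from gc.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Placeholders so the sketch builds standalone.  With the collector
   linked in, these would be GC_MALLOC and GC_FREE from gc.h. */
#define RC_ALLOC(n) malloc(n)
#define RC_FREE(p)  free(p)

/* Hypothetical count kept just in front of the payload; the union
   keeps the payload suitably aligned for any type. */
typedef union {
    size_t      refs;
    max_align_t align;
} rc_header;

/* Allocate a payload with an initial reference count of 1. */
void *rc_alloc(size_t payload_size)
{
    rc_header *h = RC_ALLOC(sizeof *h + payload_size);
    if (h == NULL)
        return NULL;
    h->refs = 1;
    return h + 1;
}

/* Take one more reference to the payload. */
void rc_retain(void *payload)
{
    rc_header *h = (rc_header *)payload - 1;
    h->refs++;
}

/* Drop one reference; returns 1 if the storage was handed back. */
int rc_release(void *payload)
{
    rc_header *h = (rc_header *)payload - 1;
    assert(h->refs > 0);
    if (--h->refs > 0)
        return 0;
    RC_FREE(h);
    return 1;
}
```

Objects caught in a cycle never reach a count of zero, so in this
arrangement they would be left for the collector to find.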

>Finally, if you're prepared to give an opinion, what would you do?

Boehm-Weiser, given your constraints.  Conservative collection gives
some people the willies, but it is certainly adequate for any
non-adversarial environment.  (I wouldn't use it for a system that
loads possibly hostile code, such as Java, though some Java systems
do in fact use it.)  Nothing else comes close to working as well
given the constraints that you describe.

David Chase