[gclist] Advice requested

Stephen Biggs stephen.biggs@merant.com
Tue, 27 Apr 1999 14:47:44 +0100


Hi all,

I hope that this list is current, and that someone here may be able to point
me in a suitable direction. The web sites and papers I've read are very
interesting, but I'm not sure they apply directly to my real-world problem
:-)

Background:
The currently proposed next ISO COBOL standard (reasonably close to final
ratification) includes OO syntax within the language.
As the standard has been under discussion for several years, this isn't new:
the company that I work for (Merant) has for some time shipped a COBOL
compiler and run-time that support an OO syntax.

However, the standard implies (clarity not being its strong point) that a
garbage collector is expected to be running, or at the very least available,
to collect "dead" OO objects. (Explicit memory allocation/deallocation is
largely foreign to the COBOL programmer, although so is OO, admittedly.)
Our current implementation does not include a garbage collector (objects are
collected 'manually' by the programmer, using a predefined method/message).

Now, we've tried a reference-counted approach, but the original work on it
was abandoned due to various factors: mainly architectural ones, but also the
fact that existing OO COBOL programs, such as those using our class
libraries, are usually expected to run unchanged (i.e. without even being
recompiled!). This doesn't mean that the method won't be revived, but if it
is, it will be via a different architectural approach.
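
To illustrate why unrecompiled code is awkward for reference counting, here is
a minimal sketch in Python (not COBOL, and not our actual run-time). All names
in it (Obj, Slot, counted_store, legacy_store) are hypothetical. It assumes
that counts are maintained by code the compiler emits at every reference
store, so a store emitted by an older compiler goes uncounted:

```python
class Obj:
    """Hypothetical heap object carrying an explicit reference count."""
    def __init__(self, name):
        self.name = name
        self.refcount = 0
        self.freed = False


def retain(obj):
    obj.refcount += 1


def release(obj):
    obj.refcount -= 1
    if obj.refcount == 0:
        obj.freed = True  # stand-in for returning the memory to the heap


class Slot:
    """A variable that can hold an object reference."""
    def __init__(self):
        self.target = None


def counted_store(slot, obj):
    """Reference store as a counting-aware compiler might emit it (assumption)."""
    if obj is not None:
        retain(obj)
    old, slot.target = slot.target, obj
    if old is not None:
        release(old)


def legacy_store(slot, obj):
    """Reference store from an older, unrecompiled program: no counting."""
    slot.target = obj


a = Obj("a")
new_code, old_code = Slot(), Slot()
counted_store(new_code, a)     # counted reference
legacy_store(old_code, a)      # second reference, invisible to the count
counted_store(new_code, None)  # count drops to zero, object is freed
dangling = old_code.target is a and a.freed
```

In this toy model the old program is left holding a reference to a freed
object, which is the kind of failure any scheme relying on compiler-inserted
counting would have to rule out for existing binaries.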

We've now come to consider a Mark-Sweep collector, and it has occurred to
me (at least) that this approach may not be workable. Why? Well:
o Our product works on several platforms (Windows plus Unix flavours,
  largely common-coded)
o COBOL allows object pointers to be unaligned
o We have to support multi-threading AND:
    o Existing procedural programs that have some OO extensions
    o Existing OO programs that have no knowledge of garbage collection
    o Programs that communicate with other languages either within process
      or (using COM or CORBA) outside the (easily) garbage collected
      environment.
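
To make the unaligned-pointer point concrete, here is a minimal sketch in
Python of the scanning step of a conservative mark phase. It is an
illustration only: the 8-byte little-endian pointer size, the heap addresses,
and the function name candidate_pointers are all assumptions, and a real
collector would scan stacks and data areas rather than a byte string.

```python
import struct

WORD = 8  # assumed pointer size in bytes


def candidate_pointers(memory, heap_objects, aligned_only):
    """Scan `memory` for values that equal a known object start address.

    Returns (set of addresses found, number of positions probed).
    """
    step = WORD if aligned_only else 1
    found = set()
    probes = 0
    for offset in range(0, len(memory) - WORD + 1, step):
        (value,) = struct.unpack_from("<Q", memory, offset)
        probes += 1
        if value in heap_objects:
            found.add(value)
    return found, probes


heap = {0x1000, 0x2000}

# A record with a 3-byte field, then an unaligned pointer to 0x2000,
# then 5 bytes of padding (16 bytes in total).
record = b"ABC" + struct.pack("<Q", 0x2000) + bytes(5)

aligned_found, aligned_probes = candidate_pointers(record, heap, True)
unaligned_found, unaligned_probes = candidate_pointers(record, heap, False)
```

A word-aligned scan probes only offsets 0 and 8 and misses the pointer, so the
object at 0x2000 would be wrongly swept. Probing every byte offset finds it,
at the cost of WORD times as many probes over a large region and more chances
for an arbitrary bit pattern to be mistaken for a pointer.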

The limited investigation done so far (single-threaded, bound environment)
has shown that performance could also become a problem once these further
restrictions are taken into account. At the moment, the Mark-Sweep we have
running is little better than manual reclamation of dead objects on simple
testbeds. Given that our OO COBOL implementation is based on a dynamic
model, rather than being statically linked, performance is probably the
biggest issue.

Advice Requested:
(NB: Forget that this is a commercial decision - I'm *not* asking anyone to
make that for us! All I'm really asking for is a pointer to more information
to enable us to make the decision, and perhaps to garner a few external
opinions about garbage collection in general along the way. Then we'll make
our own mistakes :-) )

Is a Mark-Sweep garbage collector, given the restrictions above, even
possible? Can anyone point me to a successful language or product that does
such a thing?

How expensive, generally, is true "conservative" garbage collection under
these conditions?

I'm aware that some languages without those restrictions exist and
implement Mark-Sweep very well. These could be considered "bound" or
self-contained OO environments, since they are almost all "new", "pure" OO
languages that operate alone, such as Java. (Although I know the COM
extensions for Java mean that some special rules are introduced.)

I'm also aware that garbage collection algorithms exist for some languages
that share nearly all of these restrictions. C/C++ is the nearest analogue I
can see for OO COBOL, but the algorithms that I've seen for it don't cover
binding to other languages or distributed garbage collection. Do any exist?

Also, the only C++ garbage collectors I've seen don't cope well with a
multi-threaded environment and aren't guaranteed to be 100% correct. I'm
told by our Unix programmers here that thread suspension, for a start, is
out (so for Unix the option would be to ask the programmer to call garbage
collection methods manually, somehow!).

My Own Opinion:
If garbage collection is required, reference counting is the best bet,
given our restrictions (and the fact that we have control over our
compiler!). Second would be to not implement any collector at all. Third
would be some form of Mark-Sweep collector, which would imply severe
restrictions on the use of OO COBOL. However, there are some here who are
very keen on Mark-Sweep.

Can people out there generally recommend any hybrid approaches that I can
take a look at?

Finally, if you're prepared to give an opinion, what would you do?

Many thanks for any information,
Stephen.