[gclist] Re: gclist-digest V1 #42

Robert A Duff bobduff@world.std.com
Fri, 19 Apr 1996 15:50:54 -0400


David Chase <chase@centerline.com> writes:

> for <many iterations> do
>   construct a big hairy graph of heap-allocated nodes from scratch
>   manipulate it a lot
>   drop it on the floor
> end
> 
> To test for the existence of a garbage collector, there has to be 
> a small number of iterations after which the resource consumption 
> of the program grows very slowly, if at all.

Sure, that's fine in practice.  But it seems to me that language
definitions have to be more precise than that.  I don't think it's
feasible to define what you called "very slowly", above, for example.
Also, most languages don't have primitives for measuring "resource
consumption", at least not anything that's precisely defined.  No, the
only thing you can do is use up some resources, and see if you run out.
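
To make that concrete, here's roughly the sort of test I think David
has in mind, written as C++ (a sketch only -- the node type, counts,
and sizes are all invented for illustration, and under something like
the Boehm collector you'd simply never call delete):

    struct Node {
        Node* next;
        char payload[1000];
    };

    Node* build_hairy_graph(int n) {
        Node* head = 0;
        for (int i = 0; i < n; i++) {
            Node* p = new Node;     // deliberately never deleted
            p->next = head;
            head = p;
        }
        return head;
    }

    int main() {
        for (int iter = 0; iter < 100000; iter++) {
            Node* g = build_hairy_graph(10000);  // roughly 10 MB per iteration
            // ... manipulate it a lot ...
            g = 0;   // drop it on the floor; a collector may now reclaim it
        }
        // Without GC this eventually runs out of memory; with GC it
        // should finish.  But "eventually" depends on how much memory
        // the machine has, which is exactly the problem.
        return 0;
    }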

Suppose the above crashes during the first iteration.  That does *not*
prove that the system has no gc -- maybe the system just doesn't have
enough memory.  Suppose it doesn't crash after "many" iterations.  That
doesn't prove it *does* have gc -- maybe it just has lots of memory, and
maybe if you had been willing to wait for one more iteration, it would
have crashed.  In practice, you can know how much memory is on a
particular machine, and write the above test accordingly, but *language*
*definitions* don't normally say anything about how much memory is
available.

I'm being pedantic here, of course.  But isn't that what language
definitions are supposed to do?  A lot of people want language
definitions to be written in some formal notation.  It's not clear to me
how to say, formally, or even semi-formally, that the above test has to
pass.

One way to get language implementers to support gc is to *not* put any
manual storage-freeing operation in the language.  For a language like
C++, I don't think the right approach is to put words in the language
standard.  The right approach is to produce enough implementations that
support gc that everybody is eventually forced to do so, or go out of
business.  In other words, a de-facto standard is more appropriate for
this.  Once people actually *use* gc (if it's any good), they tend to
*like* it.

> > IMHO, gc is an implementation issue.
> 
> I disagree, for two reasons.  The degree of difference between GC'd
> and not GC'd is extraordinary -- without GC, there are many programs 
> that will simply not run, because they run out of memory. That is, 
> the amount of time that you can run the program is bounded by how
> much memory you have.  The existence of GC removes that linkage. 

Shrug.  It seems no different from efficiency.  There are C
implementations that run several orders of magnitude faster than other C
implementations.  (Mainly because of hardware differences, but also
software differences.)  But you won't find anything in the C standard
telling you how long it takes to add two numbers.  The fact that it's a
big important issue doesn't make it a language issue.

> Second, with GC, you write different programs and libraries, because 
> you don't need to fool with memory allocation.

Sure.  I also write code differently for a brain-damaged segmented
architecture with 640K address space, than I do for Linux with lots of
memory.  That doesn't mean the amount of memory available is a
*language* issue.
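
That said, the interface difference David is pointing at is real
enough.  A sketch of the kind of thing he means (the function names
are mine, invented for illustration):

    #include <string.h>

    // Without GC, ownership has to be part of every interface.
    // Here the caller is obliged to delete[] the result.
    char* concat(const char* a, const char* b) {
        char* s = new char[strlen(a) + strlen(b) + 1];
        strcpy(s, a);
        strcat(s, b);
        return s;       // the documentation had better say who frees this
    }

    void use_it() {
        char* t = concat("foo", "bar");
        // ... use t ...
        delete[] t;     // forget this and you leak; do it twice and you crash
    }

    // With GC the function is written the same way, but the caller
    // (and the documentation) can simply forget about freeing -- which
    // is why whole libraries end up designed differently.

But again, that's a difference in how you program, not in what the
language definition says.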

> > To put gc in a *language definition*, one would need a model of storage
> > usage that is far more detailed than is typical.
> 
> I think it would be a useful exercise, and now that I think about 
> it, not incredibly difficult, either.

Yes, I agree it would be useful, and perhaps even feasible.  But I don't
think it's easy.  (Nobody's done it, that I know of.)  You need to allow
implementations freedom to do all kinds of weird things with storage,
which complicates the issue.  And you need to deal with all sorts of
computer architectures.  And you need to care about worst-case storage
usage, which may be useless if it's very far from the average case.  And
fragmentation seems like a big headache to define formally.

And if you want to allow conservative GC, the problem seems intractable,
since for any given conservative GC, you can write a program that makes
it leak arbitrary amounts of storage.  I have no idea how you would
model the probabilistic nature of conservatism.
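
For the record, here's the shape of the adversarial program I mean (a
sketch; it assumes a conservative collector such as Boehm's is doing
the reclamation, and that allocation actually goes through it):

    #include <stdint.h>
    #include <vector>

    // Keep the *addresses* of dead blocks around as plain integers.
    // A conservative collector can't tell an integer from a pointer,
    // so every one of these blocks stays "reachable" forever, and the
    // retained garbage grows without bound.
    std::vector<uintptr_t> lucky_numbers;

    int main() {
        for (int i = 0; i < 100000; i++) {
            char* big = new char[1000000];   // with Boehm, think GC_MALLOC
            lucky_numbers.push_back(reinterpret_cast<uintptr_t>(big));
            big = 0;   // the only genuine pointer is now gone
            // A precise collector could reclaim the block here; a
            // conservative one must assume the saved integer points to it.
        }
        return 0;
    }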

- Bob