[gclist] Concerns about GC

Kim Barrett kab@camellia.org
Sun, 29 Sep 1996 23:04:38 -0500


At 10:47 AM -0500 9/27/96, Pinku Surana wrote:
>	I am quite familiar with languages implemented with GC behind
>the scenes. My main development language is LISP and I've used Boehm's
>collector for some projects written in C. I've used various other
>systems and languages with a myriad of different implementations of
>GC (Smalltalk, ML, Scheme, etc).
>
>	Nevertheless, I am still of the opinion that GC should be used
>in a select number of application domains. Furthermore, any language
>that *requires* GC in the runtime system (either explicitly or
>implicitly) dooms the language to a small group of users.
>
>	The problem is that GC does not make economical use of a
>valuable resource: memory. The initial start-up requirement for most
>GC implementations is 8M of memory, though more often 16M. Every time a
>program requires more than 16M, the system gobbles more memory and
>never gives it up (I think Emacs returns free buffer space to the OS,
>though).

I don't know where your "most GC implementations" figure comes from; the three I
happen to have instantly handy (*) certainly don't have the problem you
describe.  I also wonder whether you have tried to determine actual
minimums for other programs using GC, have simply looked at some default
parameters and made (possibly invalid) assumptions based on those, or have
looked at applications which use GC and may have been tuned (possibly by
default) for a particular expected memory usage.

Regarding your complaint that the GC gobbles more memory and never gives it
back, that is again implementation-dependent (**), and I have encountered
more than one C malloc implementation against which exactly the same
complaint could be made.
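
For the skeptical, here is a minimal sketch of how to watch this happen,
assuming a Unix system whose malloc draws memory from the program break via
sbrk() (the details, of course, vary by allocator):

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(void)
    {
        char *chunks[1000];
        void *before, *grown, *after;
        int i;

        before = sbrk(0);                /* program break at startup */
        for (i = 0; i < 1000; i++)       /* grab ~4M in small pieces */
            chunks[i] = malloc(4096);
        grown = sbrk(0);                 /* break after allocating */
        for (i = 0; i < 1000; i++)       /* hand it all back to malloc */
            free(chunks[i]);
        after = sbrk(0);                 /* break after freeing */

        printf("break: start %p, grown %p, freed %p\n",
               before, grown, after);
        /* With many mallocs, "freed" equals "grown": the memory stays
           in the process rather than going back to the OS. */
        return 0;
    }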

* Three programs using GC which don't require a minimum of 8M of memory:

  1. Macintosh Common Lisp - the default partition size for the 68k
     implementation is 4M; it increased to around 5M in the PowerPC
     implementation.

  2. Apple Dylan applications.  The default application size is 2M, but that
     is known to not be the minimum application size.  (I don't remember what
     that minimum actually turned out to be, but there is at least one sample
     application on the Apple Dylan TR CD which runs in 1M, and several which
     require no more than 2M.)

  3. IS Robotics uses a GC'ed lisp dialect to program embedded processors
     used to control small autonomous robots.  These robots typically have no
     more (and sometimes less) than 1M of RAM, total.

** The Apple Dylan GC might return memory to the OS under some conditions.
I know the design included this feature, though I'm not sure it actually
got implemented.  (Moon might remember whether he had time to finish it,
but I don't know whether he reads this list.)  It also does not require all
of the memory it is dealing with to be in a small number of large
contiguous blocks (unlike some older collectors), being based on
"relatively" recent ideas from the GC community.  (I say relatively recent
because the design and most of the implementation were completed 3-4 years
ago.)
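
To give a flavor of the idea (my own illustration of the general technique,
not Apple Dylan's actual code): the heap is a chain of independently-obtained
blocks, so no large contiguous region is ever needed, and an emptied block
can be returned to the underlying allocator on its own:

    #include <stdlib.h>

    #define BLOCK_SIZE (64 * 1024)        /* one heap block, e.g. 64K */

    struct block {
        struct block *next;               /* blocks are chained, not contiguous */
        size_t        used;               /* bytes handed out from this block */
        char          data[BLOCK_SIZE];
    };

    static struct block *heap = NULL;     /* the whole heap: a linked list */

    /* Each block is a separate request to the underlying allocator, so
       an emptied block can later be returned on its own with free(). */
    static struct block *grow_heap(void)
    {
        struct block *b = malloc(sizeof *b);
        if (b == NULL)
            return NULL;
        b->used = 0;
        b->next = heap;
        heap = b;
        return b;
    }

    /* Bump-allocate from the newest block, adding a block on demand. */
    static void *gc_alloc(size_t n)
    {
        void *p;
        if (n > BLOCK_SIZE)               /* large objects need separate handling */
            return NULL;
        if (heap == NULL || heap->used + n > BLOCK_SIZE)
            if (grow_heap() == NULL)
                return NULL;
        p = heap->data + heap->used;
        heap->used += n;
        return p;
    }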

>	Though the speed of a program is an important measure of
>program performance, the space requirements are also important. For
>example, there are plenty of data structures that are fast for certain
>applications, but are infeasible because of the space
>requirements. These trade-offs are made all the time for real
>programs, but GC immediately chooses speed over space.

If you look at the GC literature (particularly some of recent vintage) it
will quickly become obvious that a lot of thought is being put into the
space overhead imposed by various collectors, because it can have such a
strong impact on performance.  Not just paging overhead; these people are
also looking closely at cache-level impact.  I don't know, off the top of my
head, the precise overheads of the various collectors that are around these days,
but I would not want to wager anything substantial on the overhead of the
better ones being worse than one would get by using a common C malloc + the
program-level extra stuff to explicitly keep track of when a chunk of
memory with non-trivial relationships to other chunks can be free'd.  And
of course, as I noted before, just because you tell C malloc that you are
done with some memory doesn't necessarily mean that some other program will
be able to make use of that memory.
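
By "program-level extra stuff" I mean bookkeeping along the following lines;
this is just a sketch of ordinary reference counting, mine and nobody's
program in particular:

    #include <stdlib.h>

    /* A chunk shared by several data structures: it may only be freed
       when the last owner lets go, so every owner must keep the count
       honest at every hand-off. */
    struct shared {
        int  refcount;                    /* per-object space overhead */
        char data[256];
    };

    static struct shared *make_shared(void)
    {
        struct shared *s = malloc(sizeof *s);
        if (s != NULL)
            s->refcount = 1;
        return s;
    }

    static struct shared *share(struct shared *s)
    {
        s->refcount++;                    /* every new owner must do this */
        return s;
    }

    static void release(struct shared *s)
    {
        if (--s->refcount == 0)           /* and this, exactly once, when done */
            free(s);
    }

Miss a share() and you free too early; miss a release() and you leak; and the
count itself is space overhead carried by every object.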

>	GC'ers claim, correctly, that this technology makes the
>construction of large systems easier, all those nasty memory leaks
>disappear. However, this savings for engineers comes at a cost to the
>end user by eating more of his memory. Furthermore, though one class
>of problems disappears (memory leaks), another is created: garbage that
>is trapped by badly written programs. Of course, the program still
>runs, but over time it consumes more and more space and the computer
>grows sluggish.

Um, isn't what you just described a memory leak?  And we've never seen a
program that used explicit memory allocation which had that kind of
problem, have we?
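
To make the symmetry concrete, here is a sketch of my own, with Boehm's
collector (which you mentioned using) standing in for "a GC"; it assumes
gc.h and its GC_INIT/GC_MALLOC interface:

    #include <stdio.h>
    #include <stdlib.h>
    #include <gc.h>                       /* Boehm's collector */

    static void *table[100];              /* a long-lived global table */

    int main(void)
    {
        char *p, *q;

        GC_INIT();

        /* Explicit allocation: losing the only pointer is a classic leak. */
        p = malloc(1024);
        p = malloc(1024);                 /* first block: unreachable, unfreeable */

        /* GC: the collector reclaims what is unreachable... */
        q = GC_MALLOC(1024);
        q = GC_MALLOC(1024);              /* first block will be collected */

        /* ...but a stale reference in a live structure pins garbage just
           as effectively as a forgotten free() does. */
        table[0] = GC_MALLOC(1024 * 1024);
        /* if the program never looks at table[0] again, that megabyte
           nevertheless stays live forever */

        free(p);
        return (q != NULL && table[0] != NULL) ? 0 : 1;
    }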