[gclist] Re: gclist-digest V2 #106

Michael Spertus mps@geodesic.com
Fri, 16 Jan 1998 13:32:33 -0600


At 11:05 AM 1/16/98 -0000, you wrote:
>
>gclist-digest         Friday, January 16 1998         Volume 02 : Number 106
>
>
>
>In this issue:
>
>    [gclist] Experience with conservative GC sought
>
>----------------------------------------------------------------------
>
>Date: 16 Jan 1998 09:41:51 +0100
>From: sperber@Informatik.Uni-Tuebingen.De (Michael Sperber [Mr. Preprocessor])
>Subject: [gclist] Experience with conservative GC sought
>
>I'm looking for accounts, whether from research or merely anecdotal, of
>experience with using conservative GC in long-running applications
>with large heaps and big changes in the size of the live data.
>
>Both the Emacs and XEmacs development teams have long-term plans to
>replace their Lisp engines.  One of the big design issues is whether
>to commit to conservative GC.  The C substrates of these two editors
>presently use precise GC annotations, which are a constant source of
>annoying bugs.  However, the present Emacsen have very small leakage,
>which typically allows one invocation to stay up for weeks or even
>months.  Losing this property would be very undesirable.
>
>Hans Boehm has provided me with his positive experience with the Cedar
>system, and a few other applications.
>
>My own experience is mainly with Scheme implementations.
>
>Rice's PLT suite of Scheme implementations uses the Boehm collector.  All
>of them leak heavily, even when the size of the live data demonstrably
>stays almost constant.  It is usually necessary to restart interactive
>sessions every hour or so because they have grown beyond 40 megabytes.
>
>scm uses its own conservative collector, which scans only the stack
>conservatively and does precise tracing for objects inside the heap.
>I have used scm to run programs with (comparatively) high rates of
>data allocation.  The experience has been similar to that with PLT: the
>heap grows far beyond the size of the live data and doesn't shrink
>back.  Moreover, as the size of the live data within a huge heap was
>small, fragmentation caused substantial thrashing; the OS would
>typically try to keep the whole heap in memory.  Note that GNU Guile
>uses scm's collector.
>
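
For readers who haven't seen the technique described above: an scm-style
collector treats only the stack conservatively and traces heap objects
precisely through their type tags.  A rough sketch of the idea
(illustrative only, assuming a simple tagged-object heap; the names
object_at, heap_lo, and heap_hi are hypothetical, and scm's real code
differs in the details):

    /* Sketch only.  The stack is scanned conservatively: any word
       whose value falls inside the heap is treated as a potential
       root.  Everything reachable from those roots is then traced
       precisely, using per-object type tags. */

    #include <stddef.h>

    enum tag { TAG_PAIR, TAG_NUMBER, TAG_STRING };

    typedef struct object {
        enum tag tag;
        int marked;
        struct object *car, *cdr;      /* only meaningful for TAG_PAIR */
    } object;

    /* Heap bounds and address-to-object lookup are assumed to exist. */
    extern char *heap_lo, *heap_hi;
    extern object *object_at(char *addr);

    static void mark_precisely(object *o)
    {
        if (o == NULL || o->marked)
            return;
        o->marked = 1;
        if (o->tag == TAG_PAIR) {      /* follow only genuine pointer fields */
            mark_precisely(o->car);
            mark_precisely(o->cdr);
        }
    }

    static void scan_stack_conservatively(char **sp, char **stack_base)
    {
        for (; sp < stack_base; sp++) {
            char *word = *sp;
            if (word >= heap_lo && word < heap_hi) {   /* looks like a heap pointer */
                object *o = object_at(word);
                if (o != NULL)
                    mark_precisely(o); /* from here on, tracing is exact */
            }
        }
    }

The imprecision is confined to the roots: once a real object is found,
only genuine pointer fields are followed, so data inside the heap cannot
cause further false retention.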

Our collector, Great Circle, usually does much better on space utilization
than the Boehm collector; in particular, we have added several important
space-reduction techniques to the Boehm code. Would there be an
opportunity for us to try Great Circle in these programs and see how it
does?

>The only published account I have of this kind of thing is:
>
>@Article{went90,
>  title =        "Pitfalls of Conservative Garbage Collection",
>  author =       "E. P. Wentworth",
>  journal =      "Software Practice and Experience",
>  publisher =    "Wiley",
>  volume =       "20",
>  number =       "7",
>  pages =        "719--727",
>  year =         "1990",
>  abstract =     "Researchers have recently proposed conservative
>                 garbage collection as a technique which allows smooth
>                 integration of automatic memory management into
>                 existing high-level languages. Conservative techniques
>                 carry the cost of some leakage --- they fail to reclaim
>                 all the available garbage. This paper reports on a
>                 conservative collector and measurements of its leakage.
>                 The results indicate that the loss depends critically
>                 on the application: a Lisp interpreter shows very
>                 small leakage (typically less than 5 per cent of the
>                 total heap space for classroom examples), whereas the
>                 garbage collector collapses completely when used with a
>                 KRC interpreter. Uncovering the reasons reveals that it
>                 is possible to write Lisp programs that also fail.
>                 Previous work has used very sparsely populated address
>                 spaces where the probability for leakage was minimized.
>                 This work considers small, densely populated address
>                 spaces, and probably represents a worst-case scenario.
>                 Although the conservative technique appears to be a
>                 cost-effective and clean way of integrating automatic
>                 memory management into an off-the-shelf host language,
>                 we are led to conclude that it is not reliable in all
>                 situations.",
>}
Overall, I agree with this conclusion. The interesting question is: where
is the dividing line between when conservative collection is enough and
when you need precise collection? Our experience is that you can do a lot
better with conservative collection than the above statistics suggest.
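
To make the failure mode concrete: leakage occurs when a word of
non-pointer data happens to have the same bit pattern as an address
inside the heap, so the collector must retain whatever it appears to
point to.  A contrived sketch, assuming the Boehm collector's gc.h API
(the situation is manufactured here, but integer-heavy or densely packed
data can produce it by accident):

    /* Contrived illustration of conservative-GC false retention.
       The integer "unlucky" is never used as a pointer, but if its
       bit pattern equals a heap address, a conservative collector
       cannot tell the difference and keeps the dead object alive. */

    #include <gc.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        GC_INIT();

        char *big = GC_MALLOC(10 * 1024 * 1024);   /* a 10 MB object */

        /* Applications often keep file offsets, hash codes, or packed
           flags on the stack; here we force the worst case by copying
           the object's address into a plain integer. */
        uintptr_t unlucky = (uintptr_t)big;

        big = NULL;        /* the object is now semantically garbage... */
        GC_gcollect();     /* ...but "unlucky" looks like a pointer to it,
                              so the 10 MB block is retained */

        printf("heap size after collection: %lu bytes\n",
               (unsigned long)GC_get_heap_size());

        return (int)(unlucky & 1);   /* keep "unlucky" live across the collect */
    }

Precise collection rules this out by construction; the question above is
how often real data does this by accident, and how much a conservative
collector can do to limit the damage when it does.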

Michael Spertus                            mps@geodesic.com
Geodesic Systems                           (312) 832-1221
414 N Orleans, Suite 410                   http://www.geodesic.com
Chicago, IL 60610