[gclist] GC topics

Jerry Leichter leichter@smarts.com
Tue, 20 Feb 1996 14:25:58 -0500


| >  -- does anyone have any practical numbers on how much more or less storage
| >     GC'ed environments use than non-GC'ed.  On the one hand, there's always
| >     some dead storage not collected yet, but on the other hand there's no
| >     storage leaks
| 
| No. I have no figures on the cost in terms of storage of GC, but we have a 
| complete performance analysis on the current implementation of the YAFL
| programming language, and the results are interesting: if we remove all
| reference to GC from the generated code, the total execution time of
| the system drops to 70-80% of its original execution time (of course,
| this is only for analysis purposes, such a system is barely usable, since
| it just allocates memory and then crashes after a while...).
| 
| So, if keeping track of accessible references at all times cost 20% of
| our execution time (which we consider as reasonable), what is the gain,
| that is, how efficient is a precise GC when compared with a 
| conservative one which must perform complex heuristics to determine
| whether a given memory area happens to be a pointer or not. Does anyone
| have any data on this ?
| 
| We consider this performance overhead acceptable, since it guarantees
| a quasi-deterministic behaviour and is utterly portable, since it
| does not rely on any platform-specific feature (YAFL has even been ported
| to IBM mainframes).
| 
| Any comment ?

Your number is interesting, but not as interesting as it could be.  The problem 
is in what you're comparing.  The system with no GC code also has no code to 
explicitly free memory, no code to compact memory (even to coallesce adjacent 
free blocks), perhaps even no code to search for an appropriate block.  (Without 
knowing what your GC does, we can't guess at that.  The allocator that goes 
along with a compacting or copying collector, for example, simply does a compare 
and an increment - it can assume that all free memory is in one contiguous 
block.)

If the "comparison" system did actual allocation and deallocation, it would cost 
more.  How much more depends on many details.  Most comparisons of systems with 
garbage collection to systems without it, but with decent allocators, show 
around a 10% cost (folk result).  That's a fairer comparison, but even so it's 
inherently suspect:  Code written assuming a garbage collector is organized 
differently from code that has to do explicit allocation and deallocation.  
Code that assumes GC is probably much more likely to allocate dynamic memory, 
since it's so easy:  LISP programmers don't go out of their way to avoid CONS 
unless they absolutely have to!  This may lead to code that does more allocation 
than is strictly necessary, making GC systems look slower.  On the other hand, 
code written without a GC, especially well-structured, modular code that has to 
be able to run safely in complex environments - i.e., just the sort of code we'd 
like to encourage! - may end up making private copies of data so as to avoid 
coupling through various "must delete when finished" constraints.  This can make 
*non*-GC code look slower.  How these two effects (and others) interact in any 
given program is impossible to predict, but either could easily swamp any 
"inherent" difference - whatever that means - between the GC and non-GC choices.

							-- Jerry