[gclist] Tradeoffs in collectors

Paul R. Wilson wilson@cs.utexas.edu
Thu, 21 Mar 1996 21:59:18 -0600


>From majordom@iecc.com Thu Mar 21 20:42:04 1996
>Date: Thu, 21 Mar 1996 18:25:41 -0800
>To: gclist@iecc.com
>Subject: [gclist] Tradeoffs in collectors
>
>I've seen a lot of work done on making GCs fast or incremental.  What
>kinds of systems are recommended for saving space (both data and
>code)?  

Reference counting can have an advantage here.   The really nice thing
about reference counting is that you get most garbage space back
*really* fast.  But then there's the problem of cycles.

>In my case, I'm writing a small DOS application that needs to fit in
>300k, so I want to make something really simple that doesn't have a
>lot of overhead per object.  I also don't have VM tricks to play with.
>A two-space copying collector is out due to space reasons.  :(
>
>I would expect that GCs for small embedded systems would be what I
>need:
>
>	- small data overhead
>	- small collector code size
>	- small address space (no VM)
>
>Unfortunately, I don't know what kinds of GCs, if any, are used for
>embedded systems.

I suspect that most embedded systems don't use GC's at all, rightly
or wrongly.

>BIBOP would probably save a lot of header space.

It depends.  Are all of your objects the same size?  Do you expect much
fragmentation-causing behavior?  (E.g., phase behavior where different
phases use different-sized objects?)  You might want an allocator
that does general coalescing if you can't count on the program
being fairly reasonable with respect to fragmentation.

How tight are your space bounds.  Is the 300K a must-not-exceed limit
(like a fixed amount of precious low memory---or whatever they call it for
a TSR?  Or does the app just have to be "generally" well-behaved and
"almost always" under 300KB?

> Which collectors
>have a simple implementation?  Are any of these good for interactive
>systems?  Overall speed is not a problem, but noticable delays might
>annoy people (Emacs comes to mind!).

If speed really isn't a problem, and you don't need real-time, a
simple mark-sweep or mark-compact GC seems reasonable.  You just
GC really often so that you keep the live/dead ratio up.  (E.g., leave
20% free space, 80% live, and trace 4 times as much as you allocate
between GC's.)

How fast is your processor?  You should be able to GC 300KB in a very
small period of time on a reasonable-speed processor;  it's likely
that nobody'd ever notice.  That's basically how the Newton GC works
(or used to work)---if you have a decent chip you can GC a few hundred
KB so fast that it's hard to notice.   Emacs GC's are more noticeable
because Emacs heaps are typically bigger, often much bigger.
(Also, as Henry points out, Emacs is silly enough to *tell* you it's
pausing, even when you'd otherwise not notice.)