[gclist] Language behaviour wrt GC (Was: Name that hypothesis)

Paul R. Wilson wilson@cs.utexas.edu
Tue, 10 Dec 1996 06:28:08 -0600


>On Dec 5, 11:34pm, Mark Tillotson wrote:
>
>> [ To back up a hypothetical situation with a real one, I will point
>>   out that I regularly use an interactive language system image of
>>   some 80MB total heap, and don't get any noticeable pauses for
>>   GCs (it's generational, non-incremental).  In the last 20 minutes or so
>>   I ran some code that allocated over 300MB of short-lived objects, and
>>   1.2% of the CPU time went on GC, in about 1200 pauses of 10ms each
>> ]
>>
>
>On the other hand, Barry Hayes made a number of measurements of the Xerox PARC
>Cedar system for which on the order of 10% of the bytes allocated during
>moderately long runs survived until the end of the run.

I think this is an important point, but we should recognize that
we're comparing apples and oranges here.

(Warning: the following is a "big picture" ramble that many of you
have heard most of before.  I thought it might be worth reiterating
if we're talking about basic effects of languages on style and
lifetimes.)

We're comparing a system with some kind of persistent object store
(even if it's just a simple system image save/reload) to one without.
If you don't have persistence in the language, you do it manually
through the file system, and long-lived objects "escape" the GC'd heap
and "come back" next time you start up and reload the data.
This has weird implications for generational GC.

We too have seen programs where a significant percentage of the data
survive until the end of the run.  (Ramps and big plateaus, as we
call them in our allocator survey).  But if you have a persistent
object store that's GC'd, those are data that typically become
garbage at the end of a program run, and can be collected any time
after the end of the run.

So if you use a persistent language with a generational GC (and enough
generations), you should see the generational principle working reasonably
well in most cases.  (I think.)  The big old persistent generation(s) will
be GC'd occasionally, and these objects will be reclaimed at the next GC
that occurs after the end of the run.

So it's not that there's anything wrong with the generational
principle here---or at least, there's no evidence either way.  What's
going on is that using the file system for save/restore operations
obscures lifetime issues for persistent data.  If you had a p-store,
things would be simpler and clearer.  (But not necessarily better
for your GC.  Forcing the programmer to do malloc/free-like management
of file data is convenient for the GC implementor in some ways, but
it's "cheating.")

>Anytime you read a file larger than 100Kbytes into some memory representation in
>the hypothetical system you described, you will get something like 100%
>promotion rate.  (This assumes essentially no temporary allocation in the
>process, which is likely for a well-tuned system.)

Right.  On the other hand, in many cases what's going on is that you're
just reloading data that are actually *old*, and if you had a p-store,
you'd just fault the data into RAM, and they'd be in an old generation
already, from the GC's point of view.

When you read old data in from files and allocate new objects on the heap
to hold them, you're basically *lying* to the garbage collector, telling
it that old data are new data.  This foils the generational heuristic
quite effectively in many cases, where if you had a generationally
collected p-store everything would come out right.  Old data would get
paged out to the p-store as old data, and paged back in as old data,
and the GC wouldn't get confused.
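
To make that concrete, here's a minimal C sketch of the file-reloading
case.  The record type is made up, and gc_alloc() just stands in for
whatever the collector's allocation entry point is; the point is only
that every record reloaded from the file is a freshly allocated heap
object, so a generational GC files it in the youngest generation no
matter how old the data really are:

    #include <stdio.h>
    #include <stddef.h>

    struct disk_record { long key; double value; };
    struct record      { long key; double value; struct record *next; };

    /* Hypothetical: allocates in the youngest generation, like any
       ordinary heap allocation in a generationally collected system. */
    extern void *gc_alloc(size_t n);

    struct record *reload(const char *path)
    {
        FILE *f = fopen(path, "rb");
        struct disk_record d;
        struct record *head = NULL;

        if (f == NULL)
            return NULL;
        while (fread(&d, sizeof d, 1, f) == 1) {
            struct record *r = gc_alloc(sizeof *r);  /* GC sees a newborn object... */
            r->key   = d.key;
            r->value = d.value;                      /* ...holding possibly ancient data */
            r->next  = head;
            head = r;
        }
        fclose(f);
        return head;   /* old data, reborn as nursery objects */
    }

With a p-store you'd never run this loop at all; the same records would
be faulted in already belonging to an old generation, and the
generational heuristic wouldn't be misled.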

In other cases, you'd still foil the generational principle by making
copies of big data structures, in a mostly-functional style, rather
than updating in place.  It depends on both the problem and the
programming style, just like the effects of style on cache memory
performance.

My own belief is that the generational principle is probably right
most of the time, even for long-lived data, but that conventional
in-memory-data-vs.-file-data distinctions are a problem for automatic
storage management.  The data I've seen for file lifetime distributions
seem roughly consonant with the generational principle---or at least
not hostile to it---but those are dirty data.  You just can't tell
whether a lot of short- or medium-lifetime files are really things
that would have been updated in place or not.

>When I was using Cedar, GC
>performance tended to be most noticeable while building large in-memory data
>structures, so this is not something to be ignored.  It's one of the cases that
>a system should be tuned for.

Right.  Some programs really do build large, temporary data structures,
and build them in a hurry.

You should be able to get within some constant factor of optimal performance
with a GC that advances data relatively quickly but not too quickly---after
a few advancements, it'll be in a generation that's GC'd on the right
timescale.  But you'll be able to get much closer to optimal with tuning,
and probably with automatic adaptation.
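
To make the "advance quickly but not too quickly" idea concrete, here's
a minimal sketch of an age-based advancement policy.  This isn't any
particular collector's code; the structure layout, the generation count,
and the threshold are all made up for illustration:

    #define NUM_GENERATIONS 4
    #define ADVANCE_AFTER   3       /* survivals before promotion (tunable) */

    struct gc_header {
        unsigned char generation;   /* 0 = youngest */
        unsigned char age;          /* collections survived in this generation */
    };

    /* Called for each live object found while collecting its generation. */
    static void advance_if_ripe(struct gc_header *h)
    {
        if (h->age + 1 >= ADVANCE_AFTER && h->generation + 1 < NUM_GENERATIONS) {
            h->generation += 1;     /* promote: now collected on a longer timescale */
            h->age = 0;
        } else {
            h->age += 1;            /* stay put and survive a little longer first */
        }
    }

Tuning here mostly means picking ADVANCE_AFTER and the generation sizes;
adapting automatically would mean adjusting them from observed survival
rates rather than fixing them ahead of time.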

A big problem with studying automatic adaptation is that we don't
have much data for serious programs written in their "natural"
environment---a persistent store.  The data we do have are messed up by
unnatural distinctions between transient and persistent data, and by the
fact that you can't track object lifetimes once the objects have been
squirreled away in files.

>Experience with my conservative generational collector suggests that you often
>want to run full collections after < 10 partial collections.  (The current
>default strategy is to unconditionally run a full collection every 5th GC, if I
>remember correctly.  This is influenced by the fact that it's usually used with
>C code, the fact that it's a conservative collector, which tends to extend
>lifetimes of short-lived objects somewhat, and by the simple tenuring
>heuristic, which speeds up collections at the cost of extra tenuring.)
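
(Just to spell out that scheduling policy: it amounts to a counter at
the collector's entry point.  This is a sketch under my own made-up
names, not Hans's actual code:)

    #define FULL_GC_FREQUENCY 5        /* unconditionally go full every 5th GC */

    static unsigned gc_count = 0;

    extern void collect_young_generations(void);   /* partial (generational) GC */
    extern void collect_whole_heap(void);          /* full GC */

    void garbage_collect(void)
    {
        if (++gc_count % FULL_GC_FREQUENCY == 0)
            collect_whole_heap();          /* catches whatever the partial
                                              collections tenured too eagerly */
        else
            collect_young_generations();
    }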

I think the effect of languages is probably the most important thing
here, or at least the most generalizable.

Compared to programs in persistent languages, C programs are more likely
to work on data that fits in memory, on the principle that "if it pages,
it's broken."  The persistent data is all hidden in the file system.

For this kind of program, it's often not very damaging to do a full
GC pretty frequently---the program touches most of the pages
often enough that keeping it all in RAM is not much worse than just
keeping data that the program's touched lately in RAM.

Even for C programs, however, there are some that use much more virtual
memory than physical memory.  For those programs, locality matters
a lot, and frequently tracing all of the live data is going to be a
big lose.  The GC will cause lots of paging, with I/O costs proportional
to heap size rather than the program's own locality.

For some programs, the right thing to do is to do full GC's frequently, and
try to keep everything in RAM.  For others, it's better to GC infrequently,
because most of the unreclaimed garbage won't get touched, and will
only cost swap space. 

For programs where you touch most of the data often enough that
it's RAM resident, GC only costs you a small constant factor in space
(due to the delay in reclamation) and a small constant factor in time
(due to the tracing costs).  (I wish I could quantify "small" :-) )
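
For what it's worth, here's the usual back-of-the-envelope argument for
why both factors are constants---textbook tracing-collector arithmetic,
not a measurement of any real system.  Suppose the collector gets to
allocate roughly its headroom between collections:

    live data:                        L bytes
    heap given to the collector:      H bytes
    allocation between collections:   about H - L bytes
    tracing work per collection:      about c * L   (copying collector)
    ------------------------------------------------------------------
    tracing cost per allocated byte:  about c * L / (H - L)

With H = 2L, the space overhead is a factor of two and the tracing cost
per allocated byte is bounded by the constant c.  What "small" works out
to for real programs is the part I can't quantify.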

But in a sense, this is just because RAM is so cheap now that most of
our old-fashioned C programs have stopped paging.  For another $50
worth of RAM, most of them won't page with a GC either.  That's a great
thing, but the frontier is large persistent object stores holding
things like big object-oriented databases and 3D worlds for cyberspace.

For many applications with lots of persistent data, checkpointing
issues are more important than straightforward locality issues.
Coordination of GC with checkpointing may be just as important
as the overall memory usage and locality effects of the GC.

>Generational collectors are often a good idea.  Sometimes they can help
>tremendously.  But they probably need to adapt as much as possible.  They
>certainly need to be prepared for occasional failure; the performance of full
>collections also matters.

Agreed.

>Hans
>
>
>-- 
>Hans-Juergen Boehm
>boehm@mti.sgi.com
>