[gclist] Re: gclist-digest V2 #58

Mark Hamburg mhamburg@Adobe.COM
Sat, 31 May 1997 18:03:25 -0700


>From: Jeremy Fitzhardinge <jeremy@zip.com.au>
>We've found in our Java implementation that after using a JIT to make
>bytecode execution reasonably efficient, GC performance outweighs most
>other factors for an interesting proportion of Java code.

Where was your test code coming from? One of the things that can be
relatively important to avoiding this sort of problem is being careful not
to allocate more often than necessary.

Unfortunately, because of the desire to make Java's bytecodes readily
checkable, it doesn't offer many features that would have made it easy for
programmers to limit their allocation rate -- e.g., nested objects, stack
allocation of objects, references, multiple value return, etc.. You can
work around some of these issues with explicit code, but I don't think the
issues are covered in any particular depth (if at all) in the Java
literature. The Oberon system includes many of these features and yet
remains safe and garbage collected. It also has performance competitive
with C. LISP and Smalltalk don't offer these features and generally have a
harder time competing with C from a performance standpoint. In exchange,
however, they offer things like closures and first class continuations.

So before going after garbage collector changes, one thing to look at is
coding changes. GC provides the illusion of infinite memory. This should
not, however, be confused with free memory.

If that fails, then my current thoughts about the best route for a Java
oriented GC would be to assume that a lot of objects will never be
referenced from the heap or globals. Keep a ZRC (zero-reference count) list
that lists all of the objects suspected of only being referenced from the
stack or registers and add a mark bit to each object that gets set when
storing a pointer to the object into the heap or into a global. We can
periodically cleanse the ZRC list by scanning the list and removing any
objects whose mark bits are now set. We can also compare the ZRC list
against the stack and registers and find any objects no longer referenced
from either place. These objects can be freed. The scheme needs to be
backed up with a real GC (e.g., a mark-and-sweep collector) but it should
reduce the frequency with which such a collector needs to be invoked.

At the expense of a bit more complexity and code on pointer storage, we can
change the mark bit to a counter that covers the range 0, 1, 2. This then
allows us to do recursive frees on objects for which only one pointer has
been stored in the heap. The counter isn't a true reference count. It just
satisfies the conditions that if it is zero, there are no references from
the heap, and if it is one then there is at most one reference from the
heap.

Either the first step alone or the two steps together provide a simple
"generational" scheme that is compatible with mark-and-sweep garbage
collection and hence more readily connected to environments like C where
copying collectors are harder to use.

Mark