[gclist] Sharing GC resources between applications

Mon Mar 21 19:58:32 PST 2005

Those are all good questions.  The answer is that some work has been done,
but the state of both research and practice is still lacking.   You've
alluded to the two major resource problems: code and data.  I'll (somewhat
long-windedly) address both from my own (biased) perspective.

For all of Java's virtues, with respect to code sharing, it threw onto the
trash heap lessons that have been learned repeatedly over the course of 30
years of operating system and language design.  To wit, that direct
disk-to-memory mapping and read-only sharing of code (and constant data) are
absolutely critical to performance, scalability (both up and down),
reliability, maintainability, etc. (see
http://www.opost.com/dlm/tenex/hbook.html).  This is what allowed a 1981-era
DEC-20 to support 60 simultaneous users on a machine with 18MB of RAM.

The DEC-20 also ran a bytecode-interpreted, garbage collected language:
PDP-10 TECO, which was the implementation language for the original Emacs
(rather than Lisp).  It had a facility for loading a whole host of TECO
libraries and then "dumping" a binary image.  This meant that a system with
a large number of loaded libraries could be saved to disk and mapped
directly into memory, so the startup time for the first user was the time to
page in the application from disk, and for subsequent users the time to
memory map those pages in shared mode.  Startup times were vastly faster
than for systems like Eclipse which must load, re-format, compile, and link
every class individually -- despite the fact that the CPU was roughly 1000
times slower.
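
The dump-then-map idea can be sketched in a few lines. This is only an illustration in Python, not how TECO or TOPS-20 did it: the function names (dump_image, map_image, demo_mapping) and the payload are made up for the example, and it assumes an operating system where read-only file mappings of the same file are backed by the same physical pages.

```python
import mmap
import os
import tempfile


def dump_image(path, payload):
    """Write a pre-initialized 'image' to disk once (the 'dump' step)."""
    with open(path, "wb") as f:
        f.write(payload)


def map_image(path):
    """Map the dumped image read-only.

    Assumption: the OS may back every such mapping of the same file
    with the same physical pages, so later users pay only for the mapping.
    """
    with open(path, "rb") as f:
        # Length 0 maps the whole file; mmap keeps its own handle on it.
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


def demo_mapping():
    payload = b"LIBS" * 1024  # stand-in for a set of preloaded libraries
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        dump_image(path, payload)
        first = map_image(path)   # "first user": pages come in from disk
        second = map_image(path)  # "later user": maps the same file again
        try:
            return first[:4], second[:4], len(first)
        finally:
            first.close()
            second.close()
    finally:
        os.remove(path)
```

Nothing is parsed, compiled, or linked at startup in this model; the cost is the page-in for the first mapping and the mapping itself for the rest.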

The unfortunate implications of Java's completely dynamic architecture, which
did not allow sharing, became obvious as soon as attempts were made to use
Java for significant applications running outside of web browsers.  Java's
class loading semantics require that a set of classes (which logically
comprise a library, package, or application) be loaded in the dynamic order
in which they are referenced.  This severely inhibits the creation of a
sharable compiled version of a set of classes, a problem that is only
exacerbated by the nature of JIT compilation.

At IBM this problem hit most severely in mainframe transaction processing
environments, where the transaction semantics required that individual
transactions did not interfere with each other -- which meant that logically
they should run in their own JVM, but of course throughput requirements
precluded such a horribly inefficient approach.   The response was a
facility that essentially provided two heaps, one for per-transaction
objects and one for shared classes and data (see
http://www.research.ibm.com/journal/sj/391/dillenberger.html under "A
Scalable Java Virtual Machine Implementation").   In essence, this was quite
similar to the "dumped Emacs" approach I alluded to above, although the
shared image was initialized dynamically rather than being memory mapped.  I
was involved in the early design of this system -- it was an expedient
solution to a very severe problem, but isn't nearly general enough.

Isolates (JSR 121), as Dave Detlefs mentions, are an attempt at a more
general solution to this problem.  It's been some time since I looked at them
in detail, but I believe they suffer from a lack of pure semantics due to a
desire to allow multiple implementation styles at the whim of the underlying
system (process forking, multiple isolates per operating system
process, etc.).  When isolates are implemented within a single process, they
will probably use fewer resources, but will sacrifice another kind of
isolation: fault isolation.  If one isolate does something that causes the
JVM to fail, all other isolates will crash as well, which is a serious
drawback.

John Corwin, Dave Grove, Chet Murthy, and I worked on a more fundamental
solution to the problem with MJ
(http://portal.acm.org/citation.cfm?doid=949305.949326), a module system for
Java that provided clearly delimited loadable units (note that neither
packages nor jar files do this) and strong isolation properties between
them.  In addition to creating modules that could be pre-compiled and
memory-mapped, it solved a bunch of problems: "classpath hell", multiple
instances of different versions of a library within a single JVM instance,
etc.  MJ was designed to fit into the existing Java language, so in some
cases it sacrificed cleanliness of design for compatibility -- but overall
it worked pretty well.  Unfortunately MJ has yet to see the light of day as
either a product or an open source release.

The assemblies of .NET are a similar approach to the same problem, and
essentially yield DLLs.  However, you need only fire up Adobe Acrobat
Reader (and get yourself some coffee) to get a sense that this is still far
from solving the problem.

In terms of heap data, there are a number of issues that arise.  First of
all, garbage collected heaps are inherently over-provisioned.  An
application will typically run in twice its theoretical minimum heap size.
When that over-provisioning gets multiplied across many simultaneous JVMs,
it adds up fast.  Phase behavior in which heap requirements go up and down
amplifies this effect -- essentially a kind of macro-fragmentation across
JVMs.  One approach to this problem would be to have a "rotating to-space",
where garbage collection is coordinated so that it is never done by more
than one JVM at a time (or some fixed number).  The memory is collected into
the to-space, and then the old heap is freed and unmapped and becomes the
to-space of the next JVM to be collected.  I thought someone had tried that,
but can't recall now -- perhaps another gclist'er can comment.   At all
events, the engineering will be a major undertaking.
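
A back-of-the-envelope model shows what a rotating to-space might save. It is a sketch under assumptions that are not in the text above: every JVM uses a copying collector, a to-space as large as the heap being collected is enough, and the heap sizes (in MB) and function names are invented for the example.

```python
def private_semispace_total(heaps_mb):
    """Each JVM reserves its own to-space, as large as its heap."""
    return sum(2 * h for h in heaps_mb)


def rotating_to_space_total(heaps_mb, concurrent=1):
    """One spare to-space per JVM allowed to collect at the same time.

    Assumption: the spare space must fit the largest heaps that could be
    collected simultaneously, so reserve the `concurrent` largest sizes.
    """
    spare = sum(sorted(heaps_mb, reverse=True)[:concurrent])
    return sum(heaps_mb) + spare


heaps = [64, 128, 256, 64]  # hypothetical heap sizes in MB
print(private_semispace_total(heaps))                # 1024
print(rotating_to_space_total(heaps))                # 768
print(rotating_to_space_total(heaps, concurrent=2))  # 896
```

The saving grows with the number of JVMs, since the private scheme doubles every heap while the rotating scheme adds only a fixed amount of spare space. The model ignores the coordination cost, which is where the engineering effort would go.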

The work that Emery Berger mentioned on getting the JVM and the virtual
memory manager to cooperate is another important piece of the overall
solution.  Right now the real-world "solution" to the problem is to run on
machines with huge physical memories.  This is only tolerated because RAM is
relatively cheap.  The code space problem is in some sense more urgent
because it manifests itself in both space *and* time (loading).

In dealing with systems with thousands of processes, such as James Hague
describes for Erlang, you sometimes confront the classical problem of
functional language implementation: to make it efficient you have to
"re-discover" the imperative nature of the underlying program.  In
particular, Erlang implementations that use private heaps copy parameters by
value.  So you can wind up with thousands of copies of the exact same object
across many processes, rather than sharing them.  If those objects are
large, and there are a lot of them, then it's a big problem.  A significant
amount of memory is also lost due to fragmentation caused by having so many
small heaps.  That being said, Erlang is an extremely elegant language and
the approach has much to recommend it, precisely because resource properties
are localized to a process.
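
The copying effect is easy to reproduce in a toy model. The sketch below is Python, not Erlang, and says nothing about how any real Erlang runtime is built: Process, send_by_copy, send_by_reference, and demo_copies are hypothetical names, and a Python list stands in for a large term.

```python
import copy


class Process:
    """A toy process whose 'heap' is just a list of objects it owns."""

    def __init__(self):
        self.heap = []


def send_by_copy(receiver, message):
    # Private-heap model: the receiver gets its own deep copy.
    receiver.heap.append(copy.deepcopy(message))


def send_by_reference(receiver, message):
    # Shared-heap model: the receiver only stores a reference.
    receiver.heap.append(message)


def distinct_payloads(processes):
    """Count physically distinct objects held across all process heaps."""
    return len({id(obj) for p in processes for obj in p.heap})


def demo_copies(n=1000):
    table = list(range(256))  # stand-in for a large term
    private = [Process() for _ in range(n)]
    shared = [Process() for _ in range(n)]
    for p in private:
        send_by_copy(p, table)
    for p in shared:
        send_by_reference(p, table)
    return distinct_payloads(private), distinct_payloads(shared)
```

With 1000 processes the private-heap model holds 1000 separate copies of the same data and the shared model holds one. The price of the shared model is that the object's lifetime is no longer local to a single process.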

The alternative is to have a shared heap and an incremental or real-time
garbage collector.  This is the approach that Forsyth mentions for Inferno,
and that was taken by Sagonas and Wilhelmsson for Erlang
(http://portal.acm.org/citation.cfm?doid=1029873.1029875).   I believe that
neither of those systems defragments the global heap, though.  This isn't
inherent; but once you bite of the fruit of the shared heap, it's something
that has to be done for a truly complete solution.  The Metronome real-time
garbage collector that Perry Cheng, V.T. Rajan, and I created does
defragmentation in real-time -- but it comes at a cost in both run-time
overhead (about 7%) and in significant implementation complexity
(http://portal.acm.org/citation.cfm?doid=604131.604155).

Ultimately, the code sharing has to be addressed by language design and the
heap space issues by run-time system techniques.

david
----- Original Message ----- 
From: "Robin Boerdijk" <robin_boerdijk at yahoo.com>
To: <gclist at lists.iecc.com>
Sent: Friday, March 18, 2005 8:52 AM
Subject: [gclist] Sharing GC resources between applications

> Hi,
>
> I have a few questions about GC technology which require a bit of
> introduction, so please bear with me.
>
> What I have noticed is that more and more applications are being
> written for the Java and .NET platforms. Assuming this trend continues,
> users will soon want to run several Java and/or .NET based applications
> simultaneously, as they are currently doing with their existing
> applications.
>
> I have also noticed that Java and .NET based applications require a lot
> of system resources, among which memory for their garbage collected
> heaps. My concern is that all these simultaneously used garbage
> collected heaps constitute a very inefficient use of memory, especially
> if the applications involved process large objects such as documents
> (HTML, XML, PDF), spreadsheets, etc.
>
> My questions are:
>
> 1. Is this a valid concern?
>
> 2. Is there any research activity going on to address this concern?
>
> 3. What could be a potential solution for this problem? For example:
> Would it be possible to share a single garbage collected heap between
> processes? Or could garbage collectors in separate processes be made to
> communicate so that one would free up some memory for use by the other?
>
> Any answers/views/pointers appreciated.
>
> Robin Boerdijk