[gclist] Java machines and when to collect.

Jeremy Fitzhardinge jeremy@zip.com.au
Mon, 15 Dec 1997 02:54:05 +1100

Mark Tillotson wrote:
> I can't let that go unchallenged - the motivating constraint behind the
> design of Java was to allow the bytecode format to be automatically
> verified cheaply at class-load time, and to prevent dangerously
> inconsistent class files from interacting to break the security model.

Sorry.  I probably went overboard denigrating Java's security
mechanisms.  As a Java VM implementor I'm very aware of their nature,
but I'm also sick of people ascribing magical abilities to them.

Sure, Java has strong type safety, and that's vital to get anything done
securely.  The sandbox model is pretty weak though, and not especially
well thought through.  The sandbox sits above the level of the basic
language and VM; it is more a matter of the policies which Sun's VMs
apply (that is, the sandbox security model is just a set of policies
implemented on top of the strong type-safety of the language itself).

> 2) Completely determined and simple consistent semantics (although not
>    necessarily well documented!), so that load-time verification is
>    tractable, efficient and secure.  This is why the type system is
>    very simple, for instance.

So long as you ignore threads and asynchronous exceptions.

> 4) extra-linguistic features (native methods) treated as privileged
>    and rejected by the verifier.  They happen to be there to support the
>    "system" aspects of the JVM, ie "kernel mode", and implement
>    low-level hardware specific code.  They are not part of the
>    language per se.

Yep, but any implementation has to have some means of doing this (though
not necessarily extensible).  There's no particular reason the verifier
should prevent applets from using native methods: they just can't
download their own native code.

> | The type safety aspects of Java are distinct from the sandbox,
> Without type-safety the sandbox is useless, without sandbox the
> type-safety is broken (just load some native methods to do an
> illegal cast)
> To me this feels like type-safety and sandbox are intimately bound up
> together.

Perhaps we're misinterpreting each other's meaning of "sandbox".  It
looks to me that you're using it in the sense "everything that's pure
Java code" (i.e. excluding native code).  I'm using it in the sense
"everything in the untrusted security perimeter": that is, HotJava (the
browser) is outside the sandbox (or more accurately, implements it), and
any applets it downloads are within the sandbox(es).  Even so, the
applets must still be able to do things like talk to the window system,
if only in a very constrained way.

In the same way that Unix prevents processes from trashing the kernel,
the JVM's strong type safety keeps all code semantically well behaved.
And just as with SUID on Unix, the holes in the sandbox still allow
security breaches.

> In fact aren't there some interesting theoretical research issues in
> designing type-systems that can be secure, and yet give a language
> the power to write its own storage-manager/GC ??   (Getting back to
> the topic)

Yep.  I suspect you need a considerably more powerful type system than
Java's, though I haven't really looked into it.  Are there any
languages which can implement their own storage manager?

> | Of course, merely having a typesafe language and a sandbox with
> | constrained accesses doesn't give you security.  Java implementation
> | have so far completely ignored the problem of denial of service attacks
> | which are much harder to deal with.
> Tackling denial of service, a less serious problem, doesn't require
> any key technology, merely the ability to monitor for suspicious
> over-use of resources, and to apply quotas for those resources.
> Frankly I suspect it hasn't been addressed much because it's rather
> dull to implement!

I think there are some quite interesting problems in this area.  The
problem is that as far as a GC is concerned, an object is "owned" by
whoever points to it.  If an applet allocates millions of objects, but
gets something else to hold references to them, it can cause lots of
memory to be allocated without looking responsible.  If you try to
constrain allocation by looking at allocation rate, a hostile applet can
merely arrange for something else to allocate memory on its behalf.  And
what happens if an allocated object is pointed to by both a respectable
trusted piece of code and some nasty miscreant applet?

Determining resource consumption in terms of memory ownership is not
necessarily the job of the GC, but given it has global knowledge of the
heap's topology, it's in a good position to make a mapping from the
roots to actual objects, and therefore work out proportional ownership.
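
As a rough illustration of proportional ownership, here is a minimal
sketch in Java.  It assumes each principal (an applet, or the trusted
code) has a known set of roots, and it splits each object's size equally
among the principals that can reach it.  The names OwnershipSketch, Obj
and charge are made up for this example, the equal split is only one
possible policy, and a real collector would presumably fold this into
its marking pass rather than trace once per principal.

```java
import java.util.*;

final class OwnershipSketch {
    /** A stand-in for a heap object: a size in bytes plus outgoing references. */
    static final class Obj {
        final String name;
        final long size;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name, long size) { this.name = name; this.size = size; }
    }

    /** Every object reachable from the given roots (what a tracing GC would mark). */
    static Set<Obj> reachable(Collection<Obj> roots) {
        Set<Obj> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Obj> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            Obj o = work.pop();
            if (seen.add(o)) work.addAll(o.refs);
        }
        return seen;
    }

    /**
     * Charge each object's size in equal shares to every principal that can
     * reach it.  Equal sharing is an assumption of this sketch, not a claim
     * about what is fairest.
     */
    static Map<String, Double> charge(Map<String, List<Obj>> rootsByPrincipal) {
        Map<Obj, List<String>> holders = new IdentityHashMap<>();
        for (Map.Entry<String, List<Obj>> e : rootsByPrincipal.entrySet()) {
            for (Obj o : reachable(e.getValue())) {
                holders.computeIfAbsent(o, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        Map<String, Double> bill = new TreeMap<>();
        for (String p : rootsByPrincipal.keySet()) bill.put(p, 0.0);
        for (Map.Entry<Obj, List<String>> e : holders.entrySet()) {
            double share = (double) e.getKey().size / e.getValue().size();
            for (String p : e.getValue()) bill.merge(p, share, Double::sum);
        }
        return bill;
    }

    public static void main(String[] args) {
        Obj shared = new Obj("shared", 60);
        Obj trustedRoot = new Obj("trustedRoot", 100);
        Obj appletRoot = new Obj("appletRoot", 40);
        trustedRoot.refs.add(shared);   // trusted code points at the object...
        appletRoot.refs.add(shared);    // ...and so does the applet

        Map<String, List<Obj>> roots = new HashMap<>();
        roots.put("trusted", Collections.singletonList(trustedRoot));
        roots.put("applet", Collections.singletonList(appletRoot));
        System.out.println(charge(roots));
    }
}
```

With these numbers the shared 60-byte object is split 30/30, so the
trusted code is charged 130 bytes and the applet 70.  Note that this
does nothing about the laundering problem above: if the applet drops its
own reference, the trusted code is charged for everything.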

The problem is similar to trying to determine the physical memory usage
of a process using shared libraries: is it exceeding its memory limit if
it attaches to already-resident pages?  The obvious answers are neither
the fairest, nor (necessarily) the most desirable.

There are similar problems with determining an applet's CPU consumption:
trading off a few CPU-bound threads against lots of mostly idle ones;
the problem of threads performing work on behalf of others.  I wonder if
there can be some unification of techniques for both memory and CPU
consumption management.
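
To make the "work on behalf of others" case concrete, here is a small
sketch of charging CPU time to the principal that asked for the work
rather than to the thread that happens to run it.  CpuBillingSketch,
runOnBehalfOf and charged are hypothetical names for this example only.
It relies on ThreadMXBean.getCurrentThreadCpuTime(), which may be
unsupported or disabled on some VMs (the sketch then simply charges
zero), and nested calls would be counted twice.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class CpuBillingSketch {
    private static final ThreadMXBean THREADS = ManagementFactory.getThreadMXBean();
    // CPU nanoseconds charged so far, keyed by principal name.
    private static final Map<String, Long> nanosByPrincipal = new ConcurrentHashMap<>();

    /** Run work on the current thread, billing its CPU time to the given principal. */
    static void runOnBehalfOf(String principal, Runnable work) {
        long start = THREADS.getCurrentThreadCpuTime();
        try {
            work.run();
        } finally {
            // getCurrentThreadCpuTime() returns -1 when measurement is disabled;
            // clamping keeps the charge non-negative in that case.
            long used = Math.max(0L, THREADS.getCurrentThreadCpuTime() - start);
            nanosByPrincipal.merge(principal, used, Long::sum);
        }
    }

    /** Total CPU nanoseconds billed to the principal so far (0 if never billed). */
    static long charged(String principal) {
        return nanosByPrincipal.getOrDefault(principal, 0L);
    }

    public static void main(String[] args) {
        // A trusted worker thread doing a computation requested by an applet.
        runOnBehalfOf("applet", () -> {
            long sum = 0;
            for (int i = 0; i < 5_000_000; i++) sum += i;
            if (sum < 0) System.out.println("unreachable");
        });
        System.out.println("applet:  " + charged("applet") + " ns");
        System.out.println("trusted: " + charged("trusted") + " ns");
    }
}
```

Because this measures CPU time rather than elapsed time, a mostly idle
thread accrues very little, which is one way of separating it from a
CPU-bound one.  It has the same shape as the memory case: the charge
follows whoever the work (or the reference) is attributed to, and the
hard part is deciding that attribution.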