[virtmach] Re: Comments on Sun's JAVA VM Specification?

Peter Bertelsen pmb@dina.kvl.dk
Tue, 27 Jun 2000 16:58:26 +0200


Peter Murray wrote:
> 
> Since the people on this list have a wide range of experience with various
> VMs, I was wondering if anyone has comments on Sun's Java VM specification?
> What do you think are the good and the bad points?

I've been working quite a lot with the JVM and Java bytecode over the
past couple of years. On the positive side, its semantics is fairly
well-defined, it forms the basis of a highly advanced program execution
environment, it has been ported to almost any platform of common
interest, and the promises of seamless cross-platform program
development are beginning to be fulfilled. The first edition of the JVM
spec had quite a few serious problems (cf. http://www.ergnosis.com/jsr
[*]), many of which have been resolved in the 2nd edition. 

On the negative side, a number of problems still remain in the 2nd
edition of the JVM spec, e.g.:

 * Underspecification of operations on the `small' integral types; for
   example, what happens when an int stack operand is stored into a byte
   field--is the value truncated or not?
 * Verification is not well-defined, but formal specifications of this
   central part of the JVM appear to be on their way. Until then, the
   static and dynamic semantics of the JVM are not entirely clear.
   Did you know, for example, that interface types are not checked by
   the verifier in Sun's JVM implementation? Instead, /run-time/ checks
   ensure that the `static' type safety of Java is not bypassed...
 * Dependencies and the necessary level of integration between a JVM and
   the standard libraries that must accompany it are unclear. For
   example, initialization of the standard classes java.lang.Object and 
   java.lang.Class is complicated by the mutual dependencies between
   them. The normal procedure for class loading, verification, linking
   and initialization thus cannot be applied to these classes, yet the
   JVM spec devises no alternative treatment of them.
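
As a rough illustration of the first point: at the Java language level
the narrowing is always explicit, because javac requires a cast and
compiles it to an i2b instruction before the field store. The open
question is what a JVM should do with bytecode that omits that i2b. The
sketch below only shows the truncating behaviour of the cast itself; the
class and method names are made up for illustration.

```java
// Sketch: narrowing an int to a byte as the Java language defines it.
// NarrowingDemo and its methods are illustrative names, not part of any API.
class NarrowingDemo {
    // The explicit cast compiles to i2b, which keeps the low 8 bits
    // and sign-extends them.
    static byte narrow(int value) {
        return (byte) value;
    }

    // The same truncation written out by hand, for comparison.
    static int lowEightBitsSignExtended(int value) {
        return (value << 24) >> 24;
    }

    public static void main(String[] args) {
        int[] samples = {0, 127, 128, 300, -129};
        for (int v : samples) {
            System.out.println(v + " -> " + narrow(v));
        }
    }
}
```

Whether a putfield into a byte field performs the same truncation when
no i2b precedes it is exactly what the spec leaves unstated.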

Although it can hardly be deemed a problem with the JVM spec per se,
Java bytecode is a mess of `optimized' instructions and more generic
instructions. The instruction set seems to have been invented and
changed ad hoc, rather than designed carefully from the beginning.
This also shows up in the binary representation of instructions. For
example, an invokeinterface instruction, which is used for invoking
interface methods, includes two superfluous immediate operands: a
"count" operand (the number/size of method arguments, which is readily
available via the method descriptor) and an operand that must always be
zero!
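
To make the redundancy concrete, here is a minimal sketch of how the
"count" value can be derived from a method descriptor alone: one slot
for the receiver, plus one per argument, with long and double arguments
taking two. The class and method names are hypothetical, and the sketch
assumes a well-formed descriptor.

```java
// Sketch: derive the invokeinterface "count" operand from a descriptor.
// InvokeInterfaceCount is an illustrative name; input is assumed well-formed.
class InvokeInterfaceCount {
    static int countFor(String descriptor) {
        int count = 1; // the receiver (objectref)
        int i = descriptor.indexOf('(') + 1;
        while (descriptor.charAt(i) != ')') {
            char c = descriptor.charAt(i);
            if (c == 'J' || c == 'D') {
                count += 2; // long and double occupy two slots
                i++;
            } else {
                count += 1; // everything else, arrays included, is one slot
                while (descriptor.charAt(i) == '[') {
                    i++; // skip array dimensions
                }
                if (descriptor.charAt(i) == 'L') {
                    i = descriptor.indexOf(';', i) + 1; // skip class name
                } else {
                    i++; // primitive type (or primitive element type)
                }
            }
        }
        return count;
    }
}
```

For example, countFor("(ILjava/lang/String;[[DJ)V") gives 6: the
receiver, an int, a String, a double[][] reference, and a two-slot long.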

Unfortunately, Sun seems reluctant to improve on the class file format,
instruction set and representation thereof, although the version number,
which is part of any Java class file, could simply be changed to avoid
conflicts with older JVM implementations. 
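
For reference, the version number sits right at the start of every
class file: a four-byte magic number (0xCAFEBABE) followed by two-byte
minor and major version fields, all big-endian. A minimal sketch of
reading it follows; the class name is made up, and how a given JVM
decides which versions to accept is not shown here.

```java
import java.nio.ByteBuffer;

// Sketch: read the magic number and version from the start of a class file.
// ClassFileVersion is an illustrative name, not an existing API.
class ClassFileVersion {
    // Returns { major, minor }.
    static int[] read(byte[] classFileBytes) {
        ByteBuffer buf = ByteBuffer.wrap(classFileBytes); // big-endian by default
        if (buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file: bad magic number");
        }
        int minor = buf.getShort() & 0xFFFF; // u2 minor_version
        int major = buf.getShort() & 0xFFFF; // u2 major_version
        return new int[] { major, minor };
    }
}
```

An older JVM that sees a major version above what it supports can
reject the file at this point, before parsing anything else.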

Another interesting feature of the JVM is that the class file format and
JVM instruction set are more flexible than the Java language itself: any
Java program can be expressed in Java bytecode (via lifting of inner
classes to top-level in the bytecode), but the opposite does not hold.
For example, a class declaration in a Java class file may include two
separate definitions of a field "f", provided they have different types,
e.g. int and Object[]. This is perfectly safe and sound at the bytecode
level, since each field reference includes the (expected) type of the
referenced field.
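
A toy model may help show why this is unambiguous. If fields are looked
up by name and type descriptor together, two fields named "f" never
collide. The FieldTable class below is purely hypothetical and says
nothing about how a real JVM lays out or resolves fields.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model: fields keyed by (name, descriptor), mirroring the fact that
// a bytecode field reference carries both. FieldTable is an illustrative name.
class FieldTable {
    private final Map<String, Object> slots = new HashMap<>();

    void declare(String name, String descriptor, Object initial) {
        slots.put(name + ":" + descriptor, initial);
    }

    Object get(String name, String descriptor) {
        String key = name + ":" + descriptor;
        if (!slots.containsKey(key)) {
            throw new IllegalArgumentException("no such field: " + key);
        }
        return slots.get(key);
    }
}
```

Declaring "f" once with descriptor "I" and once with
"[Ljava/lang/Object;" yields two independent entries, and each lookup
selects exactly one of them.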

Despite the above problems and gotchas I find the JVM an interesting
subject. It lacks a more precise (preferably formal) definition, but
this is just a challenge for people like us!  :-)

Regards,
Peter Bertelsen (http://www.dina.kvl.dk/~pmb)
PhD student, KVL, Copenhagen


[*] An update to the Java Spec Report (http://www.ergnosis.com/jsr) has
been planned for a very long time, and would be highly appropriate; much
of the material is seriously outdated. Unfortunately, the editors (Roly
Perera and I) have been snowed under for a while...