UVM and Microsoft
Michael Korns
mkorns@ix.netcom.com
Sat, 17 May 1997 05:58:52 -0700
John,
These ideas deserve some discussion.
> >> Three: Please feel free to give us constructive criticism on our ABVM. I
> know we will have to extend it if we wish to support Java, Smalltalk, etc.
> Adding Java support is easy because the JavaVM is already defined. The same
> is true for Smalltalk. Can you easily define your additional needs? <<
>
> My main concern is that the UVM may have the potential to become obsolete very
> quickly. If the code is constrained by compilation into terms of a static set
> of low-level instructions, when processors with capabilities of executing
> higher level instructions arise - the UVM will no longer describe code which
> can be compiled efficiently. Indeed, not only processors but whole systems in
> diverse environments.
Another plus, which you didn't mention, is that with a proper implementation
one loses nothing. So let's begin to discuss what a proper implementation
might be. Right away I have a question about your next proposal, which was:
> To get around this problem, I suggested the capability to *extend* the
> instruction set, and build this in as a feature of the UVM. The instruction
> set may be extended to describe models in Java, for example - but extended
> instructions will always carry with them both native (ie. 80386 code)
> implementations, and implementations of the base instruction set (ie. ABVM).
> If the target platform is not provided (eg. if the platform were 68000 based),
> then it will rely on the native code associated with the aforementioned ABVM
> instructions. I have called the library of extensions an optimisation table,
> but we could sort out proper names later.
Our AgentBase Virtual Machine (ABVM) and the Java Virtual Machine (JVM)
deliver highly efficient compiled code precisely because the VM
instructions are constrained to a static set of low-level instructions. In
fact, these low-level instructions are precisely those which modern CISC
CPUs can handle natively.
The JVM's instructions are 8-bit, stack-like instructions: an 8-bit opcode
followed, in the common case, by at most one in-line argument. Some examples
(in illustrative mnemonics, not the JVM's own) are:
pushinteger source1
pushinteger source2
addintegers
popinteger target
These byte-encoded VMs can attain runtime execution speeds that are, under
emulation, on average less than 15 times slower than C code and, after
compilation, equal to C code.
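
As a rough illustration of what emulating such a stack-like stream involves, here is a minimal sketch in Python. It is not the JVM or any real emulator: the function name run_stack_vm and the use of a dictionary of named slots for memory are assumptions made for the example, and the opcode names are simply the ones used above.

```python
# Hypothetical stack-machine emulator; not the JVM, only a sketch.
def run_stack_vm(program, memory):
    """Execute (opcode, argument) pairs; return the number of instructions run."""
    stack = []
    executed = 0
    for opcode, arg in program:
        executed += 1
        if opcode == "pushinteger":
            stack.append(memory[arg])      # operand is fetched onto the stack
        elif opcode == "addintegers":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)     # operands are implicit (top of stack)
        elif opcode == "popinteger":
            memory[arg] = stack.pop()      # result is stored from the stack
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return executed


memory = {"source1": 2, "source2": 3, "target": 0}
program = [
    ("pushinteger", "source1"),
    ("pushinteger", "source2"),
    ("addintegers", None),
    ("popinteger", "target"),
]
steps = run_stack_vm(program, memory)
```

Each instruction pays the emulator's dispatch cost once, so a single addition costs four dispatches here.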
The ABVM's instructions are 32-bit, quad-like or RTL-like instructions
containing a 32-bit opcode followed by up to three in-line arguments. Some
examples are:
addintegers source1 source2 target
divreals source1 source2 target
jmple source1 source2 branchIpValue
These newer word-encoded VMs can attain runtime execution speeds that are,
under emulation, on average less than 5 times slower than C code and, after
compilation, equal to C code. Word-encoded VMs perform better under
emulation because the number of instructions is greatly reduced. For
instance, on average, it takes four times as many instructions in a
byte-encoded VM to add two numbers:
BYTE ENCODED VM          WORD ENCODED VM
pushreal source1         addreals source1 source2 target
pushreal source2
addreals
popreal target
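
For comparison, a word-encoded (three-address) emulator can be sketched along the following lines. This is a sketch under assumptions, not the ABVM itself: run_quad_vm is a made-up name, memory is again a dictionary of named slots, and jmple is assumed to mean "branch to the given instruction index when source1 <= source2".

```python
# Hypothetical three-address emulator; not the ABVM, only a sketch.
def run_quad_vm(program, memory):
    """Execute (opcode, arg1, arg2, arg3) quads; return instructions executed."""
    pc = 0
    executed = 0
    while pc < len(program):
        opcode, a, b, c = program[pc]
        pc += 1
        executed += 1
        if opcode in ("addintegers", "addreals"):
            memory[c] = memory[a] + memory[b]
        elif opcode == "divreals":
            memory[c] = memory[a] / memory[b]
        elif opcode == "jmple":
            # Assumed semantics: branch to instruction index c when a <= b.
            if memory[a] <= memory[b]:
                pc = c
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return executed


# One instruction performs the whole addition.
add_memory = {"source1": 1.5, "source2": 2.5, "target": 0.0}
add_steps = run_quad_vm([("addreals", "source1", "source2", "target")], add_memory)

# A small counting loop built from addintegers and jmple.
loop_memory = {"i": 0, "one": 1, "limit": 3}
loop_program = [
    ("addintegers", "i", "one", "i"),
    ("jmple", "i", "limit", 0),
]
loop_steps = run_quad_vm(loop_program, loop_memory)
```

The addition costs one dispatch rather than four, which is where the emulation speedup claimed above would come from.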
A problem arises when we try to extend a word-encoded VM with object
method calls. When we do this we get a degradation in run time speeds
under emulation but not after compilation. This is because function and
method arguments cannot be handled in-line. For instance, we get something
like the following:
push source1
push source2
call extendedOperation
pop target
Notice that this automatically reduces the word encoded VM instruction
stream to the inefficiency of a byte encoded VM. So how to solve this
emulation efficiency problem? You suggested:
> I'm pretty convinced that objects, interfaces and multiple implementation
> inheritance provides a model which can quite accurately describe a
> comprehensive set of abstractions. I suggest, if it does not already, to add
> support for interfaces and some form of rich derivation to ABVM.
Well, let's examine this. I'm trying to understand your idea. Each 32-bit
VM opcode word is broken into four bytes as follows:
opcode inlineArgModifier1 inlineArgModifier2 inlineArgModifier3
This yields 256 possible opcodes, and 256 possible values for each inline
argument modifier. These restrictions are almost never a problem since we
are dealing with a low-level instruction set by design. If the instruction
set outgrows these limits, we should be using function calls, and we
probably haven't targeted the hardware instruction set properly.
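
The four-byte layout can be sketched as a pair of pack and unpack helpers. The names encode_word and decode_word are made up for the example, and placing the opcode in the most significant byte is an assumption; the description above only fixes the order of the four fields.

```python
# Hypothetical helpers; the byte order (opcode in the high byte) is assumed.
def encode_word(opcode, mod1=0, mod2=0, mod3=0):
    """Pack an opcode and three inline argument modifiers into one 32-bit word."""
    for field in (opcode, mod1, mod2, mod3):
        if not 0 <= field <= 0xFF:
            raise ValueError(f"field out of range for one byte: {field}")
    return (opcode << 24) | (mod1 << 16) | (mod2 << 8) | mod3


def decode_word(word):
    """Split a 32-bit word back into (opcode, mod1, mod2, mod3)."""
    return (
        (word >> 24) & 0xFF,
        (word >> 16) & 0xFF,
        (word >> 8) & 0xFF,
        word & 0xFF,
    )


word = encode_word(0x12, 1, 2, 3)
fields = decode_word(word)
```

The range check is what enforces the 256-value limit per field.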
So it would seem simple to use your idea of extending the word-encoded VM
by supplying loadable emulators (a new Class for each emulator, i.e. for
each instruction set). Then we get the following:
extendedOpcode source1 source2 target
If we use your idea and allow the emulator to be loadable or assignable by
the VM itself, we get all the benefits of reflexive design. So far, if I'm
following you correctly, it sounds good.
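
One way to picture the loadable-emulator idea is sketched below. Everything in it is hypothetical: the class names BaseEmulator and VectorEmulator, the extended opcode addvectors, and the useemulator instruction that lets the instruction stream itself switch emulators are all invented for illustration and are not part of the ABVM or the UVM proposal.

```python
# Hypothetical sketch of one emulator class per instruction set.
class BaseEmulator:
    """Stands in for the static base instruction set."""

    def __init__(self):
        self.handlers = {"addintegers": self.add_integers}

    def add_integers(self, memory, a, b, c):
        memory[c] = memory[a] + memory[b]

    def execute(self, memory, opcode, a, b, c):
        self.handlers[opcode](memory, a, b, c)


class VectorEmulator(BaseEmulator):
    """An extended instruction set that keeps the base opcodes available."""

    def __init__(self):
        super().__init__()
        self.handlers["addvectors"] = self.add_vectors

    def add_vectors(self, memory, a, b, c):
        memory[c] = [x + y for x, y in zip(memory[a], memory[b])]


def run(program, memory, emulators, current="base"):
    """Run quads; 'useemulator' (assumed opcode) swaps the active emulator."""
    emulator = emulators[current]
    for opcode, a, b, c in program:
        if opcode == "useemulator":
            emulator = emulators[a]
        else:
            emulator.execute(memory, opcode, a, b, c)


emulators = {"base": BaseEmulator(), "vector": VectorEmulator()}
memory = {"x": 1, "y": 2, "z": 0, "v1": [1, 2], "v2": [3, 4], "v3": None}
program = [
    ("addintegers", "x", "y", "z"),
    ("useemulator", "vector", None, None),
    ("addvectors", "v1", "v2", "v3"),
]
run(program, memory, emulators)
```

The extended operation keeps its arguments in-line, so it costs one dispatch like any base instruction.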
For compiled code generation, both the byte-encoded and word-encoded VMs
are equally efficient. You mentioned translation tables. I prefer rule-based
semantic analysis production systems. Since the database query language in
AgentBase is Lisp, this is not a problem for us. It is easy to construct
rules recognizing patterns of instructions to be translated into native
binary bit streams.
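
A very small sketch of the rule idea follows, in Python rather than Lisp. It is only an illustration: the rule and function names are made up, and instead of emitting native binary it rewrites the matched pattern into a single three-address tuple as a stand-in for the native output.

```python
# Hypothetical pattern rule; the emitted tuple stands in for native code.
def match_binary_op(window):
    """Rule: push A; push B; <op>; pop T  ->  (<op>, A, B, T)."""
    if len(window) < 4:
        return None
    (op1, a), (op2, b), (op3, _), (op4, t) = window[:4]
    if (op1 == op2 == "pushreal"
            and op3 in {"addreals", "divreals"}
            and op4 == "popreal"):
        return 4, [(op3, a, b, t)]   # (instructions consumed, instructions emitted)
    return None


RULES = [match_binary_op]


def translate(program):
    """Apply the first matching rule at each position; pass through otherwise."""
    out = []
    i = 0
    while i < len(program):
        for rule in RULES:
            result = rule(program[i:])
            if result:
                consumed, emitted = result
                out.extend(emitted)
                i += consumed
                break
        else:
            out.append(program[i])   # no rule matched this instruction
            i += 1
    return out


stack_code = [
    ("pushreal", "source1"),
    ("pushreal", "source2"),
    ("addreals", None),
    ("popreal", "target"),
]
translated = translate(stack_code)
```

A real production system would carry many such rules and emit target-specific encodings; the point here is only the match-and-rewrite shape.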
So my final question involves the front end semantics and compilation.
Potentially we now have multiple languages with multiple contexts,
semantics, etc. How do we handle language to VM compilation? Do we try to
compile any language to all VMs? Do we restrict compilation to language
and VM pairs? If so, do all VMs have to be synchronized to have the same
VM instruction (table, object, etc.) swap opcode? Am I understanding your
proposal, or should I take more LSD?
----------
> From: John Wood <tenshon@msn.com>
> To: Michael Korns <mkorns@ix.netcom.com>
> Cc: Bill House <bhouse@dazsi.com>; Gilda Cabral <gcabral@dazsi.com>
> Subject: RE: UVM and Microsoft
> Date: Thursday, May 15, 1997 5:39 PM
>
> >> Three: Please feel free to give us constructive criticism on our ABVM. I
> know we will have to extend it if we wish to support Java, Smalltalk, etc.
> Adding Java support is easy because the JavaVM is already defined. The same
> is true for Smalltalk. Can you easily define your additional needs? <<
>
> My main concern is that the UVM may have the potential to become obsolete very
> quickly. If the code is constrained by compilation into terms of a static set
> of low-level instructions, when processors with capabilities of executing
> higher level instructions arise - the UVM will no longer describe code which
> can be compiled efficiently. Indeed, not only processors but whole systems in
> diverse environments.
>
> To get around this problem, I suggested the capability to *extend* the
> instruction set, and build this in as a feature of the UVM. The instruction
> set may be extended to describe models in Java, for example - but extended
> instructions will always carry with them both native (ie. 80386 code)
> implementations, and implementations of the base instruction set (ie. ABVM).
> If the target platform is not provided (eg. if the platform were 68000 based),
> then it will rely on the native code associated with the aforementioned ABVM
> instructions. I have called the library of extensions an optimisation table,
> but we could sort out proper names later.
>
> I'm pretty convinced that objects, interfaces and multiple implementation
> inheritance provides a model which can quite accurately describe a
> comprehensive set of abstractions. I suggest, if it does not already, to add
> support for interfaces and some form of rich derivation to ABVM.
>
> Also - we should look at a persistence spec for the VM, which I believe should
> be a totally separate spec. It should have the ability to map store bytes
> onto the optimisation table - and not necessarily in one way. For example, we
> can have one persistence model which is directly compatible with the JVM's,
> and another which is compatible with ABVM's existing spec. Perhaps at one
> stage we may even be able to provide something which can JIT compile 80386
> code on other platforms.
>
> These are my suggestions - what do you think?
>
> John
>
>