[virtmach] parrot VM and register vs. stack machines
Kyle Hayes
virtmach@iecc.com
Sat, 3 Aug 2002 16:28:47 -0700
Wow. Interesting discussion and lots of nice pointers to data I didn't
know about.
I was able to glean some information about the Parrot VM from some of the
papers (particularly the "assembly language" one).
It appears that instructions are roughly in a 3 argument form for those
that need that many arguments. From the size of various constants noted
and some examples at the beginning of the "assembly" document, it appears
that there are four groups or types of registers: integer, floating point,
string and PMC (whatever that is). The registers are designated in the
format <type><reg#> where <reg#> gets 8 bits in the examples and type has
values 0x1, 0x2, 0x4, 0x8. No others are shown, so I assume that the type
field needs at least 4 bits. The example shown notes that N19 (floating point
register 19) is designated by 0x413. So, 12 bits per register. If Parrot
actually has "bytecode", it will be quite wide.
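As a rough sketch of that encoding, assuming the layout inferred from the
0x413 example (a 4-bit type field above an 8-bit register number), a
designator could be packed and unpacked as below. The helper names are
made up for illustration and are not Parrot's own API; only the 0x4 =
floating point pairing comes from the document.

```python
# Layout is an assumption inferred from the single 0x413 example:
# bits 11..8 hold the type field, bits 7..0 hold the register number.
TYPE_SHIFT = 8
TYPE_MASK = 0xF
REG_MASK = 0xFF


def decode_register(designator):
    """Split a 12-bit designator into (type_field, register_number)."""
    type_field = (designator >> TYPE_SHIFT) & TYPE_MASK
    reg_number = designator & REG_MASK
    return type_field, reg_number


def encode_register(type_field, reg_number):
    """Pack a type field and register number into a 12-bit designator."""
    return ((type_field & TYPE_MASK) << TYPE_SHIFT) | (reg_number & REG_MASK)


# N19: type field 0x4 (floating point), register 19 (0x13).
print(decode_register(0x413))
print(hex(encode_register(0x4, 19)))
```

With three such operands per instruction, the operands alone would take 36
bits before the opcode is counted, which is why a byte-oriented encoding
looks unlikely.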
There are opcodes that push and pop frames of registers (different for each
type of register). It wasn't clear whether you'd push all 256 registers of
a given type. That seems like a lot of data.
As to the "discussion" of register vs. stack machine, I got this much out
of it:
1) for an interpreted VM (i.e. not one that is designed for JIT without
interpretation), it sounds like there isn't a consensus. There are claims
that register-based VMs are faster (Lua, Dis). Lua's speedup appears to
come from reducing the number of opcodes needed to execute something. At
least with Lua, the rest of the language stayed the same, so there is some
hope of doing a meaningful comparison.
2) Stack VM code has some nice properties like easier implementation of the
compiler (no register allocator) and perhaps the interpreter. However, as
the Lua example points out, it may result in more code. With a very
efficient interpreter (perhaps using some threading model), it isn't clear
to me that decoding fewer, more complex instructions like Lua's would
necessarily be faster than running more, simpler opcodes.
3) several people said that you'd want to use an "infinite register" VM
model if you went with a register VM. I am not sure I understood the
motivation for that, but it seems to be because this is what many compiler
backends use as input (with SSA etc.). Thus, it would be possible to
apply known compiler technology to the VM to have it do decent native
hardware register allocation.
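To make the opcode-count tradeoff in points 1 and 2 concrete, here is a toy
comparison for "a = b + c". Both instruction sets are hypothetical and are
not Lua's, Dis's or Parrot's real opcodes. The sketch only counts how many
instructions each interpreter dispatches; it says nothing about how
expensive each dispatch or decode step is, which is exactly the open
question.

```python
def run_stack(code, env):
    """Run a toy stack-machine program; return the number of dispatches."""
    stack = []
    dispatched = 0
    for op, arg in code:
        dispatched += 1
        if op == "push":
            stack.append(env[arg])
        elif op == "add":
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(lhs + rhs)
        elif op == "store":
            env[arg] = stack.pop()
        else:
            raise ValueError("unknown opcode: " + op)
    return dispatched


def run_register(code, regs):
    """Run a toy 3-operand register program; return the number of dispatches."""
    dispatched = 0
    for op, dst, src1, src2 in code:
        dispatched += 1
        if op == "add":
            regs[dst] = regs[src1] + regs[src2]
        else:
            raise ValueError("unknown opcode: " + op)
    return dispatched


# a = b + c on the stack machine: four simple instructions.
stack_env = {"b": 2, "c": 3}
stack_code = [("push", "b"), ("push", "c"), ("add", None), ("store", "a")]
stack_count = run_stack(stack_code, stack_env)

# The same statement on the register machine: one wider instruction.
regs = {"b": 2, "c": 3}
reg_code = [("add", "a", "b", "c")]
reg_count = run_register(reg_code, regs)

print(stack_count, reg_count)
```

The register version dispatches once instead of four times, but its single
instruction carries three operands that have to be decoded.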
To me, if you are interested in performance, you'll need to compile to
native code or use a good JIT compiler. Either way, the representation of
the VM isn't driven by any execution requirements. It should be driven by
ease of compilation (JIT or otherwise). Rather than focussing on getting
a 20% speedup (the Lua claim), perhaps the effort would be better spent on a
simple JIT that could give a much greater speedup.
I know of Anton Ertl's Vmgen package (any more docs in English yet?). Are
there any other tools that provide the ability to specify VM semantics and
generate both interpretive code and code to support JIT compilation?
Best,
Kyle