[virtmach] Virtual Machines and Code Representation

btanksley@hifn.com btanksley@hifn.com
Wed, 26 Jul 2000 09:58:03 -0700

From: Luk Stoops [mailto:lstoops@vub.ac.be]

>Do you compare compressed parse-trees against uncompressed byte-codes ?
>Otherwise it seems difficult to compress more information into less

No.  "Compressed" bytecodes are simply compressed by a general compression
scheme (such as bzip2 or zip).  Compressed parse trees may be compressed by
domain-specific code, because they model the domain so well.

For example, rather than storing function pointers at the start of every
node, we know that the first element in every node of the tree is an
function, so we simply number the functions and store a huffman-compressed
integer at the start of the node.  There's not even any need for escape
codes, because the same thing is in the same place every time.

You can also huffman-encode the functions in bytecode, but because you never
can guess whether the next operation will be a 'call' or some other
instruction, you're forced to encode the call as well.

Also, the parse trees carry information about data flow and data types.
Therefore, you don't need to encode that a particular '+' is an integer
plus, because it's obvious from the types of its operands.  If you do that
with bytecodes, you have to either resolve the types dynamicly or compute
the dataflow to figure it out.