Why Bytecode?

Eric W. Biederman ebiederm+eric@npwt.net
07 Feb 1998 12:15:34 -0600


>>>>> "FR" == Fare Rideau <rideau@ens.fr> writes:

FR> Dear Tunesmiths,
>>> : Maneesh Yadav
>> : Eric Biederman

>>> OK, this may be a stupid comment, but I can't figure out a good answer.
>>> Why is that VM's like java impliment their thing with a universal
>>> bytecode?
>> 
>> Efficiency, and portability.  A byte code is a no brainer to
>> interpret, just do a switch on the byte code, and go to the right
>> place to interpret that.
>> 
FR> I agree with the latter, but not with the former.
FR> Bytes are neither efficient nor portable, unless your measure
FR> of efficiency is time taken to code the interpreter:
FR> bytecodes are too low-level for good compilation,
FR> not high-level enough to amortize interpretation overhead;
FR> too small to allow good/easy expansion,
FR> not suited to good compression.
FR> Plus modern architectures may make extracting bytes from memory inefficient
FR> (especially on word-addressable architectures,
FR> especially if you have to colesce non-aligned data back into words,
FR> especially when word size is not multiple of 8).

On word-accessible architectures, modifiying memory in byte size
chunks can be a problem.  But bytes are portable.  Almost all
architectures support them in some fashion.  The trick on
word-accessible architectures is to figure how bytes are accessed
efficiently.

As far as compilation, bytecodes were never designed for that.  JIT is
a fairly new technology that people haven't completely explored yet.

FR> All the serious experiments done show that bytecode just doesn't buy it
FR> in terms of either performance or portability.

Wrong:
Bytecodes are a good compromise of efficiency and portability.

An experiment done with forth using different threading mechanisms,
showed several mechanism were more efficient than a giant switch
statement, but there were none more efficient than a giant switch
statement that could be coded portably in C.

Using computed goto's of gcc they were able to make their test
portable, but you can't count on gcc always being there.

FR> See how OCaml gained a factor 2 (approx.) in performance on several
FR> architectures by using wordcodes, where Caml-Light used to use bytecodes.

They probably could have gained the efficiency with byte codes by
holding a register with the current word, and shifting it to get the
next byte.  And only reading from memory when their register ran dry.

FR> Bytecodes can be justified indeed, in either by quick-and-dirty approaches,
FR> or when runtime memory saving are needed, because it's expensive
FR> (embedded software, OpenBoot PROM), or to avoid swapping (see clisp vs ACL).
FR> Even then a judicious choice of bytecode is not easy,
Agreed.
FR> unless you have a set of tools to consistently modify the bytecode system
FR> and retarget the bytecode generators (in an optimizing way).

These last comments come from ideas/research I'm not familiar with.

>> Also basic optimizations can be done ahead of time.
FR> Bytecode doesn't make that easier than any representation!

It makes it more efficient than a source interpreter, which cannot do
that in a trivial implementation.  The things you can do are the basic
optimizations of: 
loop strength reduction, constant propogation, common subexpression
elimiation, and redundant instruction removal.

Those are some of the optimization with the highest payoff, so it does
help.  Not as much as if you were targeting native code howwever.

And of course over an classic interpreter you save parsing and lexical
analysis, which are not cheap.  The parsing especially typically
requires a second pass to handle inherited attributes.  But all that cost
you can save with a simple syntax (lisp,forth) instead of bytecodes.

FR> Just any code representation allows for peephole optimization;
FR> bytecode tend to be too low-level and make high-level optimizations
FR> impossible in the backend.

Which optimizations are you refering to?
And again, bytecodes are not designed to go into a compiler
necessarily so it's not terribly important.

And bytecodes as far as pure compilation should be no worse than a
normal high level language, for compilation (there should be no self
modifying code allowed).  But it is notrivial to compile just as it is
non-trivial to compile a high level language.

>> Besides I haven't seen a `universal' bytecode.
FR> What do you mean by "universal"?
FR> The JVM claims to be. The Caml-light VM has long been.

I guess what I meant was well suited to all languages,
instead of functionally complete.

>>> Why not concentrate on a VM with a readable language with a JIT and save
>>> space by leaving it up to a compression algo.?
>>> In order to prevent the code stealing, mangled sources could be used...
FR> Yup. All that's precisely exactly what Juice does
FR> http://www.tunes.org/~tunes/doc/Review/VMs.html#Juice

I'll have to look someday when I have some time.

>> Code's copyrighted so why do you need to worry about code stealing?
FR> I also agree; still code mangling will be a "plus" to convince paranoïds
FR> to use Tunes anyway.

So they can mangle their source.  Ugly 1 character variable names, and
no comments, or extra whitespace is about as bad as compiled code.

I wonder if you surveyed the code of people who are strongly against
releasing their sources, how often you would find they are not
paranoid about checking for errors.  

If so they don't seem wholly paranoid to me. :)

>>> I realize that a bytecode is good since it resembles a processers
>>> instruction set nad is small.  But why restrict a language to only work
>>> with today's processing paradigms (who knows what's next).  Surely a
>>> good HLL leaves more room for the future.
>> It's expedient, simple, and well tested.  When that future comes you
>> upgrade your virtual machine.
FR> It might be expedient, but it's neither particularly simple
FR> (try to be simpler than a LISP interpreter!)

A forth interpreter is simpler.  You use vectors rather than linked
lists for your code, a noticeable performance gain.

But bytecodes are simple even if they aren't the simplest.

FR> , nor particularly well-tested
FR> (why should bytecodes be intrinsically better tested than anything???).

The techniques of using bytecodes are well tested.
Techniques such as JIT are not.

I have not argued that bytecodes are intrinsically better.

FR> And we all know that if you make your bytecode a standard way to
FR> transport code (see the JVM), then you're casting into stone your
FR> bytecode specification and you can't upgrade. A high-level representation
FR> makes upgrade much easier.

Only temporarily.  Significant changes shouldn't be needed for years,
at which point you can bump you version number and have a different
representation.

Standards move like glaciers I admit, but once you have the basic
capabilities decided upon that's not a bad thing.  And they _do_ move.

>>> Doesn't bytcode just put an unessecary step in between
idea-> language->program?
>> No.  It just specifies the form.
FR> Yes, it is, in the above diagram, where programs are executed directly.
FR> But remember that bytecodes and the like are to be used to transport
FR> programs in a portable way; a framework of standard transport representations
FR> is necessary here; the fact that these representation be bytecode is
FR> unnecessary, and a bad idea. Again, see the Juice way of doing things.

Bytecodes are not a bad idea, although I wouldn't necessarily
reccomend them for tunes.

Bytecodes are a sound design choice in a lot of ways.  But it has not
been my intention to recommend them, merely to explain them.


As an aside:

I have been working on a compiler (in my compiler construction
class), and it is my intention to have as many front ends as possible
-- c, forth, scheme, c++, etc.  Although not all of them will be
completed for my class.

It is my hope that the intermediate language of the compiler will
become a better portable assembler than C itself.  That plus the
addition of a good API for runtime generated code would provide the
basis for a lot of things.

p.s.
Never forget the number one question of research.
Have the acheived their gains by simplifying their problem in an
manner that isn't acceptable to the real world?