Python Specializing Compiler (fwd)

Armin Rigo arigo@ulb.ac.be
Sat Jun 23 01:18:02 2001


Hello everybody,

A posting of mine from the python-dev mailing list....

---------- Forwarded message ----------
Date: Fri, 22 Jun 2001 13:00:34 +0200 (MET DST)
From: Armin Rigo <arigo@ulb.ac.be>
To: python-dev@python.org
Subject: [Python-Dev] Python Specializing Compiler

Hello everybody,

I implemented a proof-of-concept version of a "Python compiler". It is not
really a compiler. I know perfectly well that you cannot compile Python
into something more efficient than a bunch of calls to PyObject_xxx. 
Still, this very preliminary version runs the following function twice as
fast as the python interpreter:

def f(n):
    result = 0
    i = 0
    while i<n:
        j = 0
        while j<n:
            result = result + (i&j)
            j = j+1
        i = i+1
    return result


Here it is --- comments and discussion welcome ! (Is it too early, or
should I consider making a PEP ?)

  http://homepages.ulb.ac.be/~arigo/python/psyco1.tgz


Extracted from the Readme file:

Psyco is *not* a compiler: it does *not* work by digesting a whole bunch
of Python scripts and producing an executable out of them. Instead, it
can only be run from within an already running Python session.

It works by 'specializing' the code of a function, as follows: let us
consider any function (let's say without side effects, that is, a function
that just computes a value out of its arguments without tempering with
global variables nor printing anything). There are several ways to write
machine-readable 'code' for this function. In standard Python, we always
use Python bytecode; this is a code that will work with whatever value the
arguments will actually be given at run-time. On the other hand, suppose
that you know the actual argument values; then the whole function can
theoretically be coded with a single instruction, which simply returns the
corresponding result. This 'code' is another version of the function, which
only works with one specific tuple of argument values.

Between these two extremes (bytecode working for any arguments, and a
single instruction working for a single tuple of arguments), what Psyco
does is trying to find a compromise. If we know that the arguments will not
take just any value, but only values from within a given set, then the
function can be given a much better 'code'. For example, the functions
'f()' in 'mctest.py' are obviously much more efficiently compiled if we
know that all arguments and variables are plain integers; if you translate
yourself these functions to C, you will have to tell exactly this to the
compiler, and you will get code that runs 1000 times faster.


---
A bientot,

Armin.