Misc ideas & comments

Matthias Hoelzl tc tc@gauss.muc.de
28 Mar 1998 02:13:16 +0200


Rodrigo Ventura <yoda@isr.ist.utl.pt> writes:

>         1. I tried RScheme, but it seemed to me too complicated, at
> first in a first glance. After compiling everything, the whole tree
> ocupied about 70MB of disk space. I had some difficulty to find the
> main executable in the dense tree structure. I had to use "find
> . -perm 111 -type f" to find it! But of course, this is my ignorance
> speaking... The bottom line is that RScheme might be too complex to
> allow a fairly good degree of exploration. And by the way, does
> RScheme uses a virtual machine?

This is my impression as well.  I looked into the RScheme sources
because I was looking for a Scheme system to which I could add a nice
MOP based object system.  In RScheme this is somewhat tedious to do
because you have to update the Scheme code and quite a lot of
hand-written C as well.  Furthermore my (admittedly not very thorough)
experiments with RScheme 0.7.1 indicated that performance was not too
hot, even for compiled code.  (RScheme can compile to C via the rsc
module compiler, however you have to `make rsc' in the "src" directory
to generate rsc, furthermore linking modules compiled with rsc v0.7.2
to the main executable fails on my Linux/glibc system.  The default is
to generate interpreted byte code.)

>         2. Portability is a nice feature for LispOS. The easier way to
> accomplish this is to make a platform-independent virtual machine. Of
> course there are other paths, but they seem too complex to me
> (gcc-like compiler).
> 
>         3. Is the gcc's RTL processor-independent? If so, how about
> using the RTL-down part of gcc to make LispOS? What is the greatest
> difficulty of compiling scheme? The macro scheme? The need to stuff an
> eval function with the compiled code?

RTL is not processor independent.  However gcc has the so called
"tree" data structure, a somewhat higher level interface to the back
end which is the right interface to hook a new front end to.  You may
want to read the header files "gcc/tree.def" and "gcc/tree.h".
However the tree interface is not sufficient on its own, you need to
generate a moderate amount of RTL as well.  But this does not seem to
be the main problem.  Neither do I think that macros are very
difficult, because they could be handled as a source-to-source
translation by the front end.  I think providing `eval' in itself is
not a big problem since the performance of `eval' should not matter
much, however allowing `eval' to change bindings seems to make many
compiler optimizations impossible or at least very tricky.  For my
own part, I wouldn't mind a compiler that restricts `eval' somewhat
more than the R5RS report.  The three most important problems that I
see when trying to interface a Scheme front end to the gcc back end
are

1) As far as I can see there is no simple way to generate tail
calls in the front end.  Gcc does some tail call elimination in the
back end, but not enough to be satisfying for a Scheme compiler (as far
as I can figure out it only optimizes self-tail-calls).

2) The implementation of the stack handling needed to have efficient
first class continuations seems very difficult to integrate with the
gcc back end.  The possibilities that I see right now are a
setjmp/longjmp based stack copying approach, suitable only to use
continuations for error handling; or managing your own runtime stack,
which would somewhat defeat the aim of using the gcc back end.  The
cleanest approach would probably be to define some additional tree
codes with corresponding implementation in the gcc back end.  However
this would be a largish undertaking, and since I'm no proficient
C-hacker I don't think that I can handle this without a lot of support
from people more familiar with both C and the gcc back end.

3) Would it be possible to compile Scheme function calls to be
compatible with C calls so that it would be easy to interface to C?  I
think that calling C from Scheme would not be a problem, but that it
would be necessary to create stubs for Scheme functions that can be
called from C because you cannot simply stack-allocate all parameters
to a Scheme function.  However this seems a minor problem, since
nobody will be calling many very small Scheme functions from C.

I will investigate these points some more and then ask the people on
the egcs mailing list whether I am on the right track and whether they
would be willing to help with such an undertaking, however I am a bit
sceptical.  Answering my (undoubtedly many) question about the gcc
back end would mean spending a lot of their time to help with a scheme
front end to gcc and I suppose they are more interested in improving C
and Fortran, since these are the languages they actually use
themselves.  Furthermore I am not really certain whether I want to
spend that many hours writing a compiler that nobody will be
interested in anyway.  I'll keep you updated about the results of my
inquiry and my future plans if you are interested.

Note that these are the problems that I see specifically for writing a
gcc front end; all the other hard problems in compiling Scheme or
Common Lisp to efficient machine code remain, of course.  

Actually compiling Scheme is rather easy, a simple Scheme compiler
written in Scheme is pretty straightforward (see SICP, ch. 5 for an
example).  The hard part is to compile to efficient code because you
need to find out which variables do not escape so that you don't have
to allocate them on the heap, which variables can only hold a certain
type so that you can unbox them, which functions you can compile
without templates and environment pointers, where you can lambda-lift,
and so on.  The problem is that many of these optimizations can only
be done by a globally optimizing compiler; the good news is that the
Stalin compiler demonstrates that it is indeed possible to generate
good code from Scheme source.

>         4. The experience I made with scheme48 ontop linux used a
> statically linked scheme48vm. But maybe we could use dyn-load
> explicitly. How about considering the dyn-linking mechanism a part of
> the Linux kernel? It could make things easier for LispOS. Or maybe
> not, because scheme linking is probably more demanding than plain
> C-style linking. Maybe we'll need to use a lisp-prepared dyn-loader.

Some Scheme interpreters (elk) seem to be able to get quite far with
dlopen, I don't know whether it is flexible enough for all needs.
However I mainly want a Lisp system that allows me to develop fast and
small stand-alone-applications with easy integration with C/C++
libraries (currently I can either get fast and large by using a Common
Lisp system, or small and slow by using Scheme); I realize that I am
quite alone with this in the Lisp community.  For my aims a batch
compiler where I can link in an interpreter to do the development work
and as a scripting language would be good enough.

  Matthias