[gclist] tail-calls in C/C++
David Chase
chase@centerline.com
Tue, 06 Aug 96 11:10:29 -0400
> From: Greg Morrisett <jgm@CS.Cornell.EDU>
> Subject: [gclist] tail-calls in C/C++
> Why hasn't someone hacked gcc to do full tail-call elimination?
> With the Boehm-Weiser collector, this is really the only thing
> that's keeping me from generating very good C code from ML and
> Scheme.
> For the Sparc, I resorted to writing a simple assembly routine,
> tail_call(), which takes a function pointer and n arguments,
> pops the current stack frame by doing a restore on the register
> window, and then jumping to the function, shuffling the arguments
> appropriately, and throwing away the return address. The register
> window restore takes care of re-establishing the sp, fp, and
> return address, making this quite simple. Whenever I have a
> tail-call to an unknown function, I simply call this routine.
>
> The performance of the resulting code is quite good (on par or better
> than some well-known native-code compilers), and the collector sped up
> quite a bit. (Side note -- this is the first time I've really found
> register windows to be useful :-))
> What I'm wonder is (a) is gcc amenable to such a change and
> (b) if so, has someone already gone to the trouble to hack this in?
> It seems like this "simple" change could have a fairly significant
> impact on the use of C as a target language, now that conservative
> collection technology is working so well.
I'm not certain that it is always possible to do the tail-call
transformation. Someone else provided an example of how caller-pops
can mess things up. A variant of this occurs on all RISC machines
that I know of when more parameters are passed (in the tail-call)
than will fit in registers. The additional parameters must be stored
in memory, typically in the caller's stack frame. The only place
that the tail-caller can use for allocation is its own stack frame,
and not its callers. (F calls G tail-calls H, G cannot allocate
memory in F's frame, so G must not discard G's frame before calling
H).
Even when it is possible, certain things that you can legally write
in C *should* inhibit tail call elimination. For instance, the
following tail call cannot be eliminated.
foo() {
int x;
bar(&x);
baz();
}
Here's why this cannot be eliminated:
int * gpx;
void bar(int * px) {gpx = px;}
void baz() { printf("*gpx = %d\n", *gpx);}
Another complication arises in C++, if you are using exception handling
and have any objects in the local frame that need destructors run
upon frame exit (or throwing of exception). In such a subroutine,
there are no tail-calls (without extremely thorough analysis of the
content of the destructors), even though some calls may syntactically
appear to be tail-calls.
speaking for myself,
David Chase