[gclist] References on GC and separate compilation

David Chase chase@world.std.com
Thu, 17 May 2001 19:46:28 -0400


At 05:30 PM 5/17/2001 -0400, Daniel Wang wrote:
>I'm looking for references on the linker support needed to implement "pc
>maps". I'm sure people need similar tricks for exception handlers.
>In particular, if you are using a tagless GC that uses the return address
>as an index into a live roots table, how do you build a global table at link
>time from tables in separately compiled libraries or modules?

I believe you can do it with ELF.  I seem to recall that
Sun's C++ did it in a relatively ordinary way, using
two extra sections, plus some init code.

For each separately compiled module, stick the PC range information
(preferably encoded in a self-relative form) into a new section
called "PCRANGES".  Each entry looks something like

begin:  .word range_begin - .
length: .word range_end - range_begin
stuff:  .word (whatever data is needed)

The compiler ensures that these are in address order for
each module.
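
To make the entry format concrete, here is a rough C sketch of how a
runtime might decode one of these self-relative entries and look up a
PC by binary search.  The struct layout, the 32-bit word size, and all
of the names (pcrange_entry, entry_begin, pcrange_encode,
pcrange_lookup) are my assumptions for illustration, not anything Sun
shipped; pcrange_encode only stands in for what the assembler would
emit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout mirroring the three .word fields above,
   assuming 32-bit words. */
typedef struct {
    int32_t  begin_rel; /* range_begin minus the address of this field */
    uint32_t length;    /* range_end - range_begin */
    uint32_t stuff;     /* whatever data is needed */
} pcrange_entry;

/* Recover the absolute start address from the self-relative field. */
static uintptr_t entry_begin(const pcrange_entry *e) {
    return (uintptr_t)((intptr_t)(uintptr_t)&e->begin_rel + e->begin_rel);
}

/* Demonstration only: fills an entry the way the assembler would.
   Assumes the range lies within 2GB of the entry itself. */
static void pcrange_encode(pcrange_entry *e, uintptr_t begin,
                           uintptr_t end, uint32_t stuff) {
    assert(begin <= end);
    e->begin_rel = (int32_t)((intptr_t)begin - (intptr_t)(uintptr_t)&e->begin_rel);
    e->length = (uint32_t)(end - begin);
    e->stuff = stuff;
}

/* Binary search over entries [first, last) sorted by address.
   Returns the entry whose range contains pc, or NULL if none does. */
static const pcrange_entry *pcrange_lookup(const pcrange_entry *first,
                                           const pcrange_entry *last,
                                           uintptr_t pc) {
    assert(first <= last);
    size_t lo = 0, hi = (size_t)(last - first);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        const pcrange_entry *e = &first[mid];
        uintptr_t b = entry_begin(e);
        if (pc < b)
            hi = mid;               /* pc lies before this range */
        else if (pc - b >= e->length)
            lo = mid + 1;           /* pc lies after this range */
        else
            return e;               /* begin <= pc < begin + length */
    }
    return NULL;
}
```

The self-relative encoding is what lets the table be position
independent: nothing in it needs relocating at load time, and the
lookup only ever adds an entry's own address back in.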

At link time, assume standard section concatenation semantics
(no fancy block or procedure reordering).  So, given a
link of a.o, b.o, c.o, the linked text section contains
a.text, b.text, and c.text in order, and the linked PCRANGES
section contains a.pcranges, b.pcranges, c.pcranges.  That is,
the ranges are already sorted into the proper order on a
per-linked-unit basis.
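
The ordering argument can be checked with a toy model.  The C sketch
below is only a simulation under my own assumptions (the range and
module types, link_ranges, and ranges_sorted are made-up names, and
ranges are plain offsets rather than real section contents): it
concatenates per-module tables in the same order as the text and
confirms that the result is sorted without any link-time sorting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical half-open range [begin, end). */
typedef struct { uint32_t begin, end; } range;

/* Hypothetical model of one .o file: its text size plus its ranges,
   sorted, expressed as offsets into its own text. */
typedef struct {
    uint32_t     text_size;
    const range *ranges;
    size_t       nranges;
} module;

/* Lay out text and tables in command-line order, as a simple
   concatenating linker would.  Writes absolute ranges to out and
   returns how many were written. */
static size_t link_ranges(const module *mods, size_t nmods,
                          uint32_t text_base, range *out, size_t out_cap) {
    size_t n = 0;
    uint32_t base = text_base;
    for (size_t m = 0; m < nmods; m++) {
        for (size_t i = 0; i < mods[m].nranges; i++) {
            assert(n < out_cap);
            assert(mods[m].ranges[i].end <= mods[m].text_size);
            out[n].begin = base + mods[m].ranges[i].begin;
            out[n].end   = base + mods[m].ranges[i].end;
            n++;
        }
        base += mods[m].text_size;  /* next module's text follows */
    }
    return n;
}

/* True if the table is ascending and non-overlapping. */
static int ranges_sorted(const range *r, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (r[i].begin > r[i].end) return 0;
        if (i > 0 && r[i - 1].end > r[i].begin) return 0;
    }
    return 1;
}
```

If the linker reorders text but not the tables (or the other way
around), ranges_sorted would fail on the result, which is exactly the
case the "no fancy reordering" assumption rules out.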

Each linked unit is also bracketed by pcrange_begin.o
and pcrange_end.o, which are compiled from (roughly) this
assembly language:

pcrange_begin.s contains:

-------------------------
  .section PCRANGES
pcrb = .
  .section PCRANGE_BOUNDS
  .word pcrb-.
  .section init
< code to register the pcrange from its bounds>
-------------------------

pcrange_end.s contains:

-------------------------
  .section PCRANGES
pcre = .
  .section PCRANGE_BOUNDS
  .word pcre - .
-------------------------

That is, PCRANGE_BOUNDS contains

  .word beginning_of_pcranges - .
  .word end_of_pcranges - .

and the init code in pcrange_begin
can register the existence of all the
pc ranges.  It's also useful to have this
initialization code (.init) occur in the
very first .o linked in (making it the first
to run), because the other .init sections might
require exception-handling support and/or
garbage collection.
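
As a rough idea of what the registration code could look like on the
runtime side, here is a C sketch.  Everything in it is an assumption
for illustration: the pcrange_unit type, the global list, and the
function names are hypothetical, and the words are taken to be 32 bits.
It decodes the two self-relative PCRANGE_BOUNDS words and records the
unit's table on a list that a stack walker could consult later.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One registered linked unit: absolute bounds of its PCRANGES data. */
typedef struct pcrange_unit {
    uintptr_t table_begin;
    uintptr_t table_end;
    struct pcrange_unit *next;
} pcrange_unit;

/* Hypothetical global list of all units registered so far. */
static pcrange_unit *pcrange_units = NULL;

/* Turn a self-relative word back into an absolute address. */
static uintptr_t self_relative(const int32_t *word) {
    return (uintptr_t)((intptr_t)(uintptr_t)word + *word);
}

/* What the init code might call.  bounds points at the unit's two
   PCRANGE_BOUNDS words; node is caller-provided static storage, so
   nothing is allocated before the runtime is up. */
static void pcrange_register(const int32_t *bounds, pcrange_unit *node) {
    node->table_begin = self_relative(&bounds[0]);
    node->table_end   = self_relative(&bounds[1]);
    assert(node->table_begin <= node->table_end);
    node->next = pcrange_units;     /* push onto the global list */
    pcrange_units = node;
}

/* Number of linked units registered so far. */
static size_t pcrange_unit_count(void) {
    size_t n = 0;
    for (const pcrange_unit *u = pcrange_units; u != NULL; u = u->next)
        n++;
    return n;
}
```

Taking the node from the caller rather than from malloc fits the
ordering concern: the registration can run before anything else in
the process has been initialized.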

I may have described this in the exception-
handling paper I did for JCLT some years back,
and what I have just outlined is more or less
what Sun did 8 years ago in their C++
implementation.

Now, that said, I should add that we didn't
do it this way in our Java implementation,
mostly because we felt that it was important
to be able to trace the stack extremely quickly.
However, we (NaturalBridge) are working with
a fully-custom linker and runtime, and only
worry about platform conventions at the native
boundary.

David Chase