[gclist] gc interface
Geert Bosch
geert@sun3.iaf.nl
Tue, 02 Apr 96 11:29:09 GMT
Bob Duff wrote:
`` Ideally, one would have a well-defined interface
between gc, mutator, and compiler. And one could
plug in different gc algorithms, different allocators,
etc. And everything would be plug compatible. Sounds
impossible, since gc is inherently a global issue. ''
Francois-Rene Rideau wrote:
`` It *is* possible, but not given current
state of language technology. ''
Both of these statements might be a little too strong. With
just a few limitations it is possible to make a simple framework
in which arbitrary garbage-collecting allocators can be
integrated. For the allocator that I wrote for Ada-95 I used
the following structure:
___________________
/ \
System / Global \
Specific \ Pool /
\___________________/
/ | \
- - - - - - - - - -/- - - - | - - - - - -\- - - - -
/ | \
________ ___________ _____________
User /Ref.Cnt.\ / 8-byte obj\ / Pointer-free\
Defined | Pool | | Pool | | object pool |
\________/ \___________/ \_____________/
\ | /
\ | /
\ | /
_\_________|_________/__
| |
| A p p l i c a t i o n |
|________________________|
1. Global pool
- There is one global pool of memory.
- This global pool only allows allocation of integer multiples
of the VM page size (eg. 4 kB), aligned on a page-boundary
- All memory allocated from this pool has a tag denoting it's type,
and a few extra bits that contain type-dependent information
- Because of this limitation the global-pool can be really simple
and straigforward.
- By default the global pool conservatively scans all allocated
memory for pointers and marks all destination objects
2. User-defined allocators
- use the global pool for allocating the memory which the user-defined
allocator will manage, either explicitly or implicitly using `new'
- The allocator defines it's own Initialize_GC, Finalize_GC,
Scan, Mark and Sweep methods.
- Set the type-tag in the global pool, so the user-defined allocator
will be responsible for its own memory.
When a user-defined reference-counting storage-pool is forced to
scan itself, it only has to scan for external references. If these
are known to be absent (because of type-information, for example)
the Scan can just be an empty method, or it can scan for cyclic
garbage.
When references from other `un-cooperative' storage-pools are
allowed, call's to Mark (between Initialize_GC and Finalize_GC)
can be used to update an `external-refence-count' field.
The downside of this inter-operability of different storage-pools
is that calls to 'Mark' become expensive dispatching calls. For
references within the same (type of) pool, the pool's Scan method
could use short-cuts of course.
In practise I think these effects can be optimized away for most
part. If all storage pools for small objects share the same code
(with just another Object_Size at initalization) and 95% of all
pointers point from a small object pool to a small object pool,
only the remaining 5% would have to use a dispatching call. For
the large part of pointers one highly optimized loop should be
used.
If these assumptions are true in practise (I think/hope they are),
performance should not be really bad.
3. Changes to compiler
These can be minimal. In principle, only the run-time library
has to be changed. Replacing 'malloc' and 'free' by functions
that displatch to the various registered pool-types based on
requested object size, would be enough: the allocated storage
will be managed by the global pool and by default the storage
will end up in a worst-case (conservative) pool.
One step more advanced is to have the compiler directly make
a call to the appropriate storage-pool based on the object type.
Note that Ada-95 already has a notion of user-defined storage-pools,
so that the user has control over which pool to use for which type
of objects.
Greetings,
Geert