[gclist] precise access barrier with hardware dirty bits

Francois-Rene Rideau fare@tunes.org
Fri, 9 Jul 1999 00:50:00 +0200


On Thu, Jul 08, 1999 at 02:12:53PM -0700, Bob Kerns wrote:
> There's another issue here that I think has been overlooked.
It hasn't been overlooked. It's been the debated cost
of the proposed technique all along.

> Virtual pages aren't free -- specifically, the OS data structures required
> to track the mapping consume memory. Often, the memory is not pagable --
> i.e. real, physical memory. (I don't know how often user-level page maps are
> pagable in current systems).
It is indeed an issue how you organize your data structures
to avoid consuming too much memory, and especially "real-time" memory.
Well, I could imagine a hierarchical system of low-level memory mappers
that implement other, higher-level memory mappers.
That's fine as long as your system is well-founded and
escapes circularity problems.
So if we observe that logical page structures have a non-negligible cost,
we can make them swappable and handle them with a lower-level physical
page handler.

> In extreme cases of small objects and large, non-pagable page map
> structures, you could not only increase your memory usage by several times,
> but make most of it non-pagable. It could be cheaper to just keep a copy of
> the page and find the dirty objects by comparing against the copy! (That
> doesn't work for read barriers!)
I believe this technique is actually used by some systems (Texas?).
All the more since a copy is often necessary (until flushed)
for checkpointing purposes anyway, and
it allows differential compression if you're using a log-based store.
I feel this technique can be used as quite a complement
to VMM-based object separation.
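
For concreteness, here is a minimal sketch of that copy-and-compare
scheme; the page size, the word-wise diff, and the report callback are
my assumptions, not any particular system's code:

    #include <stdint.h>
    #include <string.h>

    #define PAGE_BYTES 4096
    #define PAGE_WORDS (PAGE_BYTES / sizeof(uintptr_t))

    /* Snapshot a page when it is first unprotected (or checkpointed). */
    void snapshot_page(const uintptr_t *page, uintptr_t *copy) {
        memcpy(copy, page, PAGE_BYTES);
    }

    /* Later, find the dirty words by diffing against the snapshot;
       the caller maps word offsets back to objects via page metadata.
       The same diff yields the delta for differential compression. */
    void find_dirty_words(const uintptr_t *page, const uintptr_t *copy,
                          void (*report)(size_t word_index)) {
        for (size_t i = 0; i < PAGE_WORDS; i++)
            if (page[i] != copy[i])
                report(i);
    }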

> I'm also a little unclear on just what problem is being solved.
It is precise access control over memory resources,
which includes read barriers and write barriers.

> Trap handlers usually know what address is being trapped on,
Other people on the list who are more familiar than I am
with the variety of available hardware can tell whether that's always the case.
Anyway, dirty bits (on some architectures) allow for trap-less write barriers,
and they are certainly page-granular only. Actually, one of the motivations
behind my idea was to achieve arbitrarily precise (at the implementer's
control) granularity in hardware-assisted automatic memory management
without having to resort to software bitmaps (or bytemaps,
as used by Urs Hoelzle).
It looks like the technique is also useful at trap level.
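
For contrast, the software bytemap approach being avoided looks roughly
like the following sketch; the card size and table layout are my
assumptions, and the compiler would emit the mark after every pointer
store:

    #define CARD_SHIFT 9                  /* 512-byte cards, an assumption */

    extern unsigned char card_table[];    /* one byte per card of the heap */
    extern char *heap_base;

    /* A card-marking write barrier: every pointer store also dirties
       one byte, which the collector later scans. */
    static inline void write_ref(void **slot, void *value) {
        *slot = value;
        card_table[((char *)slot - heap_base) >> CARD_SHIFT] = 1;
    }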

> so they know what object was being referenced
> -- assuming you can find objects from addresses within the object.
Often, finding the referenced object means making
a lookup into page-associated tables anyway,
then identifying the subobject within the page
from page-wise data.
Having one logical object per page makes this second step trivial.
On big objects you lose nothing;
on small objects, you gain.
It means that the GC-related data won't live in object headers,
but in colocated page tables.
That may be a win when doing (mostly) non-moving GC of pointer-less data:
you needn't page in those pages of data, you only touch the metadata tables.
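
A sketch of that lookup, with field names of my own invention: with one
logical object per page, an interior pointer resolves to its object's
metadata with one shift and one table index, never touching the data
page itself.

    #include <stddef.h>
    #include <stdint.h>

    #define PAGE_SHIFT 12                 /* 4 KiB pages, an assumption */

    typedef struct page_info {            /* hypothetical metadata record */
        void   *object_base;              /* start of the object on this page */
        size_t  object_size;
        unsigned char mark;               /* GC colour, dirty flag, ... */
    } page_info;

    extern page_info page_table[];        /* indexed by virtual page number */

    static inline page_info *info_for(const void *interior_ptr) {
        return &page_table[(uintptr_t)interior_ptr >> PAGE_SHIFT];
    }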

Actually, I could even imagine schemes where you map the same page of
"dummy" zero (or non-zero) data, which stays in the physical cache,
and use the faulting/dirtying mechanisms purely for their side effects!
So an object would have two addresses: a physical address for the data,
and a "logical" address for the faulting!
Incidentally, as a MISC advocate, I think that most
of current processors' bloat would be better avoided
by providing much simplified (and hence faster) architectures
where such features are exposed to the system software,
instead of being composed in hardware, giving a slow result
with the illusion of security and efficiency.
If there is any positive effect to the hype-oriented Java technology,
it will be to make it easier to explain that efficiency, security, etc.,
are to be obtained from software compilation technology,
not from hardware bloat.

> You could have your bitmap be finer-grained than your page protection.
Yup.

> The only thing you can't do is turn off the protection on a
> finer grain.
Why not? Assuming your program is correct and doesn't try
to access objects beyond their bounds (which would obviously be a
programming error), you get all the fine-grained protection you need
by having one page mapping per object.
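
In that regime, dropping the barrier for exactly one object is a
single-page mprotect on its logical mapping (hypothetical names below);
its physical-page neighbours keep their own, still-protected mappings:

    #include <sys/mman.h>

    /* Unprotect just this object: other objects sharing the same
       physical page retain their own protected logical mappings. */
    void unprotect_object(void *logical_page, size_t pagesize) {
        mprotect(logical_page, pagesize, PROT_READ | PROT_WRITE);
    }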

> It looks to me like the only thing you're gaining is saving on
> unnecessary traps.
Indeed, once an object has been faulted in,
you needn't maintain trap-based protection on it just for the sake of the
surrounding objects, yet neither need you fault the surrounding objects in
together with it (faulting in whole neighbourhoods is quite acceptable when
using paging for a persistent store, but not so when using it for a GC
barrier).

> That could be helpful sometimes, but it sounded to me
> more like you're talking about granularity of detection, not granularity of
> removing protection.
I was talking about granularity of access control,
which includes detection, protection, etc.

> On the other hand, if you can't find objects given the addresses within
> them, addressing that issue directly is cheaper (memory-wise) than using
> page maps.
Uh? If you have one page mapping per object, it becomes easier to find
the object info through an associated software page table
than by linearly scanning on-page headers or the like.
(I don't know what techniques are usually used to identify the "head"
of an object pointed to by an "infix" pointer; all efficient techniques
I can imagine would require a lookup into a BIBOP table anyway, unless
you require _all_ pages to contain aligned page-wise headers to identify
subobjects, which precludes sexy whole-page buffers.)

Best regards,

[ "Faré" | VN: Уng-Vû Bân | Join the TUNES project!   http://www.tunes.org/  ]
[ FR: François-René Rideau | TUNES is a Useful, Nevertheless Expedient System ]
[ Reflection&Cybernethics  | Project for  a Free Reflective  Computing System ]
Tradition is the matter of which civilization is made.
Anyone who rejects tradition per se should be left naked on a desert island.