[gclist] memory protections and system calls
Paul R. Wilson
wilson@cs.utexas.edu
Fri, 28 Jun 1996 15:04:41 -0500
We've come across an obnoxious interaction between memory protection
tricks (for persistence, GC barriers, etc.) and system calls. I'm
wondering if anybody else has encountered this, and what they do
about it. (For our current purposes, it turns out not to be a problem,
but I anticipate one when we add reachability-based persistence and GC
to our persistent store, or when we use a VM write barrier for our
incremental generational GC.)
The problem is this: when you access-protect memory and pass a pointer
to something in that protected memory to a system call, most OSes will
choke---they'll signal an unrecoverable error, or silently corrupt
data, or simply die. (We've brought SunOS down several times this way.)
This can come up when you try to do I/O into or out of a protected page.
We view this as a bug in the OS, but we still have to deal with it. In
our view, a protected page should be treated the same way by the kernel
as by user code---the kernel should reflect the access violation back
to the user process in the form of a user-level signal, so that the
app can deal with it---typically by unprotecting the page and doing
some bookkeeping before returning. Then the kernel should resume and
do whatever it's been asked to do. (Like writing data from a buffer.)
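
A minimal sketch of the user-level half of this scheme, assuming a POSIX
system with mmap(), mprotect(), and sigaction(). The names on_fault,
install_fault_handler, alloc_protected_page, and fault_count are
illustrative, and the counter merely stands in for whatever bookkeeping
a real persistence or GC layer would do. It covers only faults taken in
user mode; it does not make the kernel reflect its own faults back.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static size_t page_size;                       /* cached at install time */
static volatile sig_atomic_t fault_count = 0;  /* stand-in for bookkeeping */

/* Unprotect the faulting page, note the fault, and return so that the
   faulting instruction is retried. */
static void on_fault(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    uintptr_t page = (uintptr_t)info->si_addr & ~((uintptr_t)page_size - 1);
    fault_count++;
    /* Assumption: mprotect() is not on POSIX's async-signal-safe list,
       although calling it from a fault handler is common practice. */
    if (mprotect((void *)page, page_size, PROT_READ | PROT_WRITE) != 0)
        _exit(1);  /* a genuinely bad pointer, not one of our pages */
}

/* Install the handler for SIGSEGV and SIGBUS; returns 0 on success. */
static int install_fault_handler(void)
{
    long ps = sysconf(_SC_PAGESIZE);
    assert(ps > 0);
    page_size = (size_t)ps;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;
    return sigaction(SIGBUS, &sa, NULL);  /* some systems raise SIGBUS */
}

/* Map one anonymous page with all access disabled.
   Call install_fault_handler() first so page_size is set. */
static char *alloc_protected_page(void)
{
    void *p = mmap(NULL, page_size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : (char *)p;
}
```

After install_fault_handler(), a plain user-mode store into a page from
alloc_protected_page() traps once, the handler unprotects the page, and
the store then completes normally.
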
This is usually not a problem if people use only high-level I/O routines.
Those routines copy the data into a user-level buffer of their own before
making the actual system call, and pass the system call a pointer into
that intermediate buffer rather than the original source buffer. As long
as the intermediate buffer is in unprotected memory, you're OK---since
the copy from the original source location into the intermediate buffer
is done in user-mode code, before the actual system call, the traps occur
in user mode and can be handled correctly.
Problems arise if you do low-level I/O straight into or out of a protected
page. Then the system call gets a pointer to the protected memory, and
is very unhappy about it.
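
A small probe of this case, assuming a POSIX system with mmap() and
pipe(); the function name write_from_protected_page is illustrative.
It hands write() a pointer into a PROT_NONE page and reports what comes
back. On current Linux kernels the call fails with EFAULT instead of
raising a signal, so a user-level fault handler never gets a chance to
unprotect the page; other systems may behave differently.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns the errno set by write(2) when its source buffer lies in a
   PROT_NONE page, 0 if the write unexpectedly succeeded, or -1 if the
   setup (pipe or mmap) failed. */
static int write_from_protected_page(void)
{
    int fds[2];
    long ps = sysconf(_SC_PAGESIZE);
    assert(ps > 0);
    if (pipe(fds) != 0)
        return -1;

    void *p = mmap(NULL, (size_t)ps, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    errno = 0;
    ssize_t n = write(fds[1], p, 16);  /* kernel must read the page */
    int err = (n < 0) ? errno : 0;

    munmap(p, (size_t)ps);
    close(fds[0]);
    close(fds[1]);
    return err;
}
```
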
This can happen even to high-level I/O routines if the buffers they
allocate (typically using malloc()) may themselves get access-protected.
This may happen, for example, if you checkpoint the normal malloc heap
using write protections, or I guess if you use a pagewise write barrier
with a conservative collector that traces the normal malloc heap.
For a virtual memory tracing tool we're building (which access-protects
all of your data and records the pages that get touched), this is a
problem that we've had to address. We wrap the relevant system calls
with code that does an extra copy in user mode, through an unprotected
user-level buffer. This is conceptually simple, but slightly
system-dependent and tedious to implement.
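
A sketch of what such wrappers might look like for read() and write(),
assuming POSIX. The names safe_write, safe_read, bounce, and BOUNCE_SIZE
are illustrative and not taken from the tool described here; the sketch
also assumes the static bounce buffer is never access-protected and
that only one thread uses it at a time.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>

enum { BOUNCE_SIZE = 4096 };

/* Assumption: this static storage is never access-protected. */
static char bounce[BOUNCE_SIZE];

/* Like write(2), but the kernel only ever sees the bounce buffer.
   Returns the number of bytes written, or -1 if nothing was written. */
static ssize_t safe_write(int fd, const void *src, size_t len)
{
    const char *p = src;
    size_t done = 0;
    while (done < len) {
        size_t chunk = len - done;
        if (chunk > BOUNCE_SIZE)
            chunk = BOUNCE_SIZE;
        /* User-mode copy: any protection trap on src is taken here,
           where the user-level handler can deal with it. */
        memcpy(bounce, p + done, chunk);
        ssize_t n = write(fd, bounce, chunk);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return done > 0 ? (ssize_t)done : -1;
        }
        if (n == 0)
            break;
        done += (size_t)n;
    }
    return (ssize_t)done;
}

/* Like read(2), limited to BOUNCE_SIZE bytes per call: the kernel
   fills the bounce buffer, then user-mode code copies into dst. */
static ssize_t safe_read(int fd, void *dst, size_t len)
{
    size_t chunk = len < BOUNCE_SIZE ? len : BOUNCE_SIZE;
    ssize_t n;
    do {
        n = read(fd, bounce, chunk);
    } while (n < 0 && errno == EINTR);
    if (n > 0)
        memcpy(dst, bounce, (size_t)n);  /* traps on dst occur here */
    return n;
}
```

Every other system call that takes a user pointer needs the same
treatment, which is where the tedium comes from.
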
I'm wondering if anybody else has noticed and addressed this issue.
(Hans, maybe?) If there's an elegant way around this, we'd be very
interested in hearing about it.