[gclist] memory protections and system calls

Paul R. Wilson wilson@cs.utexas.edu
Sat, 29 Jun 1996 12:27:48 -0500


>From: Fergus Henderson <fjh@cs.mu.OZ.AU>
>Message-Id: <199606291628.CAA01288@mundook.cs.mu.OZ.AU>
>Subject: Re: [gclist] memory protections and system calls
>
>Paul R. Wilson, you wrote:
>
>> In our view, a protected page should be treated the same way by the kernel
>> as by user code---the kernel should reflect the access violation back
>> to the user process in the form of a user-level signal, so that the
>> app can deal with it---typically by unprotecting the page and doing
>> some bookkeeping before returning.  Then the kernel should resume and
>> do whatever it's been asked to do.  (Like writing data from a buffer.)
>
>I believe the traditional Unix approach has been for the OS call to
>return an error status and set errno to EFAULT ("Bad address").  What
>you suggest would clearly be nicer from a user's perspective, but from
>an OS implementor's perspective I believe it is a lot harder to
>implement.
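
For concreteness, the user-level handling I describe in the first
quoted paragraph looks roughly like this.  It's a minimal sketch
assuming the POSIX sigaction/siginfo interface (which not everything
supports, hence the #ifdefs I mention below); PAGE_SIZE and
note_page_dirtied() are stand-ins for our actual bookkeeping:

    #include <signal.h>
    #include <sys/mman.h>

    #define PAGE_SIZE 4096   /* assumed here; really ask the OS */

    /* Stand-in for our real bookkeeping, e.g. recording a dirty page. */
    extern void note_page_dirtied(void *page);

    static void access_handler(int sig, siginfo_t *info, void *ctx)
    {
        (void)sig; (void)ctx;                    /* unused here */

        /* Round the faulting address down to its page boundary. */
        char *page = (char *)((unsigned long)info->si_addr
                              & ~(unsigned long)(PAGE_SIZE - 1));

        note_page_dirtied(page);                           /* bookkeeping */
        mprotect(page, PAGE_SIZE, PROT_READ | PROT_WRITE); /* unprotect */
        /* Returning retries the faulting access, which now succeeds. */
    }

    void install_access_handler(void)
    {
        struct sigaction sa;
        sa.sa_sigaction = access_handler;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = SA_SIGINFO;
        sigaction(SIGSEGV, &sa, 0);
    }

The point is that when the kernel touches a protected page on our
behalf, we'd like exactly this handler to run, rather than getting
EFAULT back from the call.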

I'm curious about why this is hard to implement.  I think many years
ago I knew something about this issue, but my memory is fuzzy.

I'm wondering whether a well-designed (multithreaded?) kernel would
have a problem with this.  Is it just an artifact of ancient UNIX
design, which was originally not even preemptive?  I assume that the
issue is some sort of locking problem---the system call essentially
tries to lock all of the resources it needs (like pages it needs to
access) before actually doing anything, and backs off without side-effects 
if it can't get them, rather than locking them as it goes and
reflecting problems back to the user process in mid-call.

What problems arise if the system call has to stop in the middle and invoke
the user-level handler?  Is the problem that it may be holding locks on
other resources that it already acquired?  (Conceptually, that is.  In
a classic UNIX, no explicit locking is necessary because only one process is
in the kernel at a time anyway---the whole kernel is locked.)

I guess my question is whether this is a problem that simply goes away
with a good multithreaded kernel, or whether there's a deeper problem:
you may get unpredictable deadlocks depending on what the user-level
handler does.  (Or maybe due to interactions with other processes.)

>The program can handle things by checking for EFAULT, figuring out
>which address is causing the problem, invoking the appropriate handler
>function to unprotect the memory, and then reinvoking the system call.
>It could perhaps be done transparently by the library function which
>invokes the system call.  (The whole thing is similar in some ways to
>the handling of EINTR.)  Of course I don't imagine that your average
>standard C library does this...

Right :-(.  Our problem is that our p-store and GC are used with
programs that may link against arbitrary libraries, and we'd rather not
constrain that any more than we have to.  The tracing tool bites the
bullet and uses binary wrapping (a linkage trick, sketched below) to
redirect the relevant standard C library calls through wrapper
functions, but we'd really rather not have to put that into the p-store
and GC too.  Right now our linker dependencies are minimal, and we'd
prefer to keep it that way.  We're in the market for more elegant
kludges :-).
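
For anyone who hasn't seen the trick: with GNU ld it can be done with
the --wrap option, roughly as below.  This is just an illustration (one
of several ways to splice in wrappers, and not necessarily exactly what
our tool does); write() stands in for whichever calls need redirecting:

    /* Link with:  cc app.o wrappers.o -Wl,--wrap=write
       Every reference to write() in the linked objects then resolves
       to __wrap_write(), and __real_write() names the original. */
    #include <unistd.h>

    extern ssize_t __real_write(int fd, const void *buf, size_t n);

    ssize_t __wrap_write(int fd, const void *buf, size_t n)
    {
        /* Unprotect or copy buf here as needed, then call through. */
        return __real_write(fd, buf, n);
    }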

Maybe we should have our wrappers check for EFAULT and fix things, rather
than using extra buffer copies.  I'm not sure how easy that is---I don't
know if there's a standard way to get the faulting address from an EFAULT.
(We already have a bunch of #ifdefs to do it for protection trap signals,
because sigcontexts are not very standardized.)
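
Something like the following is what I have in mind.  Since EFAULT
carries no faulting address, the unprotect_range() hook (hypothetical,
like wrapped_write() itself) would have to check the whole buffer
against our own protection bookkeeping:

    #include <errno.h>
    #include <unistd.h>

    /* Hypothetical hook: unprotect every protected page overlapping
       [buf, buf+n) and do the usual bookkeeping; returns 0 if it fixed
       something up, -1 if the fault wasn't one of our pages. */
    extern int unprotect_range(const void *buf, size_t n);

    ssize_t wrapped_write(int fd, const void *buf, size_t n)
    {
        for (;;) {
            ssize_t r = write(fd, buf, n);
            if (r >= 0)
                return r;
            if (errno == EFAULT && unprotect_range(buf, n) == 0)
                continue;  /* our pages; fixed up, reissue the call */
            if (errno == EINTR)
                continue;  /* the analogous EINTR case */
            return r;      /* a genuine error; pass it through */
        }
    }

That would avoid the extra buffer copies, at the cost of reissuing the
call (and, as above, it only works if we can tell our protected pages
from genuinely bad addresses).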


>-- 
>Fergus Henderson <fjh@cs.mu.oz.au>   |  "I have always known that the pursuit
>WWW: <http://www.cs.mu.oz.au/~fjh>   |  of excellence is a lethal habit"
>PGP: finger fjh@128.250.37.3         |     -- the last words of T. S. Garp.