OS design...

John Morrison jm@mak.com
Tue, 27 Oct 1998 08:21:15 -0500


Hi Ray,

I hope you all can read this... I think Netscape is HTML-izing this
mail.  My profuse apologies if you cannot...

I like the "modular" approach and I think it could be extended
very effectively to runtime as well as buildtime, and cover
*EVERY* subsystem of the kernel except managing processes,
memory, and sockets.

Yup.  I'm with you...  I'll return to this point below...

> I'm picturing each and every hardware driver and operating
> system service as a standalone process -- not a library to link
> against.  User processes would communicate with them using
> sockets.  Higher-level services, such as window management,
> could be standalone processes that communicate with lower-level
> services such as screen management over sockets, etc.  This
> could keep user code small and fairly simple, and allow the
> kernel to drop code and processes related to any services that
> aren't being used, at any time, which would keep the memory
> footprint small.

How about a thread instead of a process (the distinction being no
separate per-process address spaces, as you recommend below)?  How
about shared memory instead of sockets (the distinction being no
copying of data -- again, I think this is what you recommend below)?

> Suddenly, the kernel is not married to *ANY* assumptions about
> the hardware, other than the availability of addressable memory.
> If someone wants to make an embedded system with no screen or
> keyboard, then those programs ("persistent drivers?") just
> don't get loaded.

Absolutely.  Under most monolithic kernels, both device-driver entry
points and processes are "processor contexts" that get "scheduled"
periodically, but by entirely different mechanisms (if you've ever had
the misfortune to write a UNIX device driver, you'll know what I mean --
routines that are named the same thing as the user libraries, but do
ENTIRELY different things).  If you have a single-address-space machine,
you can reduce the device-driver case to the already-solved thread
case.  Just make device drivers normal threads, one per device.  Handle
everything the same way.
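
In code, a driver ends up looking like any other thread -- something
like the sketch below (the names are made up for illustration; in
particular, there's no real InterruptGate class yet):

    // Hypothetical hook the runtime signals when the device's IRQ fires.
    interface InterruptGate {
        void awaitInterrupt();
    }

    // A device driver is just an ordinary thread, one per device; it gets
    // scheduled by the same mechanism as every other thread.
    class SerialPortDriver extends Thread {
        private InterruptGate gate;     // wired to one IRQ line

        SerialPortDriver(InterruptGate gate) {
            this.gate = gate;
        }

        public void run() {
            while (true) {
                gate.awaitInterrupt();  // block like any other thread
                serviceDevice();        // then go poke the device registers
            }
        }

        private void serviceDevice() { /* read/write the hardware */ }
    }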

Furthermore, in the case of loadable kernel/driver modules (e.g., Linux,
Solaris, etc.), the loadability of those modules and the loadability of
"vanilla" programs are handled by entirely different means.  Again,
reduce them to the same thing in a SASOS (single-address-space OS).
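
And "loading a driver" then collapses into plain old class loading plus
starting a thread -- no insmod-style mechanism.  Something like (again,
the class names are purely illustrative):

    public class DriverLoader {
        // "Loading a driver" is just loading a class and starting a
        // thread; no separate module-loading mechanism is needed.  (This
        // assumes the driver class can wire itself up from a no-arg
        // constructor.)
        public static void main(String[] args) throws Exception {
            Class driverClass = Class.forName("drivers.SerialPortDriver");
            Thread driver = (Thread) driverClass.newInstance();
            driver.start();
        }
    }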

This is exactly what a few of us over at www.jos.org are trying to do
with the particular Java OS variant we are working on.  The "native code
runtime" just sets up the hardware (no drivers, just puts the x86 into
flat, 32-bit protected mode, and it will eventually enable paging) and
contains the JVM and primordial loader.  (Like I said before, I *really*
want a LispM, but figure to get there via Java, which I don't like
NEARLY as much.  I'm pretty sure the other co-conspirators have no idea
that LispMs ever existed.  More's the pity.)

The drivers would be written in Java (again, unfortunately, not Lisp).
As you indicated in your first paragraph above, the drivers are not
"loaded" at build time (i.e., they are not linked into a monolithic
kernel), but are loaded dynamically.  Unfortunately, the unavoidable
constraints of booting without the benefit of a "host" OS complicate
matters somewhat: the plan is to load the drivers in a "staged" manner,
since the primordial loader needs the "right" drivers in order to find
the rest of the drivers/OS.  For example, if you're netbooting, the
primordial loader has to be able to find the right ethernet driver.
Similarly for boot-disk drivers and FS implementations when booting off
a local disk.
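
Roughly, the staging looks like this (all of these names and paths are
hypothetical; it's just meant to show the shape of the thing):

    // Sketch of the staged boot.  Stage 0 drivers are carried by (or
    // handed to) the primordial loader itself; once they're running we
    // can reach the boot medium and load everything else the ordinary way.
    public class PrimordialLoader {
        public static void main(String[] args) throws Exception {
            // Stage 0: e.g. the ethernet driver when netbooting, or the
            // disk + FS drivers when booting off a local disk.
            loadDriver("drivers.BootEthernetDriver").start();

            // Stage 1: now that the boot medium is reachable, pull in the
            // rest of the drivers/OS by name.
            String[] rest = readDriverList("/boot/drivers.list");
            for (int i = 0; i < rest.length; i++) {
                loadDriver(rest[i]).start();
            }
        }

        static Thread loadDriver(String name) throws Exception {
            return (Thread) Class.forName(name).newInstance();
        }

        static String[] readDriverList(String path) {
            // Goes through whatever FS/network driver stage 0 gave us;
            // stubbed out here.
            return new String[0];
        }
    }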

> And, very importantly from our point of view, it would raise the
> API above the level of the calling stack and raw pointers into
> code, making it so it really *didn't* matter what language you
> programmed in.  Also, such a simple kernel ought to be quite
> easy to implement or port.

Again, yes -- in my case, Java (instead of Lisp, unfortunately).  It's
only about 5,000 lines as of this writing, and roughly 1,500 of those
are boilerplate interrupt-handler entry points, written in C++ (static
member functions) plus a very little assembler, one set for each of the
256 interrupts.

> This would require several things:
>
> 1) *VERY* efficient socket implementation -- as shared memory
>    space, with copying cut as far as possible.  Forwarding a
>    socket's input to another socket's output should be just a
>    matter of copying two pointers and notifying the kernel that
>    the buffer has been allocated to a different process.  Note,
>    different methods would be needed for inter-machine
>    communications.

Exactly the plan, but with the simplification that the buffer is an
ordinary, vanilla Java object (again, unfortunately, instead of a Lisp
object).
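
So "sending" on a local socket is just handing over a reference.  A toy
sketch of what such a channel might look like (nothing like this exists
yet; it's purely illustrative):

    import java.util.Vector;

    // Toy in-memory "socket" between two threads in a single address
    // space: send() hands over the object reference, so nothing gets
    // copied.  (Inter-machine sockets would still have to serialize.)
    public class LocalChannel {
        private Vector queue = new Vector();

        public synchronized void send(Object buffer) {
            queue.addElement(buffer);   // effectively a pointer copy
            notify();                   // wake a waiting receiver
        }

        public synchronized Object receive() throws InterruptedException {
            while (queue.isEmpty()) {
                wait();                 // block until something arrives
            }
            Object buffer = queue.elementAt(0);
            queue.removeElementAt(0);
            return buffer;              // the receiver now "owns" the object
        }
    }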

> 2) a standard "language" for communicating dynamically typed
>    objects via sockets, as well as raw bytes.

Actually, we would simplify this to using Java's dynamic strong typing
(i.e., let the language handle it).
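
In other words, the receiving end just takes Object references and asks
the language what it got -- something like this, reusing the toy
LocalChannel sketched above:

    // The "typing protocol" is just Java's own dynamic strong typing:
    // the receiver gets an Object and checks what it actually is.
    public class TypedReceiver {
        void serviceOne(LocalChannel channel) throws InterruptedException {
            Object msg = channel.receive();
            if (msg instanceof String) {
                System.out.println("text: " + msg);
            } else if (msg instanceof byte[]) {
                System.out.println("raw bytes: " + ((byte[]) msg).length);
            } else {
                System.out.println("some other type: "
                                   + msg.getClass().getName());
            }
        }
    }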

> 3) a very efficient multitasking implementation -- probably
>    with all processes sharing a single address space and using
>    credential-based resource protection schemes (which I don't
>    really know enough about yet to say whether that's feasible
>    or effective).

Utterly the Plan.  Although, for the first couple of revisions, there
really won't be any security to speak of.
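
When we do get around to security, I imagine it looking vaguely like
this -- entirely hypothetical, nothing like it is in the code today:

    // Hypothetical credential-based protection in a single address
    // space: every request carries a credential, and the resource
    // checks it before doing anything.
    public class Credential {
        private String principal;
        public Credential(String principal) { this.principal = principal; }
        public String getPrincipal() { return principal; }
    }

    class ProtectedResource {
        public void write(Credential who, Object data) {
            if (!allowed(who)) {
                throw new SecurityException(who.getPrincipal()
                                            + " may not write here");
            }
            // ... do the actual write ...
        }

        private boolean allowed(Credential who) {
            // Real policy is TBD; for the first revisions, everyone may.
            return true;
        }
    }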

As for status, there's been little further progress on the native code
part (it's done enough to boot the PC).  Right now, the JVM is about 50%
complete (in terms of number of bytecodes implemented), and it's being
integrated (by another co-conspirator) with the native code runtime as I
write this.

> Other ideas?  Comments?  Rebuttals?

There's actually another semi-related way to look at this -- the
Stanford Cache Kernel (see
    http://www-dsg.stanford.edu/papers/cachekernel/main.html).

They have restricted the resource-management functionality a
"cache-kernel" must implement, simplifying it considerably and enabling
one to implement one's own virtual memory policy on top of the
cache-kernel.  Say you want to write cooperating GC and virtual memory
subsystems -- for, say, a Lisp-based OS...  Hmmm....

-jm

