Comments on UniOS documents

David Jeske jeske@home.chat.net
Mon, 4 Jan 1999 01:24:52 -0800


I originally sent this as a private message to Pat, but I think others
may be interested in the commentary.

----- Forwarded message from David Jeske <jeske@home.chat.net> -----

Date: Sat, 2 Jan 1999 21:58:23 -0800
From: David Jeske <jeske@home.chat.net>
To: Pat Wendorf <beholder@ican.net>
Subject: Comments on your UniOS documents
X-Mailer: Mutt 0.94.13i

I like the ideas you present in the documents you've written. They are
very similar to the kinds of ideas I've had. You can read some of my
thoughts at "http://www.chat.net/~jeske/unsolicitedDave/". Here are my
specific comments on each one:

** Visual Development Environment: I think a visual development
environment is an important part of our progress forward. There are
many systems out there which have done interesting things in visual
development. I'd break them loosely into two categories:

 1) visual organizers, which display the 'objects' of your program and
    possibly their connections in a visual way, but which display code
    in the standard text form.
 2) visual programming languages, which attempt to display programs in some
    kind of graphical form instead of textual. These usually include 
    visual organization.

** Command line GUI interface: There is an Engineering Design environment
called "Mentor Graphics" which has a similar feature. Because
unmodified keypresses had no role, you could actually just start
typing (you didn't have to hit a key to bring up the command box) and
it would open up a box and put your commands in there. It was insanely
poweful. In fact the whole environment was very powerful. I think the
idea of using 'natural language' to ask the computer to do things is
something which we will definetly see in the next generation of
operating systems.

** Code Form Applications: Traditional source code is actually (IMO) a
very poor form for distributing applications. There are many
dependencies involved in turning source code into a runnable
application, including compiler flags, header and library versions,
linker settings, makefiles, etc. I believe that distributing programs
in a form which can easily be 'target compiled' into a native binary
is a good idea; however, I don't think any source code to date fits
the bill. The general idea, though, is certainly not a pipe-dream
philosophy. Systems have already demonstrated the ability to do this;
it just needs to be refined to the point where it's more
useful. Examples are: Java VM, ANDF, TaOS,
Juice, SmalltalkVM, SelfVM, and even Javascript.
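
As a toy sketch of the 'distribute abstract, compile on the target'
idea (the instruction set and function names below are invented for
illustration; no real intermediate form works exactly like this), an
abstract program could be turned into a host-native callable once, at
install time. In Python:

  # Toy sketch: distribute a platform-neutral form, 'target compile'
  # it when it is installed.  Everything here is invented for the
  # example; a real target compiler would emit actual machine code.

  PROGRAM = [                  # the distributed, abstract form
      ("push_arg", 0),
      ("push_arg", 1),
      ("add", None),
      ("push_const", 2),
      ("mul", None),
      ("ret", None),
  ]

  def target_compile(code):
      """Turn the abstract form into a host-native callable."""
      def compiled(*args):
          stack = []
          for op, operand in code:
              if op == "push_arg":
                  stack.append(args[operand])
              elif op == "push_const":
                  stack.append(operand)
              elif op == "add":
                  b, a = stack.pop(), stack.pop()
                  stack.append(a + b)
              elif op == "mul":
                  b, a = stack.pop(), stack.pop()
                  stack.append(a * b)
              elif op == "ret":
                  return stack.pop()
          raise RuntimeError("program ended without 'ret'")
      return compiled

  f = target_compile(PROGRAM)  # done once, at install time
  print(f(3, 4))               # -> 14, i.e. (3 + 4) * 2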

** OS Installs: I think many of the points you bring up are valid for
today's systems, and we certainly should see them remedied in new
systems. However, I think the problems don't necessarily apply to a
new system which has a different architecture. For example,
orthogonally persistent systems like KeyKOS and EROS already don't
work this way, at least not exactly. Of course they have other
problems.  I personally prefer to look at this issue by saying I want
"robust, undoable, and trackable component/application
installation". That is, I should be able to install anything without
disrupting an old version of the same item, I should be able to test
the new item in a sandbox, I should be able to switch to the new item,
and I should be able to switch back. All OS 'updates' should be
components which operate in this manner. As a result of this kind of
philosophy, I think most of the problems you describe should go away.
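
To make "robust, undoable, and trackable" concrete, here is a minimal
Python sketch (the names are hypothetical, not taken from KeyKOS,
EROS, or any other system) of a registry that keeps every installed
version around so that switching is always reversible:

  # Sketch of robust/undoable/trackable component installation.
  # All names are invented for the example; a real system would also
  # sandbox the new version and snapshot its state.

  class ComponentRegistry:
      def __init__(self):
          self._versions = {}   # name -> {version: payload}
          self._active = {}     # name -> currently active version

      def install(self, name, version, payload):
          """Install a new version without disturbing the old one."""
          self._versions.setdefault(name, {})[version] = payload

      def activate(self, name, version):
          """Switch to a version; the previous one stays installed."""
          if version not in self._versions.get(name, {}):
              raise KeyError("%s %s is not installed" % (name, version))
          previous = self._active.get(name)
          self._active[name] = version
          return previous       # remember this so we can switch back

      def rollback(self, name, previous_version):
          """Undo a switch by re-activating the previous version."""
          self.activate(name, previous_version)

  reg = ComponentRegistry()
  reg.install("tcp_stack", "1.0", "old code")
  reg.activate("tcp_stack", "1.0")
  reg.install("tcp_stack", "2.0", "new code")  # 1.0 is left untouched
  prev = reg.activate("tcp_stack", "2.0")      # prev == "1.0"
  reg.rollback("tcp_stack", prev)              # 2.0 misbehaves? go back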

** Database FS: In an interface/object/implementation based system,
storage should just be creating objects. These objects can be stored
via some complex database code or simple serialization. The choice is
going to affect the speed with which they are accessed, but it should
be possible for (a) the system to dictate the storage type independent
of the applications that use the objects, and (b) the applications to
perform operations on the objects independent of the storage
type. That means, I should be able to store _all_ my data in a big
database if I decide it's important for me to be able to perform
queries on object fields, names, or values. In addition,
applications which need to perform queries on collections of objects
should be able to work, even if the objects are stored in a flat
serialization and the search has to be performed the "hard way". In
other words, as long as we do a good job of abstracting interface and
implementation, and reflecting objects, we should be able to use
database technology pretty seamlessly.
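
A small Python sketch of that separation (class and method names are
invented for the example): application code talks to one query
interface, and whether the store behind it is a flat serialization or
an indexed one only changes how much work the query has to do:

  # Sketch: the query interface is independent of the storage type.
  # Names are invented for illustration.

  import json

  class FlatStore:
      """Objects kept as one flat serialization; queries scan the
      collection 'the hard way'."""
      def __init__(self):
          self.objects = []

      def put(self, obj):
          self.objects.append(obj)
          self.blob = json.dumps(self.objects)   # simple serialization

      def query(self, field, value):
          return [o for o in self.objects if o.get(field) == value]

  class IndexedStore:
      """Same interface, but keeps an index so queries avoid a scan."""
      def __init__(self):
          self.objects = []
          self.index = {}                 # (field, value) -> [objects]

      def put(self, obj):
          self.objects.append(obj)
          for field, value in obj.items():
              self.index.setdefault((field, value), []).append(obj)

      def query(self, field, value):
          return self.index.get((field, value), [])

  def find_term_papers(store):
      """Application code: works against either implementation."""
      return store.query("kind", "term paper")

  for store in (FlatStore(), IndexedStore()):
      store.put({"name": "my term paper", "kind": "term paper"})
      store.put({"name": "my tax report", "kind": "tax report"})
      print(find_term_papers(store))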

** Program Management: I suggest you change the title to "application
management" or perhaps "software management"; at first I thought you
were talking about some kind of "project/process management", because
that's what the job title "program manager" suggests. I agree with
your premises, although I would take a step back and, instead of
arguing for a 'single file', argue for a 'single atomic item'. That
item could be a single file, or a hierarchy of files, or a database
store, or whatever the application needs to store its
information. However, I definitely agree that installing or removing an
application should be a simple matter of 'inserting' or 'removing' an
atomic item. Any dependencies on that application should be tracked in
the system. If you ask to remove an application, the system should know
what other programs are going to 'break', and after it's gone, it
should know what configuration data is potentially 'dead' and can be
removed at the user's convenience.
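
A Python sketch of the dependency-tracking side (the names are
hypothetical; the dependency data would really come from the install
records themselves):

  # Sketch: the system can answer "what breaks if I remove this?".
  # Names are invented for the example.

  class SoftwareManager:
      def __init__(self):
          self.installed = set()
          self.depends_on = {}      # app -> set of apps it requires

      def install(self, app, requires=()):
          self.installed.add(app)
          self.depends_on[app] = set(requires)

      def would_break(self, app):
          """Everything that directly or indirectly requires 'app'."""
          broken = set()
          changed = True
          while changed:
              changed = False
              for other, deps in self.depends_on.items():
                  if other == app or other in broken:
                      continue
                  if app in deps or deps & broken:
                      broken.add(other)
                      changed = True
          return broken

      def remove(self, app):
          broken = self.would_break(app)
          self.installed.discard(app)
          del self.depends_on[app]
          return broken             # e.g. to flag now-'dead' config data

  mgr = SoftwareManager()
  mgr.install("libpng")
  mgr.install("image_viewer", requires=["libpng"])
  mgr.install("photo_album", requires=["image_viewer"])
  print(mgr.would_break("libpng"))  # {'image_viewer', 'photo_album'}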

** Document Centric: I think you'll find the distinction falls away if
you consider the term 'content centric'. After all, there are several
games based on the same Quake game engine. Whether it's "Quake"
vs. "Hexen", or "my term paper" vs "my tax report", it's the content
which is important, not the software used to visualize and operate on
it.  (IMO of course)

** Games First and Foremost: I think the biggest problem with game
development for varied platforms like PCs today is the ever-present
tradeoff between 'control' and 'compatibility'. Game
programmers like to control the delivery platform as much as possible
in order to control the game experience. That's why they like
programming for consoles, because they know exactly what the client
machine is. No wondering about what kind of framerate they are going
to get, or what kind of resolution they can run. I believe that by
making it easier both to target custom configurations (i.e. make
target-specific optimizations and tuning) and to keep the same
software compatible with the largest range of hardware, a new OS
could win over the game developers. In other words, make it a better
way for them to program games for their existing customer base
(i.e. win95), and allow the games to run elsewhere as a side-effect.

** Comments on your three OS architecture suggestions: Unfortunately, I
find these three papers too general to have any meaning. I also feel
that you are talking about low-level implementation specifics of a
system which is not very well defined. (BTW, I'm not trying to flame,
I'm just pointing out that other than references to existing kernel
architectures, I don't understand what you're trying to say, or how
these three architectures are even different from each other.)

That said, the title of your 'no kernel' paper makes me think of an
OS architecture I've been thinking and talking about quite a bit
recently. If we consider the existing OS models, we might say:

macrokernel: all abstractions go in the kernel, including hardware
access code (i.e. drivers), but also including other abstractions like
TCP/IP, TTY, and process control, etc.

microkernel: all abstractions go outside the kernel, including
hardware access code, TCP/IP, TTY. Only IPC and basic process control
are microkernel functions.

exokernel: all hardware access code goes inside the kernel, all
abstractions go outside the kernel. That means hardware drivers for
devices like network cards, SCSI cards, etc., live in kernel space,
while network stacks, filesystem stacks, or other added-on
abstractions live outside the kernel. In fact, the entire body of code
which presents the semantics of a given operating system lives outside
the kernel in something called a 'library OS'.

The organization I've been considering might be described as: all code
is loaded onto the system in an abstract and safe (non-machine-binary)
form. Code lives where it should to obtain the greatest
performance. Code may even migrate as necessary. The semantics of
accessing objects should be the same whether they manifest themselves
as 'drivers', 'shlibs', 'servers', or 'applications'. When code is
written, it should not know where or how it's going to be run, but
only what it's going to do, and how it's going to be accessed.

For example, with today's hardware, code which deals with hardware
access would be compiled into a form to be used in 'ring 0' (kernel
space). This is so that it can have the fastest access to hardware
facilities like interrupts and DMA. However, the operational
semantics of accessing a 'hardware driver' would be no different from
accessing any other object. In this way, software can be transparently
'layered' above even the most basic hardware drivers, because to
client software, any object looks the same. For example, a complex
piece of software which combines a collection of disks into a RAID
array should be able to export a device which looks exactly like a raw
disk, even though it's probable that all the RAID code will live in 'user
space' while the raw disk drivers will live in 'kernel space'.
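
In miniature, and with invented class names, that layering looks
something like this in Python: client code can't tell a mirrored pair
of disks from a raw disk, because both answer the same block
interface:

  # Sketch of the RAID example: a mirror of two disks exports the
  # same interface as a raw disk.  Classes are invented for
  # illustration; real drivers obviously do far more.

  class RawDisk:
      """Stands in for a raw disk driver living in 'kernel space'."""
      def __init__(self, nblocks):
          self.blocks = [b"\x00" * 512] * nblocks

      def read_block(self, n):
          return self.blocks[n]

      def write_block(self, n, data):
          self.blocks[n] = data

  class RaidMirror:
      """'User space' code that exports the very same interface."""
      def __init__(self, disk_a, disk_b):
          self.disks = (disk_a, disk_b)

      def read_block(self, n):
          return self.disks[0].read_block(n)   # could balance reads

      def write_block(self, n, data):
          for d in self.disks:                 # keep both copies in sync
              d.write_block(n, data)

  def store_record(device, data):
      """Client code: doesn't know or care which one it was handed."""
      device.write_block(0, data)
      return device.read_block(0)

  plain  = RawDisk(16)
  mirror = RaidMirror(RawDisk(16), RawDisk(16))
  for dev in (plain, mirror):
      print(store_record(dev, b"hello"))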

It may be advantageous for some code to live in the same address
space, in which case it would be compiled into the same address
space. If it needs to live in separate address spaces (for more space,
or extra safety, or whatever), then the machine code to handle IPC/RPC
between the two objects can be generated by the system. The
operational semantics of the objects should not be different based on
where they live. 
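
A minimal Python sketch of that point (everything here is invented
for illustration; a real system would marshal arguments across a
protection boundary instead of calling straight through): the caller
sees identical semantics whether it holds the object itself or a
generated proxy for it:

  # Sketch: a generated proxy keeps the calling semantics identical
  # whether the object is local or behind an IPC/RPC boundary.

  class Clock:
      """The real object; it could live in another address space."""
      def now(self):
          return 1234567890

  def make_remote_proxy(target):
      """Generate a stand-in whose methods forward to the target."""
      class Proxy:
          def __getattr__(self, name):
              method = getattr(target, name)
              def stub(*args, **kwargs):
                  # A real stub would marshal args, perform the IPC,
                  # and unmarshal the reply; here it calls through.
                  return method(*args, **kwargs)
              return stub
      return Proxy()

  local  = Clock()
  remote = make_remote_proxy(Clock())
  print(local.now(), remote.now())   # same semantics either way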

Even more importantly, because objects are in a 'safe' form, it should
be possible to 'layer' software without the traditional performance
implications of indirection. This is because multiple objects can be
'merged' together in the raw-compiled form. Systems like Self
(self.sunlabs.com) and to a limited degree, the MIT Exokernel's
network packet filtering, have demonstrated how a great performance and
flexibility win can be gained by using strategies like this. Self
inlines several levels of a method call chain into a large,
type-specific codeblock. They cache these codeblocks in what they
call a 'polymorphic-inline-cache'.

For example, in Self, if there is a loop which iterates over a
collection of objects and calls a method (say "draw") on those
objects, it might compile a static version of that loop for the case
where all the objects in the list are of a single type. In this way,
it can make one static loop with all the instructions to draw inlined
into a single codeblock, even though they came from five or six
different levels of method calls. If an object in the list is of a
different type, it'll jump out of the static codeblock and fall out of
the 'fast-path'. However, after it does this several times with the
same object type, it will compile a new version of the block which has
a new fastpath for both the first and the second object types.
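
A toy Python version of that caching idea (greatly simplified
relative to what Self actually does: it only caches the method
lookup, not compiled type-specific code, but it shows the shape of a
per-call-site cache keyed by receiver type):

  # Toy sketch in the spirit of a polymorphic inline cache: one cache
  # per call site, keyed by the receiver's type.

  class Circle:
      def draw(self):
          return "circle"

  class Square:
      def draw(self):
          return "square"

  class CallSite:
      """One 'send site' for the message named by 'selector'."""
      def __init__(self, selector):
          self.selector = selector
          self.cache = {}             # receiver type -> looked-up method

      def send(self, receiver):
          kind = type(receiver)
          if kind not in self.cache:  # miss: do the full lookup once
              self.cache[kind] = getattr(kind, self.selector)
          return self.cache[kind](receiver)

  site = CallSite("draw")
  for shape in [Circle(), Circle(), Square(), Circle()]:
      print(site.send(shape))   # stays on the fast path per cached type
  print(len(site.cache))        # 2: the call site has become polymorphic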

I hope that was sufficient to explain some of the ideas of this
organization. I'll save the rest of the description for another time.

-- 
David Jeske (N9LCA) + http://www.chat.net/~jeske/ + jeske@chat.net

----- End forwarded message -----

-- 
David Jeske (N9LCA) + http://www.chat.net/~jeske/ + jeske@chat.net