Processor Independent Operating System (PIOS) Information


Wed, 19 Oct 94 10:55:53 MET


Hi, Mike !
Let me compare your PIOS with my former MOOSE project.


> Processor Independent Operating System (PIOS)
Why not PIOS? But then Unix is also a PIOS.
Well, the name is not that important.


##############################################################################
> 1.  How To Participate
> ======================

> None of the ideas set forth are cast in stone. I 
> am versatile, and will do what I can to make sure this idea gets off the 
> ground.  
That's good. But remember that any project deeply *needs* a referee, and won't
go faster than the referee. So once the discussion has been had, a fast
decision must come (even if it is to be amended later, should *new* elements
come up). The MOOSE project died because it had neither a referee nor a
voting assembly, just raw discussion.

Please read my former MOOSE organization file to see how I proposed it should
have worked (I tried to wake the project up, but everyone had left).



> If you have decided to join, please read the points for discussion.
> Then e-mail me your positions on those points and comments.  Please also
> include a brief description of yourself, which I will compile into
> a contributor profile to be distributed to all.  This description should 
> include education, relevant projects, papers, etc, and small section about 
> you personally.


> Also include the level of participation you would desire and which aspect 
> of the project you interested in (an estimation of the number of hours
> each week you would like to contribute would be helpful).

  I'm interested in all parts of the project, except the low-level
device-specific driver stuff (say, writing a SCSI interface for such-and-such
board -- yuck).
  I also hate *unix* programming, so if there are low-level things to code, I
prefer writing a direct OS (through the PC BIOS to begin with) rather than
coding them over *unix*. Not that I deny the interest of writing the stuff
over *unix*.
  Let's say a regular 4 hours a week, plus more when needed. We'll see if
that matches reality.



> Last of all, (I hate to have to do this), please list three references,
> with one being a professional or academic staff member.

  What do you mean by that?
  Studies I have completed?
  Jobs I have held so far?
  Friends or teachers I may have?
  Someone who can say "yes, Mr. Rideau is officially authorized to join your
project"?

I've finished my master's in C.S. (even though I'm still struggling to obtain
the diploma). The thesis was about translating logical expressions from one
language (B, a kind of Z cousin -- based on explicit substitutions and jokers)
to another (Coq -- based on lambda calculus).

No job so far (well, giving elementary CS courses to students, or examining
them in math). I'm also paid for my studies as a student of the Ecole
Normale Superieure, which counts as a job.

Net friends of any CS proficiency? Say, members of the FORMEL project,
FORMEL@inria.fr.

Teacher or executive? Mr. Cousot (cousot@ens.fr) is the boss as far as my
studies are concerned. Mr. Beigbeder (beig@ens.fr) is the system engineer
at the ENS.

Someone who will empower me to join you? No one but myself, I fear. If you
do earn a reasonable amount of money, I'll need the school's board to agree;
otherwise, I need nothing.


#############################################################################
> 2.  Battle Cry... How to Win
> ============================

  Implementing the OS in a portable fashion on top of foreign OSes is exactly
what should be done. But don't expect the version that runs on top of OS X to
run faster than raw OS X itself!

> [Also writing a killer app to demonstrate what the system can do that others
> can't]
  The UDD (Universal Distributed Database) will arise one day!

> [Also implement the OS directly over common hardware]
  (say, PCs or PPCs or whatever)
  That's good: demonstrate that we offer not only power, but speed
at low cost. But convincing the Fortune 1000 is harder than that: you
must provide something that I can't give by myself -- maintenance over the
years. You need some large organization for that.

> [Using universities to launch the project]
That's an idea. But there is already a bunch of university-specific systems
running across the world. How shall we compete with *Amoeba*,
*Grasshopper*, or *STAPLE* (and perhaps *SELF* or *BETA*)?


> Exploit that size of your microkernel, most computers out there are hidden
> in microwaves and cellular telephones.  Compete for the position to connect
> all of them together.  NT and beepers do not mix.  Despite pagers abundance,
> they are ignored.  And here come the set-top boxes and ISDN interfaces...
???

> [have a ultra-portabble OS as a glue to computer]
Yep.


> Battle from on top of their OS, not from below!  You can win with 
> technological merit and marketing genius, you do not need marketing bucks or
> huge R&D budgets.

That's all OK by me!



> 3.  Mission Statement
> =====================
> We are here to design a microkernel based operating system to serve
> as the basis for the next generation of higher-level operating systems 
> and consumer electronics.
I'd prefer a no-kernel OS. Why do we need a kernel at all? Let's have a
decentralized system. Of course we locally have conventions, but locally
means any convention can be replaced by a better one some day, while the OS
parts independent of the convention remain valid. This means programming all
that in a *generic* language.


>  The OS is to be able to optimize the use of
> computing resources by re-distributing the computional burder of
> applications,
> during execution, over a dynamic, heterogenous, wide area network.
[isn't that burden ?]
Yep. Though the farther away a node, the less interest there is in using it
for a small job, and the more interest there is in buffering objects imported
from it.


> It should execute on the largest range of platforms possible, from household
> appliances to supercomputers.  The OS will be the common denominator of all
> systems.
Yep. I'd love to use all those forgotten 80's computers sitting in their
holes and have my OS migrate tasks to and from them through serial/parallel
lines. Computers never become obsolete, just slow and tight!


> No backward compatibility will be designed in, so as to allow the 
> highest degree of design freedom.
But local implementations *can* provide emulation if someone is
masochistic enough to write it. They eventually *will* if the project
becomes big enough.


##############################################################################
> 4.  Design Goals
> =====================


> Process Migration in a Dynamic Heterogenous Environment
> =======================================================
> A little secret for now, but it can be done.
That's no problem. What's more problematic is a heuristic to determine the
cost of migration. We also need a secure system-wide (which may mean
world-wide) object identification protocol.



> Application-Centric Paradigm
> ============================
> All commercial OS's are based on the machine-centric paradigm.  In this,
I'd say a low-level resource-centric paradigm.

> an application is based on one computer and does communications with other
> applications on the same machine or others.  An application is grounded
> to the machine it was initially executed on.
An application must rebuild everything but low-level I/O from scratch.
This is just unbearable. Persistent storage and the human interface also have
to be rebuilt every time, which is 95% of current application programming,
whereas all that stuff should go into generic OS modules.

> Due to this prespective 
> software engineers have had a great deal of latitude in typecasting
> their applications and OS's to one platform, thus reducing their portability.
Because of this low-level-ness, programmers must manage raw binary files
instead of well-typed objects. Hence they *must* use typecasting. All this is
*very* unsafe, and implies a *slow* OS with run-time checking everywhere,
without ensuring security for all that.


> An application-centric OS allows applications, during execution, to be 
> migrated to different machines.  Applications engineers can no longer 
> assume certain static attributes of their platforms.  The environment becomes
> a dynamic one.  The job of the OS is to allow freedom of the application.
A high-level OS will hide all the low-level concerns: the persistent storage
that contains the data, the machine chosen to run the code, the low-level
encoding of objects, the object implementation, the human interface code
(which will be semi-automatically generated).




> Logical Configuration
> =====================
> The largest object is the tool box.  A tool box is composed of an array of
> arbitrarily sized data and code segments and a number of agents.  Code 
> segments are viewed as tools.  Data segments are viewed as stacks, however
> data within the stacks can also be accessed directly.  Agents are the 
> execution primitives.  Agents execute tools, and have access to the global 
> tool box stacks, plus their own private array of local stacks.  Agents can 
> move from tool box to tool box carrying their local stacks.
This seems a bit complicated and low-level. Can't we use unifying semantics
a la SELF (everything is message passing), BETA (everything is a pattern),
or STAPLE (everything is a function)?
Let's only have arbitrarily typed objects, with a global GC, as a basis.



> Physical Configuration
> ======================
> Physically, a tool box resides within one working space, and is usually
> serviced by one CPU.  Normally a working space contains several tool boxes.  
When emulating Unix/DOS or running under Unix, each working process is itself
an active working space, so there is no need for multiple CPUs to test our
multi-* algorithms.


> Tool Box Migration
> ===============
> At the discretion of the OS a tool box may be migrated to another working 
> space.  At such time all constituent parts of the tool box (tools and global 
> stacks, resident agents) and bundled up and moved to the new working space.  
> At the new working space the tool box is unbundled and all tools are either
> re-compiled, or are deemed better interpreted for overall system performance.
> The intermediate code is retained in case another move is warranted.
Moving big objects is not always beneficial. Allowing small objects to
migrate seems a better policy to me: when using the "archie" equivalent, the
search process is migrated, but not the human interface.


> Resource Management
> ===================
> Tool boxes can be viewed as resources.  Each tool box is named and all 
> inter-tool box communications are vectored according to that name.  Names 
> may be shared by tool boxes, in the case of libraries for which there may be 
> instances in several working spaces.  All services provided by tool boxes 
> must, by default, be stateless.
Ok. But again, I prefer having objects as light-weight as possible, to reduce
object manipulation overhead.


> Inter and Intra Tool Box Communications
> =======================================
> Agents carry all data in their local stacks.  Typically, parameters will 
> be pushed onto a stack before a call, and popped off during the call.  
> The actual parameter passing format is up to the application.

That's fine. But why force the use of stacks? Some languages have no stack
(see ML or LISP implementations), and just use a heap. Let's not specify
internal behavior. The only requirement is: objects must be able to migrate
one way or another. Being able to migrate is the same as being able to be
saved/restored to/from a file (the file just being transmitted over the net
in case of migration).
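To make that concrete, a minimal ML sketch (all names are mine, purely
illustrative): an object is migratable exactly when it can be saved to and
restored from a stream of bytes; the transport is passed in as a parameter
rather than specified.

  module type MIGRATABLE = sig
    type t
    val save    : t -> string        (* serialize to bytes *)
    val restore : string -> t        (* rebuild from bytes *)
  end

  (* Migration = save on the source + restore on the target.
     [send] and [receive] stand for whatever transport exists:
     a file being written, or bytes sent over the net. *)
  module Migrate (M : MIGRATABLE) = struct
    let push obj ~send   = send (M.save obj)
    let pull ~receive () = M.restore (receive ())
  end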


> Security
> ========
> All communications (agents) are vectored through a gate upon entry to a 
> tool box.  The entry code can choose to do minimal checking to optimize for 
> speed, or extensive checking to maximize security.  This checking is not 
> a function of the operating system, but instead of individual tool boxes.
Yes, let's have a compiler-based security system. All binary code *must*
be secure. When it comes to migrating it, let's have some optional PGP-style
signature system to ensure that code comes from a trusted compiler or user.
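For instance (a hedged sketch; [verify] stands for an assumed PGP-like
primitive, and all field names are illustrative), a host would only accept
foreign binary code when the signature matches a trusted key:

  type signed_code = {
    code      : string;              (* the binary itself *)
    author    : string;              (* key id of the signer *)
    signature : string;
  }

  (* Accept binary code only from a trusted compiler or user. *)
  let accept ~trusted_keys ~verify (sc : signed_code) =
    List.mem sc.author trusted_keys
    && verify ~key:sc.author ~data:sc.code ~signature:sc.signature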


> Intermediate Language
> =====================
> All code is distributed in an intermediate language, below human programming,
> but high enough to be applied to a wide range of microprocessors.  It will
> be simple to interpret, and quick to compile down to binary for speed 
> intensive applications.  It is expected that human-oriented programming  
> languages will be layed on top of this intermediate language.  I would like
> to do an implemention of c(++) for the intermediate language.  Though
> c would not effectively utilize the attributes of the OS, it would satisfy
> the short term complaints of those still bound to c.
Some kind of FORTH or byte-code is good. See the byte-code interpreter from
CAML-light. I've always wanted such a beast.
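To show how small such a beast can be, here is a toy stack byte-code
interpreter in ML, in the spirit of the CAML-light machine but purely
illustrative (this is *not* its actual instruction set):

  type instr = Push of int | Add | Mul | Dup

  let run (program : instr list) =
    let step stack = function
      | Push n -> n :: stack
      | Add -> (match stack with a :: b :: r -> a + b :: r
                               | _ -> failwith "stack underflow")
      | Mul -> (match stack with a :: b :: r -> a * b :: r
                               | _ -> failwith "stack underflow")
      | Dup -> (match stack with a :: r -> a :: a :: r
                               | _ -> failwith "stack underflow")
    in
    List.fold_left step [] program

  (* run [Push 6; Push 7; Mul] evaluates to [42] *)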


> Automatic Stack Contraction/Expansion
> =======================================
> I'm not sure about this one yet.  Each stack (tool box or agent)
> grows from the bottom, and is used like a stack.  It has a profile (
> max size, grow/shrink increment, and stack pointer).  Data can be pushed/
> popped from the stack, or accessed arbitrarily inside the range below the SP.
> When the stack overflows, the segment is automatically expanded.  When
> memory resources in the workspace run low, garbage collection can commence
> by contracting stacks that are under utilized (stack top - SP > shrink 
> increment).  I believe this might save space by applications using
> only what they need, and by bundling the memory allocation code in the 
> kernel which might otherwise have many instances in the application code. 
> What do you think?
I love stacks. But why have them as OS primitives? To me, the OS should
handle arbitrary objects, with stacks living in what resembles the standard
library. Let's have an ever-running garbage collecting system and use it
as a criterion for migration.
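As a tiny illustration of why stacks need no OS support (names are mine):
a stack is just an ordinary garbage-collected object, definable in a few
lines of library code.

  type 'a stack = { mutable items : 'a list }

  let create () = { items = [] }
  let push s x  = s.items <- x :: s.items
  let pop s =
    match s.items with
    | x :: rest -> s.items <- rest; Some x
    | []        -> None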


> Specialized Platforms
> =====================
> For truly speed intensive applications, the actual application code (tool
> box) would be bundled with the PIOS microkernel and coded in the native 
> machine language.  The tool box would be tied to the machine to prevent
> migration, or an additional copy of the tool box (the intermediate language 
> version) could be migrated.
Again, in a distributed system, some kind of signature must come with any
low-level code, which can be checked to verify that the binary code comes
from a trusted source. Objects could come with their "equivalents" under
such and such an assumption; then, when the migration cost is computed,
matching equivalents are taken into account.


> Optimization of Resources
> =========================
> Tool boxes should include a benchmark tool, which could be compiled
> on a number of different machines to determine which has the best fit
> of architecture to problem.  This benchmarking can take place just before 
> initial execution, or during a re-optimzation of resources during execution.
> Taking this measure, plus that of available network bandwidth, estimated 
> communications demands, etc, the tool box could be placed in the most 
> optimal workspace.  Notice that we are entering into the territory of a 
> priori knowledge of application demands.
My opinion is that compile-on-demand and optimize-on-use is the best policy,
one that adapts to the users' needs. See SELF about that.
I think we need some kind of persistent system with lazy evaluation, like
STAPLE.


> No File System?!
> ================
> I don't believe in file systems (maybe I'll change my mind).  In any case,
We still need FSes to communicate with other OSes and to import/export data,
though I agree they are not a system primitive.

> I'd like for tool boxes to behave like organic systems, going to sleep
> on persistent computers when not in use, being brought back to the fast
> non-persistent computers when being utilized.  What is a persistent computer?
> A hard drive with a very dinky CPU could be viewed as a slow slow computer
> that is persistent, with a very large memory.  Using the same algorithm for
> optimizing the distribution of tool boxes, the less used ones would
> naturally 
> migrate towards the hard drive based work spaces when not in use.  I look
> forward to the day when all computers have soft power switches; ask the 
> computer to turn off, it moves the tool boxes to persistent storage, and then
> turns the power supply off.
To allow some security, we must also provide a regular or permanent logging
process which will ensure that every system change is written to persistent
memory (memory that survives power failures). See Grasshopper for that.
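A minimal sketch of that logging discipline (assuming [append] and [replay]
are primitives of some persistent store; the names are illustrative): commit
a change to persistent memory before applying it, and replay the log after a
power failure.

  type 'state change = { apply : 'state -> 'state }

  let commit ~append c state =
    append c;             (* reach persistent memory first... *)
    c.apply state         (* ...then update the volatile state *)

  let recover ~replay initial =
    List.fold_left (fun s c -> c.apply s) initial (replay ())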



> Design Goals Overview
> =====================
> The migration of processes (tool boxes and agents) during run-time in a 
>         dynamic heterogenous environment
 1) The smaller the objects, the easier the migration.
 2) For read-only objects, it may be better not to *migrate* the object, but
to propagate copies.
 3) Now, what if there is a net split? Will modifying the object be made
impossible? Then you must copy the object and maintain your own copy. But
then, what happens when the net is whole again? How do we merge changes?
There can be several object-dependent (object-parametrizable?) policies, as
sketched below.
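A sketch of what such per-object policies could look like (the policy names
are mine, for illustration only):

  type 'a merge_policy =
    | Immutable                     (* copies never diverge *)
    | Last_writer_wins              (* newest version replaces the rest *)
    | Merge of ('a -> 'a -> 'a)     (* object-dependent merge function *)
    | Ask_owner                     (* defer the conflict to someone *)

  let reconcile policy mine theirs =
    match policy with
    | Immutable        -> mine      (* both copies are equal anyway *)
    | Last_writer_wins -> theirs    (* assuming theirs is the newer *)
    | Merge f          -> f mine theirs
    | Ask_owner        -> mine      (* placeholder: queue the conflict *)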

> Small minimalistic microkernel (10-20K)
Why do we need a micro-kernel at all?
We need objects, including memory managers and intermediate-code
interpreters/compilers, but no microkernel. Only conventions.

> Fast
This will come later. Concentrate on the power.

> Application centric
??? I'd say a high-level OS. Do not define bitwise behavior like under
Unix/C. Just define high-level protocols, and an abstract intermediate-level
language. Moreover, we'll still need machine-centric layers; but you'll
address them only if you really need to (i.e. to play music on one host, and
not on another one ten miles away :)


> Allow for any level of security, based on applications need
 1) Require super-user rights to validate any low-level code before
execution;
 2) use the policy: "if the object is addressable, it's usable";
 3) use run-time optimization (i.e. partial evaluation) a la SELF to achieve
good performance even with "object filters" that allow only partial access
to an object;
 4) now, as for security, what to do when hosts do *NOT* completely trust
each other? In a WAN, that's especially critical. The answer is: in a WAN,
all machines are not equal; each machine has levels of trust for other hosts
(including distrust due to poor net link quality) which may lead it *NOT* to
migrate an object, as in the sketch below.
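The decision could be as simple as this sketch (the threshold model is my
own illustration): each host keeps a trust level per peer, and refuses to
export an object whose sensitivity exceeds it.

  type host = {
    name  : string;
    trust : float;   (* 0.0 = full distrust .. 1.0 = full trust,
                        lowered further by poor net link quality *)
  }

  let may_migrate ~sensitivity (dest : host) =
    dest.trust >= sensitivity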

> Parallel processing intermediate language
>         not for human consumption
Well, not for non-hacker humans. But we will still have to manipulate it.
And if we do, other hackers may want to use it too. Only they'll have to be
superusers on the system.

> Organic System
>         "HDD as slow persistant computer" storage (instead of file based)
>         No file system?!
>         development as an interactive process (FORTH like)
> Implement initial design on top of existing OS's.
>         (distributed file system as improvised network?  Or jump straight
>         in and do a TCP/IP implementation?)

> All original coding!!  No copies of others work (for legal reasons)
  I don't completely agree.
  We can copy those works, as long as we respect their copyrights.
Thus, if we take code from, say, Linux, that code will stay under the GPL.
But if that code is isolated in a module, it can be distributed with a
separate license from the whole project. That's why I recommend again having
the nano-kernel GPL'ed, so we have no more problems.
  Of course, we shouldn't include other *commercial* work if we want to
avoid paying royalties ourselves...


> 5.  Preliminary Architecture
> ============================

> Microkernel (10-20K)
  To me the kernel is 0K.
  At boot time, we have a boot loader, which loads modules using some
dumb convention. Second-level or nth-level loaders can change the
convention to anything. But basically, we must think in terms of objects
that depend on one another, each using its own convention, and calling other
objects through a proper filter (itself a needed object); see the sketch
below.
  There's no need for a centralized kernel. What we need is good
conventions. Each convention/protocol will have its own "kernel"/object.
The only requirement is that the system be well founded upon the hardware or
the underlying OS.
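The sketch (all names illustrative, mine only): a module is an object that
declares what it depends on; "booting" is just loading modules in dependency
order, and nothing in it deserves to be called a kernel.

  type module_ = {
    name     : string;
    requires : string list;       (* objects it calls through filters *)
    init     : unit -> unit;
  }

  (* Load [name] and, first, everything it depends on. *)
  let rec boot loaded table name =
    if List.mem name loaded then loaded
    else begin
      let m = List.assoc name table in
      let loaded =
        List.fold_left (fun l d -> boot l table d) loaded m.requires in
      m.init ();
      name :: loaded
    end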

>         Processor (application, when run on top of an OS) initialization
That's just one basic boot module.

>         MMU functions
Not strictly needed (though quite useful).

>         Creation, Bundling and Unbundling for transport, Destruction of
>                 Agents
That's an important set of modules. Perhaps it's what most resembles a
kernel, as it will be used all the time. But it's no more a kernel than the
MM module it interacts with, or the I/O drivers used to load them.

>                 Tool boxes
Why introduce arbitrary differences between objects? Let the system manage
a unified set of entities, called objects, agents, frames, patterns,
functions, tool boxes, shit, or whatever you like (though "object" seems the
most natural). If the user language introduces differentiation between
objects, let it do so. But this is no OS requirement. What we want the OS
to do is provide security protocols, including for type-checking within
and between various languages.

>         Interpreter (in kernel for speed)
Let's not specify things in terms of a kernel, but in terms of modules.

>         Compiler (in kernel for security?)
Anything that produces low-level code must be trusted (i.e. run in
supervisor mode), which does not mean it belongs to the "kernel", even if
almost all hosts will have one.

>         Automatic segment expansion/contraction?
What is that segment stuff?
Let's not specify implementation behavior, but the high-level response of
the system.

>         Workspace tool box mapping/redirection
Nothing belongs in a kernel proper. Everything is loadable/unloadable
modules. But there are various moving conventions.


> Tool Boxes
>         Inter-microkernel packet communications manager
>         Tool box re-allocation algorithms
>         Device drivers (HDD,TCP/IP)
>         General applications
>         High level to intermediate code compiler (gcc?)
>         Development tools
>         GUI
>         Global tool box mapping/redirection
>         Nearby workspace utilization, access speed, etc statistics
>         Intermediate code to binary compiler (as resource for versatility?)
No, we don't need any *kernel. But yes, we do need such things as modules.



> 6.  Agenda
> ==========
It seems ok.
But as for the PIOS specifications:
WHAT HIGH-LEVEL LANGUAGE SHOULD WE USE TO MODEL THE OS?
Clearly C is not a HLL.
We need a language that copes well with persistence, concurrency, etc.
BETA, SELF or STAPLE may be such languages/systems.



> 7.  FAQ's
> =========
> Q: What will be the functionality of the microkernel?
> A: The microkernel will be a minimalist one, see part 5.
No kernel at all! Only moving conventions.


> Q: How do you migrate processes in a heterogenous environment?
> A: Thats a little secret for now.
  The user won't see it.
  Every address external to the moved module will be converted to an absolute
system-wide address before being sent, as in the sketch below.
  The migration cost and decision are computed as part of a local address
space's garbage collection, which in turn is called when scheduling detects
suboptimal system performance.
  Migration itself is a particular case of saving and restoring objects.
It's just restoring the object as soon as possible on another machine!
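That address-rewriting step could look like this (a hedged sketch; the
two-level address model is my own illustration): before sending, every
reference leaving the module is made absolute; the receiving host may later
re-localize the ones that point to its own objects.

  type address =
    | Local  of int                          (* this host only *)
    | Global of { host : string; id : int }  (* system-wide *)

  let globalize ~this_host = function
    | Local id      -> Global { host = this_host; id }
    | Global _ as g -> g                     (* already absolute *)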


> Q: What type of implementation are you looking at? You mention an 
> intermediate language but what? Have you considered using object 
> oriented technology?
> A: See part 4 for implementation.  The intermediate language would
> be a proprietary one.
> Object oriented technology in programming languages,
> for the most part, is for programmers, not for computers (I know
> I'll get hell for saying that).  This OS is for computers.  (More
> inflammatory remarks expected...).  Object oriented technology, as far as 
> viewing system components as objects which can talk to each other, is what
> we are trying to accomplish, without relying on OO programming principles.

SELF is an OO language that already allows much of what PIOS needs.
BETA also is an OO language, with persistence and distributed objects
over Unix (though the current implementation is quite trivial).
(BTW, C++ and Objective C are definitely *NOT* OO languages; they are just
bullshit to me.)
What I mean is that the OS should include type-checking protocols,
conventions or mechanisms, or else system security and consistency cannot
be ensured. Allowing the most generic type-checking (so that any language
can be implemented using such a protocol) leads us to something
"object-oriented".




> 8.  Points for Discussion
> =========================

> What shall be the organization of our group?
>         I opt for a for-profit business even though we do not have funding
>         as of yet.  If all our work just benefits the research community
>         then great.  But if there could be buck to be made for all of us in
>         the end, then even better.  I'll be pushing for equity distributions
>         during two phases of the project.  These distributions will be
>         based on the number of hours put into the project, capital invested,
>         or equipment donated.  A $/hour rate for programming (which 
>         likely will be inflated to compensate for the risk) will be arrived
>         at.  Equipment will be valued at the street price.  Everything else
>         we will vote for and agree upon before accepting.  The distributions
>         would be 30% first round distributions/30% second round
>	  distributions/40% myself.

  You're getting a huge share! Let's see what the others say. If you work
full-time, perhaps you deserve it. But let's have a per-work ratio (with a
multiplier for you if you like).

  My opinion is that whereas the killer apps, migration modules and such
should be commercial/shareware, the "micro-kernel" (that is, enough code to
run the OS on a single-CPU system) itself should be released under the GPL,
so that the system spreads across the net.
  Such official net support *is* important (see the Linux community). It
didn't prevent Linus from earning money, directly and indirectly, from his
work. But it helped the Linux community multiply across the world. And I
know of no other OS with such support.
  And if our killer app is as good as we say, it deserves commercial
distribution on top of the GPL'ed OS. We can also add a condition to the
license so that the GPL'ed version of the OS cannot be used in a commercial
environment, which will force companies to buy OS licenses, whereas
individuals can use it freely (but are welcome to pay).
  In any case, don't expect to earn much money from this project. If you do,
all the better. And if you don't, well, I'd rather go unpaid for a successful
thing than unpaid for a forgotten project.



> What will be the structure of our organization?  Benevolent dictatorship,
>         republic, or just where in between?  Who will provide direction ?
>         Who will represent us?  How should we resolve differences?  Etc.

Basically, you get the best results if you decentralize decisions, so that
each subject has its own maintainer. But in case of conflict, you need a
referee (say, the subject maintainer, or ultimately you). If we are to
discuss and vote, then it must be quick; that's why we need regular meetings
whenever common subjects are discussed. TALK/IRC sessions are welcome. Mail
is viable only if participants reply very quickly.
(On IRC, I'm Fare when connected. Please send me a message if you see me.)



> What kind of participants do we need?  Where do you fit in?  What would you
>         like to do?
We need people who participate *regularly* and reply quickly.
The one-week latency between message and answer killed MOOSE.
I am ready for all language-related topics, including implementing the
intermediate language, and compiling to and from it.

> What is a reasonable development schedule?  Do we even need one?
>         (See agenda section.)
Oh yes, we need one. MOOSE had no schedule, and it died because nothing got
done.
Typically, we should have regular meetings and schedule the next one at each;
if mail is quick enough, we can use it. But TALK/IRC sessions are better
if we are to vote in real-time after discussions. Also, having to write good
English in mails slows the process; let's use symbolic stuff where we can.
If you know IRC, we can have an IRC channel with an IRC bot to keep it open.
THIS TOPIC IS OF UTMOST IMPORTANCE.


> Should we widely disseminate our developing technologies (a la Linux), or
>         should we all sign non-disclosure agreements and keep this a
>         BIG SECRET?  Is there a middle ground?
  I think the OS should be open, so that wide-spread internet support is
possible. Misfeatures and bugs become much easier to correct this way,
and the OS will be able to spread like Linux.
  Then, we can add a clause to the standard license, so that people who make
money out of the system must pay royalties to the authors. The killer apps
can be developed under non-disclosure agreements if you see the need.

> Start thinking about terminology.  We are going to be breaking some new
> 	ground here, and there is an opportunity to coin some new terms.
> 	Most importantly though, we need to agree on the usage of current
> 	and new terminology so we can communicate.
There are already lots of systems whose terminology we can reuse. But yes,
an official glossary is needed, and when we communicate, we must have a
one-to-one word -> meaning mapping. Thus we also need a referee for words.
Again, I suggest that the subject maintainer ultimately decides (after a
vote), while the global referee (you) ends discussions if any remain.
The glossary is of course modifiable, should *new* arguments come for a new
terminology.


References:
SELF:        ftp://self.stanford.edu/pub/papers
Grasshopper: http://www.gh.cs.su.oz.au/Grasshopper/
Xanadu:      http://www.aus.xanadu.com
STAPLE:      ftp://ftp.dcs.st-and.ac.uk/pub/staple
MOOSE:       ftp://frmap711.mathp7.jussieu.fr/pub/scratch/rideau/moose/

Also see the comp.os.research FAQ.

In the MOOSE directory, get papers.zip.