This is a test of the Moose broadcasting system...


Tue, 23 Feb 93 5:58:09 MET


ASAP !
(and sorry for the delay)
 By the way, I think you didn't receive my last mail to the group, for I
had not added your name to my mailing alias ('twas before the mailing list;
the alias was taken from the header of one of your letters). If you did, never
mind this message. I am sending you the latest version here (not including my
next posting to the mailing list); use GNU diff to see what's new since the
last version you received.

 Read you soon,
   ,
Fare

--- here, included last version of my specs ---
Hello, Dennis & the Moosers !
 Here are my wishes for the new OS.
 You'll see that I think the OS and the language stuff are interlinked. IMHO,
you can't conceive one without conceiving the other.

 As I speak French and am not completely fluent in English, I will make many
mistakes. Do not hesitate to correct my English; it will help me improve it,
and you won't have to translate ill-written and difficult-to-understand
English as you read my letters.
 As I study math, I don't really care how you NAME things (though I have
some opinions about what would fit best and not evoke unwanted things).
Just tell me if you name something differently from me, or let's have some
discussion.

 I do not ask you to forget C/C++, but to use them in another spirit (more
generic and more multithreaded/parallel), in expectation of a quick
replacement.

 A wish is preceded by ***, subwishes by *; arguments for them follow,
introduced by --> (tell me if you use another format, so we can harmonize
exchanges)

 I think we'd better divide our specs according to the subject and/or logical
layer of the objects discussed.

 This file isn't finished yet, but I am sending it anyway.

/************************* General structure  *******************************/

*** ANYTHING is LOGICAL, virtual.
*** there is a UNIFYING notion for ANYTHING, even EXECUTABLE CODE, even
ITSELF. call it CLASS, OBJECT, FUNCTION, SCHEME, TYPE, FUCK or any name you
like, call it GOD if you want, but there it is: it exists, it just EXISTS !

*** a STANDARD NEW STACK LANGUAGE is given, serving as a handy do-anything
language to exchange and manipulate data between windows, as a scripting
language for common utilities, and as the standard intermediate language for
compilers.
--> Why a STANDARD language ?
 A standard intermediate code for compilers allows a common front end for
a compiler of any specific language, independently of hardware; it allows
the same back end to be shared by compilers for different languages. This
was a dream in the time of early compilers; it is a nightmare with C.
--> Why a NEW language ?
 At this time we have C/C++, but it isn't a good language, as it doesn't give
the compiler enough info to do code optimization together with protection
checking or program proof.
 All this will VERY much help future compiler designers (if the language is
well designed).
--> Why a STACK language ?
 A stack language is easy/immediate to interpret, and/or to transform back
into a tree for further optimization. It is easy to produce as compiler
output, and easy to use as compiler input (if you add powerful enough
declaration stuff to plain FORTH).
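
 To make the "easy to interpret, easy to turn back into a tree" argument
concrete, here is a minimal sketch in Python. The word set and the names
interpret, to_tree and BINARY_WORDS are assumptions made up for illustration;
they are not part of the proposed language. Both functions walk the same
postfix program with the same stack discipline: one computes values, the
other rebuilds the expression tree.

```python
import operator

# Hypothetical word set for illustration; not part of the proposed language.
BINARY_WORDS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

def interpret(program):
    """Run a whitespace-separated postfix program and return the final stack."""
    stack = []
    for word in program.split():
        if word in BINARY_WORDS:
            right = stack.pop()
            left = stack.pop()
            stack.append(BINARY_WORDS[word](left, right))
        else:
            stack.append(float(word))   # anything else is a number literal
    return stack

def to_tree(program):
    """Rebuild expression trees (nested tuples) from the same program."""
    stack = []
    for word in program.split():
        if word in BINARY_WORDS:
            right = stack.pop()
            left = stack.pop()
            stack.append((word, left, right))
        else:
            stack.append(float(word))
    return stack

source = "2 3 + 4 *"
result = interpret(source)
tree = to_tree(source)
```

 The two loops differ only in what they push, which is the point of the
argument: the tree needed by an optimizer is recovered in one pass.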

* If you have a good language interpreter, there is no more need for a stupid
shell which does nothing itself and systematically calls external files ! No
more need to know the different syntaxes of all the different utilities.


/**************************** Logical stuff *********************************/

****** ANYTHING is primarily LOGICAL, then PHYSICAL
* Actually, there are many layers of logical links between objects,
but contrary to today's OSes and insufficiently OO languages, the primary
layer is the LOGICAL one, NOT the PHYSICAL one. When you conceive something,
you begin by thinking about how it will link to others; it is up to the
language and the system below to

*** I PROPOSE WE ACTUALLY BEGIN PROGRAMMING FROM THE UPPER, LOGICAL BOUND:
Let's have our system work within another old one (DOS/Unix).
 We can still begin programming from the other, hardware bound, and the
two efforts will meet in the middle, but LET'S NOT BEGIN HARDWARE
PROGRAMMING BEFORE LOGICAL CONCEPTION.

*** <std_example.h> you can treat a file as a text string and vice versa, so
that anything you programmed for a file can be used with a string.
 To begin with, let's just translate DOS/Unix files to logical low-level
files.
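
 A minimal Python sketch of this file/string equivalence, assuming the
standard io.StringIO class as the "string seen as a file"; the helper names
count_words, file_from_string and string_from_file are hypothetical. The
routine count_words is written against a file-like stream and is reused on a
string without change.

```python
import io

def count_words(stream):
    """Count whitespace-separated words in any readable text stream."""
    return sum(len(line.split()) for line in stream)

def file_from_string(text):
    """View a string as a readable text file (hypothetical helper name)."""
    return io.StringIO(text)

def string_from_file(stream):
    """View the rest of a readable text file as one string (hypothetical helper name)."""
    return stream.read()

text = "files are strings\nand strings are files\n"
word_total = count_words(file_from_string(text))
round_trip = string_from_file(file_from_string(text))
```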
*** <std_example.h> once a physical support has been defined,
files on it are automatically and IMPLICITLY defined when needed, for any
file storage standard (with defaults depending on the physical support).
Then you can read an Apple ][ DOS 3.3 file inside an MS-DOS .DSK file if you
have defined it just once, and use it as a standard file in any common
application program.

*** More generally, every logical structure or algorithm in Knuth's
"The Art of Computer Programming" will be included in standard files.

*** some structures can be cast to others:
sequence to array, array to sequence, list to record, record to list,
list to set, set to list, etc.
 The system will know all these standard logical casts, and will execute them
itself if/when needed. It will choose by itself (considering the info given by
the user) which format to use for the data: whether to translate once and work
with the translated data, or to translate calls to the object's methods as they
come (this depends on which takes more memory or time, and how much time or
memory you have to do the job,...).
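
 One way to picture "the system knows the standard casts and applies them
when needed" is a registry of conversion functions, sketched below in Python.
Everything here (CASTS, register_cast, cast, unique_count) is a hypothetical
name chosen for the example, and the sketch only covers the "translate once"
strategy, not the choice between strategies.

```python
# Hypothetical registry of standard logical casts, keyed by (source, target).
CASTS = {}

def register_cast(source, target):
    """Decorator that records a conversion function in the registry."""
    def decorator(function):
        CASTS[(source, target)] = function
        return function
    return decorator

@register_cast(list, tuple)
def list_to_array(values):
    # A tuple stands in for a fixed-size array here.
    return tuple(values)

@register_cast(list, set)
def list_to_set(values):
    return set(values)

@register_cast(set, list)
def set_to_list(values):
    # Sets are unordered, so sort to get a deterministic list.
    return sorted(values)

def cast(value, target):
    """Convert value to target, consulting the registry only when needed."""
    if isinstance(value, target):
        return value
    convert = CASTS.get((type(value), target))
    if convert is None:
        raise TypeError(
            f"no standard cast from {type(value).__name__} to {target.__name__}"
        )
    return convert(value)

def unique_count(items):
    """Written for sets, yet callable with anything castable to a set."""
    return len(cast(items, set))

duplicates = [3, 1, 3, 2, 1]
distinct = unique_count(duplicates)
ordered = cast(cast(duplicates, set), list)
```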
 Many physical formats match a given logical structure, but each operation
on it then takes a different time to execute; that's why
*** SCHEDULING IS INCLUDED IN THE SYSTEM, AND THUS IN THE LANGUAGE.
(or vice-versa)

*** Files are only a representation among others of logical objects. You can
reference an object in a file as any other object. The system will know if
it has to expand part of the file or the whole file to real memory.

*** The system contains a huge number of more or less interlinked objects.
Each has a separate life, but can live only as long as it is (more or less
directly) linked to an object known to live (the running system, mass memory).

/********************** Standard objects in the Kernel *********************/

TASKS,THREADS,PROCEDURES
*** A program does not have only one char input and one char output (and
possibly one error char output). It may have any number of implicit or
explicit inputs or outputs (with many defaults if not entirely specified by
the user) of any nature (not only raw char streams ...).


/*************************** Physical implementation ***********************/

*** We must include (before the end of the project, but obviously not at the
beginning) every possible physical format of each logical structure previously
defined.

MEMORY ALLOCATION
*** This includes in particular MEMORY ALLOCATION.
There is a logical class called OBJECT ZONE, which will have different
implementations. An OBJECT ZONE can contain others, and so on. Each program
will have its own, chosen to be the quickest and/or the lightest on memory.
At first, let's define only one OZ type, which won't be tuned
for any particular use, but which will do anything you can expect from an
OZ, from bare allocation to garbage collection with pointer updating when
full.
* Let's also define sub-OZs, which contain only part of the objects of the OZ.
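
 As an illustration only, here is a toy OZ in Python that goes "from bare
allocation to garbage collection with pointer updating when full". It is a
sketch under strong assumptions: the class and method names are invented,
objects hold no references to each other, and the set of live handles
(roots) is passed in explicitly instead of being traced.

```python
class ObjectZone:
    """Toy object zone: a fixed number of slots, compacted when full.

    Clients hold handles rather than raw slot numbers, so the zone is free
    to move objects during collection.
    """

    def __init__(self, capacity):
        self.slots = [None] * capacity   # simulated raw memory
        self.top = 0                     # next free slot
        self.location = {}               # handle -> current slot
        self.next_handle = 0

    def allocate(self, value, roots):
        """Store value and return a handle; collect first if the zone is full."""
        if self.top == len(self.slots):
            self.collect(roots)
        if self.top == len(self.slots):
            raise MemoryError("object zone is full of live objects")
        handle = self.next_handle
        self.next_handle += 1
        self.slots[self.top] = value
        self.location[handle] = self.top
        self.top += 1
        return handle

    def read(self, handle):
        return self.slots[self.location[handle]]

    def collect(self, roots):
        """Keep only handles listed in roots, sliding survivors downward."""
        survivors = sorted(
            (slot, handle)
            for handle, slot in self.location.items()
            if handle in roots
        )
        new_slots = [None] * len(self.slots)
        self.location = {}
        for new_slot, (old_slot, handle) in enumerate(survivors):
            new_slots[new_slot] = self.slots[old_slot]
            self.location[handle] = new_slot   # the "pointer update"
        self.slots = new_slots
        self.top = len(survivors)

zone = ObjectZone(capacity=3)
a = zone.allocate("alpha", roots=set())
b = zone.allocate("beta", roots=set())
c = zone.allocate("gamma", roots=set())
d = zone.allocate("delta", roots={a, c})   # zone is full: "beta" is reclaimed
```

 After the last call, "gamma" has moved from slot 2 to slot 1, but its
handle still reads back the same value.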

CLASS SYSTEM
*** Let's define the Class class, its axioms and its constructors.
*** Logical Basic Classes and Constructors
THEOREMS
- Arbitrary precision integer number (statically typed)
- n-modulus integer (statically typed)
- Arbitrary precision real number (statically typed)
- Enumerated types (statically typed)
CONSTRUCTORS
- Record
- List
- case-union
- Array
- Reference (no arithmetic)
- Explicit Virtual Function: (a function without side effects)
OTHER STANDARD LOGICAL CONSTRUCTORS
- Record,Lists,Arrays with carry report
- Set
- Arbitrary precision integer number (dynamically fixed)
- Arbitrary precision real number (dynamically fixed)
- Strings (=character lists=low-level text files)
- Queues, Deques, Stacks, etc.
SYSTEM CLASSES
- Object
- Class
- Container
 - Object Zone (logical container)
 - Memory System (physical container)

*** Physical Basic Classes and constructors (on the PC)
AXIOMS
- boolean
- local int types (depends on the implementation)
- character type
- local real number types
CONSTRUCTORS
- Array
- record
- union
- physical_pointer

/************************ SHELL / USER INTERFACE ****************************/

*** Let's have a STACK Language shell: it is then easy to combine results
from previous operations, then process them.
 I remember when I changed from Apple ][ to MS-DOS, I was at first VERY
depressed because I couldn't do small calculations at the DOS prompt, nor
could I use files and run programs in BASICA, whereas in Applesoft, I could
use the full file system, and even call Assembler.
 Now I have some tools on my PC for VERY simple calculations, or VERY
complicated ones, but I must still use my 10 TIPS (thousand instructions per
second) HP48 rather than my 3 MIPS computer for
quick immediate calculations involving some memory and/or handy computing.
What a shame for MS-LOS !
 With Unix, this problem is less acute, because with X-Window you can copy
calculations from one window to another, so that you can build programs which
each do one very small thing, pipe one into another, and cut/paste in
between. The problems are: it only understands ASCII, and moreover ASCII codes
from 32 to 126 only (others are non-standard) ! You swap the whole system and
launch a full task for just stupid calculations. THAT eats computer time.
* But you can still have Unix-like commands: when the interpreter sees an
instruction, it first asks it whether it wants to parse the rest of the line !
* The Macintosh's cut/paste is only a push/pop with a 1-level-deep stack !
*** In a standard program, you can DYNAMICALLY change any of the inputs or
outputs. For example, first put in a wait input/output which just waits when
called. Then, the user replaces it with whatever he wants. To verify how the
program behaves (to debug, to ensure that a long task you launched isn't
useless, etc), just put, between the input of a program and the output of
another that is linked to it, an intermediate edit/wait/view program that
allows you to see and, if you want, modify the data flow.
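
 A small Python sketch of both ideas, using generators as stand-ins for
programs connected by data flows. The names Port, tap, producer and doubler
are hypothetical. Port is an input that can be replaced while its consumer
is already running; tap is the intermediate view/edit stage spliced between
two programs.

```python
class Port:
    """Replaceable input: the consumer reads through it on every item."""

    def __init__(self, source):
        self.source = iter(source)

    def replace(self, source):
        """Swap the underlying input; the consumer does not notice."""
        self.source = iter(source)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self.source)

def producer():
    """First program: emits a stream of values."""
    yield from [1, 2, 3, 4]

def doubler(stream):
    """Second program: consumes one stream and emits another."""
    for item in stream:
        yield item * 2

def tap(stream, seen, edit=None):
    """Intermediate stage: record each item, optionally modify it."""
    for item in stream:
        seen.append(item)                    # view the data flowing past
        yield edit(item) if edit else item   # pass it on, edited or not

# Splicing a tap between two linked programs.
seen = []
tapped = list(doubler(tap(producer(), seen)))
edited = list(doubler(tap(producer(), [], edit=lambda item: item + 10)))

# Replacing an input while the consumer is already running.
port = Port([1, 2, 3])
running = doubler(port)
first = next(running)
port.replace([10, 20])
rest = list(running)
```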

*** of course, we will have a graphical interface in the end; but the
graphical interface is only one means among others of using the stack language
interface: instead of a mere cut/paste, you push/pop from one stack to another;
even more generally, you transfer data from one queue/file/stack/variable to
another, each having a read and a write procedure.

*** the keyboard can be shared between more than one user: this allows sharing
the terminal. Useful for games, useful when many people share the same
computer, etc. (the same goes for the screen, with multiple windows). More
generally, input and output devices can be shared. You can have someone using
the main keyboard except the function keys and the keypad; another one with
the keypad; a third one with the function keys (if they are on the side of the
keyboard); others with the mouse and the joysticks respectively. Of course,
you can also have more than one keyboard and/or screen.


/******************************** TOOLS *************************************/

SUBSYSTEMS:
*** (in a distant future) you will be able to just open any other logical
system in a window: PC-DOS, Unix, MacOS, Apple ][ ProDOS, CP/M 80, *OS* will
be supported.
 Programming the sub-OSes in the intermediate stack language will allow porting
them to any computer; we will (only) need a good microprocessor emulator
(which will be the execute method of the specific microprocessor class,
inherited from the generic executable code class).
*** But the MOST IMPORTANT OS to emulate is itself: the Kernel must be such
that an OS session can contain another one. That's fun for debugging as well
as for simulating itself in a game, or anything; moreover, it's ideal for
multiple sessions in one and for multitasking (just let the system contain
itself many times and you're done).

COMPRESSION:
*** The standard objects will include compression algorithms, and the
programmer will be able to transparently compress, put in overlay, etc, parts
of his programs/data, following the algorithm he thinks fits best (for example,
a big random access help file can be coded this way: first, group words and
assign a code to frequently used words. Huffmanize remaining characters.
Huffmanize words. Lempel-Ziv one-block texts and compare. Allow skipping an
inefficient previous step).
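
 The "compare, and allow skipping an inefficient step" part can be sketched
as follows in Python. This is not the word-coding scheme described above: it
swaps in the standard zlib and bz2 modules as the candidate algorithms, and
the one-byte tags and the names CODECS, pack and unpack are assumptions made
up for the example.

```python
import bz2
import zlib

# Hypothetical one-byte tags naming the step that was applied.
CODECS = {
    b"R": (lambda data: data, lambda data: data),   # raw: step skipped
    b"Z": (zlib.compress, zlib.decompress),
    b"B": (bz2.compress, bz2.decompress),
}

def pack(data):
    """Try every codec and keep the shortest tagged result."""
    candidates = [tag + encode(data) for tag, (encode, _) in CODECS.items()]
    return min(candidates, key=len)

def unpack(blob):
    """Read the tag, then undo the step it names."""
    _, decode = CODECS[blob[:1]]
    return decode(blob[1:])

help_text = b"press F1 for help; " * 200
packed = pack(help_text)
tiny = pack(b"x")   # too short to gain anything: stored raw
```

 For very short inputs every compressor adds overhead, so the raw candidate
wins and the compression step is effectively skipped.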

/**************************** Little tricks *****************************/

* Directories are a logical structure.
Then a compressed and/or tar'ed file can be a directory (better used
read-only, but writeable if you wish). This is VERY useful when a directory
must contain a lot of libraries, or a great number of small files (imagine a
program with 1000 modules, each a two-screen-page ASCII source file; if you
allocate one 2Kb disk block per file, it is 2Mb; if you put all the
stuff in a single file, it is but 1Mb; if you compress each file with an
algorithm appropriate for its type (language used) and/or LZ, it will be
300Kb, and quickly readable; moreover, disk access and/or swapping is
limited).
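
 A sketch of "an archive is a directory" in Python, using the standard
zipfile module on an in-memory buffer. The ArchiveDirectory class, the
build_archive helper and the module file names are hypothetical; the view is
read-only, as suggested above.

```python
import io
import zipfile

def build_archive(files):
    """Pack a {name: text} mapping into an in-memory compressed archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, text in files.items():
            archive.writestr(name, text)
    return buffer.getvalue()

class ArchiveDirectory:
    """Read-only directory view over archive bytes."""

    def __init__(self, data):
        self.archive = zipfile.ZipFile(io.BytesIO(data))

    def listdir(self):
        return sorted(self.archive.namelist())

    def read_text(self, name):
        return self.archive.read(name).decode("utf-8")

# Many small, similar source files, as in the 1000-module example.
modules = {
    f"module{index:03d}.src": f"procedure p{index}\n" + "  do_something\n" * 40
    for index in range(100)
}
data = build_archive(modules)
directory = ArchiveDirectory(data)
names = directory.listdir()
sample = directory.read_text("module007.src")
raw_size = sum(len(text) for text in modules.values())
```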


/*************************** Actual Programming *****************************/

* We don't need to do every physical representation of every logical structure
now, or to include scheduling either. Let's just keep a place for them for
when we have time, so that we won't have to redo everything then.
* for the moment, let's just do something that works, even VERY slowly, and
with very limited physical formats. Let's not even choose optimized formats
for our implementation; just have it work and be short to write and easy to
modify later.

** How to manage multithreading in an unsuited language such as C/C++ ?
let's regularly call the multi-threading manager in our C++ code (do not leave
a large loop without a call), through a #define'd word
(for example PAUSE).
** Later, we can have a translator from C++ to our better language.
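
 The cooperative scheme above can be sketched in a few lines. Note that this
sketch is in Python, not C++, and uses generators instead of a #define'd
macro: each yield plays the role of the PAUSE word, and run_all plays the
role of the multi-threading manager. The names worker and run_all are
hypothetical.

```python
from collections import deque

def worker(name, steps, log):
    """A thread body; each `yield` plays the role of the PAUSE word."""
    for step in range(steps):
        log.append((name, step))
        yield   # hand control back to the manager

def run_all(threads):
    """Round-robin manager: resume each thread until its next pause."""
    ready = deque(threads)
    while ready:
        thread = ready.popleft()
        try:
            next(thread)
        except StopIteration:
            continue           # thread finished; drop it
        ready.append(thread)   # otherwise put it back in the queue

log = []
run_all([worker("a", 2, log), worker("b", 3, log)])
```

 As in the C++ proposal, a thread that loops for a long time without
pausing starves all the others; nothing preempts it.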


/*************************** Hardware interface ****************************/

Here for the 386 version.
*** Will we use flat model or multiple segment model ?
(we can mix models)

--> FOR the FLAT model
 Everything is easier with pointers: they are shorter, and pointer arithmetic
is easy and fast.
--> AGAINST the FLAT model
 No memory protection for exchanges between objects

* As I recommend heavy use of tiny objects, you can't use the FLAT model
everywhere, i.e. you can't mix objects which are not SURE of each other, not
having been compiled together. It's impossible to manage that with the FLAT
model, unless you require page alignment for objects which must interact
(that's worse than the previous paragraph alignment !!!). That's why I
recommend using the other model. (but simple programs can still work in the
FLAT model).

* We can also have the STACK LANGUAGE interpreter run in the FLAT model, as
we may ensure that compiled programs are correct with respect to
object protection; or have a version of the interpreter check by itself
whether the instructions are correct.

* Then what of the GDT and the LDT ?
I thought about it and came to the conclusion that each Task has its own
GDT, and each Thread its own LDT, or something like that. A common object may
involve one or two threads, and a huge object, but most tiny objects will only
use the current thread's environment.
 The problem is that when an object calls another one, BOTH want to be sure
the call is correct. If only one object wanted it, everything would be easier.
Most problems can be solved at compile time, but there will always be
execution-time links (that's what the user is for), so beware !
 As range checks are easy to handle, the question is mostly about pointer
modification: if someone messes up pointers inside a pointer-updating
garbage-collecting system, everything may behave unpredictably !
 Then what if multiple objects with multiple LDTs (or GDTs) must interact ?
That's why I think the GDT and LDT are deeply ill-suited to OO system
programming. Well, we'll have to call the system so that it changes the DT;
use some descriptors as variables. Then we come to
* systematically calling the system for object interaction (the system must be
fast, then). If we do that, we do not need as much DT manipulation, and we can
keep a 32 bit key for each object, keeping a standard size for pointers.
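
 Leaving the 386 descriptor tables aside, the last idea (every object
interaction goes through the system, and objects are named by 32 bit keys)
can be modelled as below in Python. This is only a logical sketch: the
System class, its register/call methods and the notion of an "exported"
method set are assumptions for illustration, not a proposed kernel
interface.

```python
import itertools

class System:
    """Hypothetical mediator: objects name each other only by 32-bit keys."""

    KEY_LIMIT = 2 ** 32

    def __init__(self):
        self.objects = {}
        self.keys = itertools.count(1)

    def register(self, obj, exported):
        """Give obj a key and record which methods others may call."""
        key = next(self.keys)
        if key >= self.KEY_LIMIT:
            raise OverflowError("out of 32-bit keys")
        self.objects[key] = (obj, frozenset(exported))
        return key

    def call(self, key, method, *args):
        """Check the key and the method before letting the call through."""
        if key not in self.objects:
            raise LookupError(f"no object with key {key}")
        obj, exported = self.objects[key]
        if method not in exported:
            raise PermissionError(f"{method!r} is not exported")
        return getattr(obj, method)(*args)

class Counter:
    def __init__(self):
        self.value = 0

    def increment(self, amount):
        self.value += amount
        return self.value

    def reset(self):
        self.value = 0

system = System()
counter_key = system.register(Counter(), exported={"increment"})
total = system.call(counter_key, "increment", 5)
```

 The caller never holds a raw pointer to the Counter, so it cannot corrupt
it; the price is one system call per interaction, hence the need for the
system to be fast.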


/***************************** MY LANGUAGE *********************************/

Here are specifications for the future language I'd like to use.

*** My Language is Meta-itself: a program in the language, when compiled,
produces another, lower-level one. As the language includes low-level stuff,
compiling will be a rewriting of the program to use the low-level instructions.
* ADVANTAGES: The language is its own preprocessor. It is easy to CALCULATE
what will become constants; this is NOT an add-on feature.
* DISADVANTAGES: The compiler may not always stop if the Meta-itself stuff is
used; but you're not forced to use the Meta-itself feature randomly; tools are
included (nesting operations, etc) to build clean recursive references;
As the compiler is written in the language itself, you can use the same usual
and you can still debug the compiler at a high (or low) level
if you do and it doesn't stop. Moreover, the compile
debugger shows you which part of the source code is dangerously meta-ing itself.
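
 "Compiling as rewriting" and "calculating what will become constants" can
be illustrated by a tiny constant folder, sketched here in Python over
nested tuples. The expression format, the OPERATIONS table and the function
name fold are assumptions for the example; the real language would rewrite
its own programs rather than Python tuples.

```python
import operator

OPERATIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold(expression):
    """Rewrite a nested-tuple expression, evaluating constant subtrees."""
    if not isinstance(expression, tuple):
        return expression                      # a number or a variable name
    word, left, right = expression
    left, right = fold(left), fold(right)
    if isinstance(left, (int, float)) and isinstance(right, (int, float)):
        return OPERATIONS[word](left, right)   # computed at compile time
    return (word, left, right)                 # left for run time

program = ("+", ("*", 60, 60), ("*", "hours", ("+", 1, 2)))
folded = fold(program)
```

 The output is again a program in the same notation, only lower-level:
every subtree made of constants has been replaced by its value.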

*** Physical and logical adjectives have only a relative meaning: some objects
are more basic than others. Highly logical objects are standards.

*** It allows multiple possible logical structures for a physical object, and
multiple possible physical implementations for a logical object. The compiler
is meant to find by itself (with your help for better results) the best
implementation.
ADVANTAGES: you don't have to look at details anymore (but you still can).
Instead of building the parts of a program one by one (which you're still able
to do if you want), you begin by drawing its general scheme, then you refine it.
DISADVANTAGES: ? (note that you can still advise the compiler, or force it
into an implementation; use run-time feedback, etc; ask it to ask you for
useful info for optimization, etc)

*** You can program by constraint (in the future).
for example, it must understand things like (x,y)=z with x,y reals, z complex,
or x=y+2, x+y=1, xy=2, or x+y=2, x-y=-4, etc.
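
 For the linear cases, "understanding" a pair of constraints amounts to
solving a small system. The Python sketch below handles only two linear
constraints in x and y, each written as a hypothetical (a, b, c) triple
meaning a*x + b*y = c; the non-linear example (xy=2) is out of its reach.
The last lines show the (x,y)=z case with Python's built-in complex type.

```python
from fractions import Fraction

def solve_two(first, second):
    """Solve a*x + b*y = c for two constraints given as (a, b, c) triples."""
    (a1, b1, c1), (a2, b2, c2) = first, second
    determinant = a1 * b2 - a2 * b1
    if determinant == 0:
        raise ValueError("constraints do not determine a unique (x, y)")
    # Cramer's rule, kept exact with rational arithmetic.
    x = Fraction(c1 * b2 - c2 * b1, determinant)
    y = Fraction(a1 * c2 - a2 * c1, determinant)
    return x, y

# x+y=2, x-y=-4
x, y = solve_two((1, 1, 2), (1, -1, -4))

# (x,y)=z with x,y reals and z complex, read in both directions.
z = complex(3.0, 4.0)
real_part, imaginary_part = z.real, z.imag
```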

------ bip bip other features
- Integrates code as a virtual object type (-> self-modifying programs in
particular)
- types may accept parameters.
- Any object may be created/evaluated at any time/level, from compile time to
local procedure execution.
- Ideally, every object lives in parallel with every other; the compiler
determines which can live sequentially or not. (THIS is incompatible with any
existing language and OS, which accept parallel tasks only by swapping
megabytes of memory, and leave local mutual protection of objects up to you,
the programmer !)