Don't hardwire OS features

Francois-Rene Rideau rideau@nef.ens.fr
Thu, 5 Dec 1996 09:10:21 +0100 (MET)


>>>: Alaric
>>: Fare
>: Alaric

>>>>    With Tunes, you could add a system-enforced prerequisite
>>>> that input to the routine should be of the right format,
>>>> and it would ensure that only objects of the right format
>>>> are given to the routine
>>> 
>>> That'd be a precondition in Eiffel or the proposed ARGOT language.
>>>
>> Only, Eiffel just CAN'T PROVE anything about the precondition.
>> It can only test it at runtime. And even then, this means that Eiffel
>> preconditions can only be computable statements.
>> Hence, not very expressive. All that is better than nothing,
>> but this doesn't remotely mean "good".
> 
> Eiffel was designed so it is possible to prove things; it's just that no 
> compilers do much proving beyond type inference. So far.
>
Surely I should re-read the Eiffel specs in more detail.
But from what I know of Eiffel practice,
only computable checks are used as pre/post-conditions,
which severely limits the expressiveness of the specification system.
Also, building a specification system on the belief
that proofs can be automated independently of the specifications
is illusory. We've known for a long time that you must
develop proofs together with programs and specs,
because there exists no program that can guess the proof for
you in the general case.
   The only way for programs to efficiently control proofs
in a dynamically adaptive way is for the proof system to be reflective!
Any non-reflective language is <CENSORED>.
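
To make the runtime-check point concrete, here is the C analogue
of such a precondition; a minimal sketch of mine, not actual Eiffel:
the requirement is merely tested when the call happens,
never proven beforehand.

    #include <assert.h>

    /* The "precondition" is just a computable check evaluated
       at run time; nothing is established statically. */
    double invert(double x)
    {
        assert(x != 0.0);   /* require x /= 0: tested, not proven */
        return 1.0 / x;
    }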

>> The problem is when you try to prove something about an arbitrary
>> program chosen at random. We're talking about requiring
>> that programs be specifically designed not to crash the system...
> 
> That can be done quite easily, IMHO. Enumerate the things that cause 
> crashes, and mask them off.
>
No, that can't be done: there are infinitely
many things that may cause a crash,
and masking them off is not computable.
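
This is just Rice's theorem; writing \varphi_e for the function
computed by program e, the usual statement reads:

    \[
      \{\, e \mid P(\varphi_e) \,\}\ \text{ is undecidable for every
      non-trivial semantic property } P.
    \]

A checker that enumerated and masked off every crash-causing
behaviour would decide exactly such a property.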

> It becomes hard when there's a "dirty" component, such as the Windows 
> kernel, which we cannot check. It becomes a "back door" our checker 
> cannot analyse, allowing us to sneak faults through!
>
Oh, that's also one of the reasons why a proof system
must sit on a proven OS before security can rely on it,
and why Tunes, in its final form,
just cannot be the OTOP prototype running over POSIX,
however useful this prototype might be:
the overall software can't be safer than the underlying OS.


>[cooperative switching vs preemptive switching]
> It's faster when there's a switch, but there needs to be a switch
> check in the core of even the tightest loop unless that loop will
> only run for a (provably) short time.
>
Not if we use the exception-recovery technique I've described
at length on the tunes-lll list before (see the LLL/ page).
Even Linux now uses a restricted version of this technique:
the exception handler uses code recovery
to replace the software tests for validity of user-memory accesses in the kernel.
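
The idea, in a simplified C sketch of mine (real kernels collect the
entries via the linker and binary-search a sorted table): each
potentially faulting access registers a fixup address, and the trap
handler consults the table, instead of the code testing pointers first.

    #include <stddef.h>

    struct exception_entry {
        const void *insn;    /* address of a potentially faulting access */
        const void *fixup;   /* where to resume execution if it faults */
    };

    /* Hypothetical table, filled in at build time. */
    extern const struct exception_entry ex_table[];
    extern const size_t ex_table_len;

    /* Called from the page-fault handler: a registered faulting PC
       means "recover here"; an unregistered one is a genuine fault. */
    const void *search_exception_table(const void *pc)
    {
        for (size_t i = 0; i < ex_table_len; i++)
            if (ex_table[i].insn == pc)
                return ex_table[i].fixup;
        return NULL;
    }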

>> What you are talking about is only trap redirectors.
>> If traps are not to be used, this is already too much;
>> if they are to be used, then you also need a real handler
>> outside of QNX, which will take place, etc.
>
> Minor grammar point - you mean take space, not place. A place is a 
> point, the Latin for place is locus; yet space is an amount, like a 
> volume :-)
>
Oops. This is a mistranslated French word: "place" in French means space.
A typical example of a syntax-preserving, semantics-modifying transformation,
whereas what we mean is to preserve semantics (== meaning), not syntax!
So you see that even for an experienced programmer,
a few careless seconds can introduce dreadful bugs
that can only be detected by semantic checking, not syntax checking.

> Anyway, back to the thread. The most minimal trap handling code would
> print an apologetic message to some kind of output system, then kill
> that task/process/address space/module/whatever...
>
That's already too little and too much at once.
In a development system, such traps may serve debugging;
in a compiled system, unless in debugging mode,
traps would not be needed as error behavior,
but only, perhaps, as a way to optimize
some tests that only seldom fail.
In any case, this "default" routine should never ever be used,
except while debugging the initial system implementation...

>[Specialized coprocessors: MISC or CISC]
> Depends on the device - more flexible things are more like CISC CPUs.
> By flexible things, I mean graphics coprocessors with their own
> thread of execution in a graphics-based instruction set ("line
> x1,y1,x2,y2,col") and stuff like that.
>
Though by definition coprocessors should have specialized circuits
for their special purpose, again, I'm confident that a MISC approach
for the non-critical parts of the design will greatly benefit
the overall performance/cost ratio of the machine.
Well, if this discussion is to continue,
let's do it on the MISC mailing-list.
Ask MISC-request@pisa.rockefeller.edu to join in...

>>> That's semantically identical to it being in an instruction set! Note 
>>> that qsort(), the quicksort, is in the C instruction set...
>>
>> Nope. Being in a library extension to C is not the same as being in C.
>> The former means that you can strip it off the C implementation,
>> that you don't need special compiler support,
>> and that other features can similarly be added.
>> Making it an instruction means that it has special compiler support
>> that other extensions can't have.
>
> The distinction is arbitrary. The compiler may just translate "a+b" into 
> "+(a,b)". In fact, consider a system where you can define your own 
> operators at will, like Ada or Prolog.
>
> You could strip floating point off a C implementation, just it'd be
> harder than removing stdio streams! The division between "in the
> compiler" and "in the standard libraries" is quite arbitary; some
> languages just make it more syntactically visible.
>
Well, let's say that the problem is not so much
"what is the status of standardized words?"
as "how can new words be defined?".
   What matters foremost is that we can dynamically define new objects,
a role that standard libraries, by definition, don't cover.
Now, granted such an extension mechanism,
anything hardwired in the language rather than built as such an extension
will be bloat that makes the original compiler comparably
bigger (more code included),
slower (more cases to test against),
and less reliable (more complex: more difficult to write and to prove correct).
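
In C, for instance, the extension mechanism is plain function
definition; a trivial sketch of a new "word" that needs no special
compiler support, can be stripped from any implementation,
and leaves every other extension equally possible:

    /* A pure library extension: no part of the compiler
       knows or cares that this "word" exists. */
    unsigned gcd(unsigned a, unsigned b)
    {
        while (b != 0) {
            unsigned t = a % b;
            a = b;
            b = t;
        }
        return a;
    }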


>>>> Object code should be irreversibly produced for a particular
>>>> target, while a *high-level* language of maximal possible expressivity
>>>> should be used to represent user-level and/or portable objects.
>>> Until we develop a truly fundamental theory of computation, 
>>> anyway...
>>> 
>> Not sure what you mean.
>
> I mean, a mathematical system for converting from syntax A to syntax 
> B, with the /maximum/ efficiency possible. As in, if syntax A is a 
> Turing machine, and syntax B contains a simple instruction for most 
> of what a neural net does, then a neural network implemented in A 
> will use this instruction in B. I.e., the ultimate compiler will be one 
> application of a fundamental theory of computation!
>
No need for a lot of fundamental theory to show that
such a thing is not computable: compiling with /maximum/ possible
efficiency would require deciding program equivalence,
which is undecidable.


>>> Yeah, but that's not part of the C standard; writing "portable" ANSI 
>>> C, we can't assume anything like that, and have to write rotation in 
>>> terms of << and >> and | and all that stuff.
>>>
>> Two things: 1) GCC can also compile <<|>> statements to ROT.
> 
> It requires unnecessary amounts of raw cunning (and CPU time). A 
> simple "<<<" operator would be much easier to detect!
>
That's always the problem of what is in the standard libraries,
what is in the language, etc.
No one forbids you to have a standard rot function,
and/or to preprocess C source to replace the "<<<" operator
with an appropriate function call.
Of course, that's what option 2) is about...
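
Such a standard rot function, for instance, is a one-liner in plain C;
a sketch (read uint32_t as any unsigned 32-bit type).
The shift-and-or idiom below is exactly the pattern a compiler
can recognize and turn into a single ROT instruction:

    #include <stdint.h>

    static inline uint32_t rotl32(uint32_t x, unsigned n)
    {
        n &= 31;   /* keep shifts in range: shifting by 32 is undefined */
        return (x << n) | (x >> ((32 - n) & 31));
    }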

>> 2) even if it couldn't without the help of an asm/*.h macro, so what?
>> People do not program C because ANSI C is a nice language
>> that should do everything;
>> C is chosen because it is actually available on actual platforms.
>> There's no problem with providing features
>> through architecture-dependent files.
>
> Yes, but if we write code for Linux and try to compile it on my DOS 
> machine with DJGPP, it won't find asm/*.h and I'll have to figure out 
> exactly what they did from gcc's error messages and write them 
> myself!
>
That's why you can copy the asm/*.h files from Linux,
assuming, of course, you're writing GPLed software,
or rewrite them if you're not.
[Followup to another place, too]

>> C is architecture-dependent anyway (pointer sizes, etc).
>
> Yes, but our languages won't be!
>
The standard portable languages shan't.
But it might be expedient to use a semi-portable arch-dependent layer
for bootstrap purposes only
(in no case intended as a long-term user-visible standard).

>>> My plan for ARGOT (the 
>>> language) bytecode contains some functional features. I won't be 
>>> studying much functional theory until I get into University, though. 
>>> Unsurprisingly, I want to take a degree in computer science :-)
>>> My admissions interview for Cambridge is on Thursday - wish me luck!
>>>
>> Good luck!
>> Do take courses in functional language theory and practice.
> 
> Yes, they're the only topic in computing [that interests me] that I 
> can't find much [comprehensible] information about on the 'net!
>
There are lots of books and articles on the net about functional
programming. I don't know if there are good tutorials, though.
Try the docs for the OCAML language, to begin with;
I believe there's a good tutorial, or at least a user guide.
If you have plenty of RAM to waste,
you might prefer to get the more standard SML/NJ.


>>> Yup; I don't see how this can be done without the programmer's 
>>> input, though - as in completely automatically!
>>
>> Complete automaticity means that *one* programmer did it *once*,
>> and the user doesn't have to worry about it thereafter.
>
> Well, I'd call that cunningness on the part of the OS when it comes
> to sharing information transparently across wide area networks! I'd
> call full automaticity the separation of the real [normalised]
> semantics of a class, and the derivation from that of a normalised 
> (and thus identical for all "classes" that do the same thing) 
> representation. Phew! Way beyond current technology!
>
Automaticity always originally comes from a human-written program.
Of course, the task of writing programs
(like deriving a representation from a specification)
could be partially automated,
which is exactly what reflection is all about!!!

>> Sure. But that's the same as above:
>> once it's done for the first time,
>> no one needs to care about it anymore,
>> unless specifically tweaking it for optimization.
> 
> Yes, I agree, now I see what you're aiming at!
>
I'm glad my English is not completely ununderstandable, after all...


>>> Things like "array" are 'atomic' for transmission, in that they 
>>> explain themselves, as do things like integers.
>> Yes. I wouldn't use the term "atomic", though.
>> Rather "terminal", "context-free", "self-sufficient",
>> or "easily reifiable", or perhaps only "pure".
> I think "atomic"'s a nice word; it implies we don't have to deal with 
> internal structure (meaning "indivisible")!
I could offer counter-arguments, but let's not argue about names.
Follow-up to private e-mail if you want.

>> I see no reason to define overkill image formats
>> just for the sake of uniformity. If you need the original image,
>> take it, instead of wasting bits in a 32:16:16 Yu'v' stuff.
>
> Yes, if a protocol can be defined to agree on this. If two objects 
> are of totally identical class (in terms of internal representation), 
> then just blit 'em across as they are!
>
Which means that the system should be able to consistently
manage multiple versions of the "same" object...


>>    The name epsilon comes from the Hilbert epsilon symbol,
>> as used in formal logic, where
>> epsilon x.P represents an object x fulfilling the property P,
>> if there exists any (if there doesn't, either the statement
>> is considered invalid, or an arbitrary object might be returned).
> 
> Oh, right. In set theory we define subsets like so:
> 
> {x epsilon super_set; x satisfies condition}
>
Nope. The "is element of" symbol is not an epsilon.
Try looking in a dictionary, or feeding TeX with \epsilon,
to see what it's like.

>> Basically, epsilons are to existential quantifiers
>> what lambdas are to universal quantifiers
>> (also see Curry-Howard isomorphism).
> 
> Oh boy!
>
This means that "epsilon x.P" is any object fulfilling property "P",
which, if well defined, constitutes a proof of the formula "exists x.P",
while "lambda x.E" is a function that associates the value of expression "E"
to an object "x", and, if well defined, constitutes a proof of the
formula "forall x.(formula proven by E)".

>>    Only epsilons are NOT computable in the general case,
>> so that no one before me seems to have had this ridiculous idea
>> of using some epsilon-calculus in computer programs!
>> But after having tried hard to formalize reflective programming,
>> I found that epsilon-calculus was exactly
>> what reflective programming was about:
>> externally and dynamically defining tactics
>> to fill those holes in computable foundation.
>> Much more general than late-binding
>> (one of the two principles of traditional "OO", besides encapsulation).

> Ahah. Have you heard of the Self system?
Yeah. Tried it on a Sparc20 with "only" 32 megs of RAM.
Just couldn't stand the slowness, so didn't explore it much.
I like many things about it,
but surely, a language for proofs is not what they tried to achieve...

>>>[Security concept for ARGON]
>> Ahem. And did you choose scoping as the basic protection paradigm,
>> as in Tunes?
> It's possibly semantically equivalent to scoping.
Why not simply make it scoping, then???

> Such that if you are "told about" an entity, you can access it. But you 
> can only tell something about an entity to a certain level.
Which is scoping with views on objects.
That is, one has the green view on the object in scope,
but not the indigo.
This might be a simple way to standardize access to objects
at some external level; it requires no special core system support;
all the core system needs to implement is scoping.
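
A minimal sketch in C of what "scoping with views" can mean;
every name in it is mine, for illustration, not ARGON's design.
Holding a key in scope is itself the access right,
and weakening a key needs no core-system support:

    typedef enum {
        NONE, RED, ORANGE, YELLOW, GREEN, BLUE, INDIGO, VIOLET
    } colour;

    struct entity;                 /* opaque: only usable through keys */

    struct key {
        struct entity *target;     /* the entity you were "told about" */
        colour         level;      /* which view this key opens */
    };

    /* Derive a lower-level key, e.g. to pass a friend green access. */
    struct key weaken(struct key k, colour level)
    {
        struct key w = k;
        if (level < w.level)
            w.level = level;       /* levels only go down, never up */
        return w;
    }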


> To present a unified interface to this, I'm naming the access levels 
> after colours.
> 
> none
> red
> orange
> yellow
> green
> blue
> indigo
> violet
>
Your biggest security problem is likely to be with traitor commie mutants...

> When you create an entity, you get a violet-level key returned. One 
> of the violet-level standard features of an entity is to provide 
> lower level keys, so you can pass your friend a green-level key.
> 
> The exact meaning of each colour depends, of course, on what you've 
> programmed the entity to do, but it's expected to stick to a standard scheme.
> 
> none = no access possible.
> red = opaque visibility; you cannot use the entity, but you can ask 
> it what it is, that kind of thing. Usually granted to container 
> entities.
> ...
> violet = administrative powers
Shouldn't it be ultra-violet?
Or perhaps that's reserved to single-user system debugging?

> It's simple and flexible...
>  ...perhaps I should try and define a fundamental calculus of access 
> control? There are so many schemes, but the field feels like there's 
> a mathematical basis to be found...
>
Let's keep it simple, and make it scoping plus standard naming.

== Fare' -- rideau@ens.fr -- Franc,ois-Rene' Rideau -- DDa(.ng-Vu~ Ba^n ==
Join the TUNES project for a computing system based on computing freedom !
                TUNES is a Useful, Not Expedient System
URL: "http://www.eleves.ens.fr:8080/home/rideau/Tunes/"