Thoughts on provably secure systems

Francois-Rene Rideau fare@tunes.org
Fri, 14 Jan 2000 03:12:28 +0100


On Tue, Jan 11, 2000 at 01:51:45PM -0500, Paul_Campbell@ENGELHARD.COM
inquires about computing systems designed to provide proofs of security.

Maybe our friends from EROS will comment and provide good pointers.
(I hope Prof. Shapiro doesn't mind being Cc:ed on this mail).

There has recently been renewed interest among CS researchers in
proofs of security. Look around the usual suspects for publications
on the subject (Abadi, Cardelli, Leroy, among many others).


Here are the common methods I know to express security in programs:

To prevent illegal memory access, structural typing (static à la ML,
or dynamic à la LISP) has proven to be the Right Thing.
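
As a rough sketch of the dynamic flavour (the example and names are mine,
for illustration only), here is how a tagged value behaves in Haskell:
a wrong use fails cleanly instead of reading arbitrary memory.

    import Data.Dynamic (Dynamic, fromDynamic, toDyn)

    -- A dynamically-typed box "à la LISP": the type tag travels with the
    -- value, so it can never be reinterpreted as raw memory or a forged
    -- pointer.
    asInt :: Dynamic -> Maybe Int
    asInt = fromDynamic

    example :: (Maybe Int, Maybe Int)
    example = (asInt (toDyn (42 :: Int)), asInt (toDyn "not an int"))
    -- => (Just 42, Nothing): the bad access is caught, not performed.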

To delimit per-resource access, a basic approach is to use lexical scoping
(see the W7 article on Review/VMs.html). In the same vein, although with
full (yuck) C virtual machines replacing single procedure scopes, read what
EROS-OS.org has to say about capabilities.
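
To make the lexical-scoping idea concrete, here is a minimal sketch in
Haskell (the names are mine, not W7's): the holder of a Counter can only
use it through the closures we hand out, and can never reach the
underlying cell, because lexical scoping keeps it hidden.

    import Data.IORef

    -- A capability to a counter is just a pair of closures; the IORef
    -- itself stays unreachable from outside this scope.
    data Counter = Counter { bump :: IO (), peek :: IO Int }

    newCounter :: IO Counter
    newCounter = do
      cell <- newIORef (0 :: Int)              -- the private resource
      return Counter { bump = modifyIORef' cell (+ 1)
                     , peek = readIORef cell }

    -- A restricted, read-only capability: hand out only 'peek'.
    readOnly :: Counter -> IO Int
    readOnly = peek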

Combined with the above, linear typing has also proved a nice way to
express the fact that although you grant an external agent temporary
access to some resource, that resource may not be copied (data reused
after the function returns), split (a continuation being duplicated
to steal CPU from the caller process), or otherwise duplicated.
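
As an illustration in a Haskell dialect with linear arrows (GHC's
LinearTypes extension; the names are mine), the linear arrow below
promises that the grant is consumed exactly once:

    {-# LANGUAGE LinearTypes #-}

    -- A one-shot permission to touch a resource.
    data Grant = Grant

    -- The linear arrow (%1 ->) guarantees the Grant is consumed exactly
    -- once: it cannot be duplicated, dropped, or stashed away.
    withResource :: Grant %1 -> IO ()
    withResource Grant = putStrLn "resource accessed once"

    -- Rejected by the type checker, since the grant would be used twice:
    -- duplicate :: Grant %1 -> IO ()
    -- duplicate g = withResource g >> withResource g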

Some strong type-systems can also guarantee eventual termination of
evaluation of whole ranges of typeable programs, sometimes with strict
limits on time and space resource usage.
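
The canonical shape such systems accept is structural recursion, where
every recursive call is on a strictly smaller argument. Here it is in
Haskell syntax, only as a sketch of the shape a total language (Cayenne-
or Coq-style) would actually check; plain Haskell does not enforce it:

    data Nat = Zero | Succ Nat

    -- Each recursive call peels one Succ off the first argument, so a
    -- termination checker can accept it, and the running time is bounded
    -- by the size of that argument.
    add :: Nat -> Nat -> Nat
    add Zero     m = m
    add (Succ n) m = Succ (add n m)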

You can invent lots of variants of the above, and maybe even other schemes
I've never heard of (but would be glad to be pointed to).


Do not forget that somehow, proofs=types (isomorphism dual to
the Curry-Howard one). The Right(tm) type-system will allow you
to express the proof you seek (and sometimes automatically infer it).
A language that exemplifies this is Lennart Augustsson's Cayenne,
an extension of Haskell where the type-system can be used for just that.
In the proof=type paradigm, you can structurally deduce a proof of
your desired properties from the type-checking process.
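
A small Haskell sketch of the idea (using GADTs as a stand-in for
Cayenne's dependent types; the names are mine): the type of safeHead is
a proof that it is never applied to an empty vector, and type-checking
is what establishes that proof.

    {-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

    data Nat = Z | S Nat

    -- A vector whose length is part of its type.
    data Vec (n :: Nat) a where
      VNil  :: Vec 'Z a
      VCons :: a -> Vec n a -> Vec ('S n) a

    -- No VNil case is needed: the type rules out the empty vector, so
    -- "head of an empty list" is a compile-time error, not a runtime one.
    safeHead :: Vec ('S n) a -> a
    safeHead (VCons x _) = x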

However, all these methods amount to high-level typing, and only prove
security against high-level attacks. To guarantee security against
lower-level attacks, you must also have a proof of "full-abstraction" of
your low-level system into the high-level one (i.e. the functor from
your low-level system to the high-level one is "full").


[Sandbox vs proofs]
If you consider the above, you see that this isn't an opposition;
rather, sandboxing is a very particular case of a very limited proof,
and one with very coarse-grained verification, since you basically
have to verify the whole sandboxed VM to establish your proof.
If you consider sandboxing as a generic dynamic technique rather than
as a one-time static security measure, you can establish many proofs
more easily; in this context, sandboxing can actually be considered
as layering your system into proper abstraction levels. In a reflective
system, you get to choose a layering that fits your problem. In other
systems, the system provider does it statically for you.


> The problem is this: how far can "provably correct code"
> be generalized to cover all aspects of resource utilization?
It can be arbitrarily generalized, yet it will never cover _all_ aspects.
You have to choose the aspects that matter to you,
and be sure that the code you use is correct wrt these aspects.

That's where a reflective system helps: instead of providing a single
one-size-fits-all accept-or-bust type system for everyone
(possibly a trivial one, as in C, or one limited to scoping, as in Scheme),
it allows programming partners to negotiate
the type-system they use to express their programming contracts.
Thus it allows virtually any contract to be entered into
_and enforced by the system_.
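
For a flavour of what a contract enforced by the system can look like,
here is a sketch in Haskell (module and names are mine): the only way to
obtain a Validated value is through validate, so the contract "store only
ever sees validated strings" is checked by the compiler.

    module Contract (Validated, validate, store) where
    -- The Validated constructor is NOT exported: that is the contract.

    newtype Validated = Validated String

    validate :: String -> Maybe Validated
    validate s
      | length s <= 80 = Just (Validated s)
      | otherwise      = Nothing

    -- Callers can only reach 'store' through 'validate'; the type system
    -- enforces the agreed-upon contract.
    store :: Validated -> IO ()
    store (Validated s) = putStrLn ("storing: " ++ s)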


> The operating system I am thinking of runs all programs
> in the same address space
If you have a good type system _by default_, then indeed
sharing the same address space _by default_ might be a good idea.
That's how we intend to do it in TUNES.

Now, there are still cases where performance, compatibility, security,
debugging, distrust, etc, will force you to run in separate or nested
address spaces (which may be done using hardware capabilities
or purely in software, depending on available hardware, etc).


> What I'm really interested in then is CPU time usage.
Then the tool you want is probably linear- and multilinear- resource-typing.

> Now, in this form, pre-emptive multitasking (yet another context-switch
> like operation) is not really necessary.
Yup, see also http://www.tunes.org/~fare/tmp/implement1/
and http://www.tunes.org/LLL/index.html. The conclusion is that:
1) at the low level, you cannot preempt just anywhere, anyhow;
2) at the high level, the "cooperation" can be handled automatically
   by the compiler (see the sketch below).
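
Here is a toy sketch of what the compiler could emit (all names are
mine): each task is compiled into small steps that voluntarily return to
the scheduler, so no preemption is ever needed.

    -- A cooperative task either finishes or yields, remembering how to go on.
    data Step a = Done a | Yield (Step a)

    -- What a compiler might produce from a loop: yield at every back edge.
    countdown :: Int -> Step ()
    countdown 0 = Done ()
    countdown n = Yield (countdown (n - 1))

    -- A trivial round-robin scheduler over such tasks.
    runAll :: [Step ()] -> ()
    runAll []               = ()
    runAll (Done _  : rest) = runAll rest
    runAll (Yield k : rest) = runAll (rest ++ [k])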

[Per-block scheduling]
Instead of doing it at runtime (very costly), do it at compile-time.
And if your parameters change, dynamically recompile.
Often-used code can afford recompilation once in a while;
seldom-used code is interpreted anyway.

> Under PCC in a generalized form (such as LXres)
> this consists of meta-information about the program's
> execution time.
Can you give a pointer to LXres?

> The problem is that interpreting the meta-information by the scheduler may
> itself violate the execution time limitations without even executing
> the actual code!
That's the usual bootstrapping problem: you must verify the meta-code,
and ultimately rely on axioms about it. Note that with staged evaluation,
verification can be done beforehand, or on another machine, etc.

> Rather than determining actual execution time
> (ala LXres or other PCC systems), it would
> be better to determine a MAXIMUM execution time.
Computing actual time (perhaps conservatively) is the way to prove
that you're within bounds.

[Splitting programs in blocks of max given time]
Yup. This can mostly be done automatically
when transforming abstract programs into programs
with latency requirements.

Of course, accurate timing can be difficult to achieve;
considering cache latency and the ever-growing variety
of memory architectures, even a provably conservative model
is difficult to achieve. Although I guess it could be possible
to automate the measurement of atomic timing constraints
(another difficult achievement, but a meta-achievement).
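
To make the "provably conservative model" idea concrete, here is a toy
cost model in Haskell (operations and cycle counts are invented for
illustration only): each operation is charged its worst case, so the sum
is a bound on the block's execution time.

    data Op = Load | Store | Alu | Branch

    -- Pessimistic per-operation cycle counts (made-up numbers).
    maxCycles :: Op -> Int
    maxCycles Load   = 20   -- assume a cache miss
    maxCycles Store  = 20
    maxCycles Alu    = 1
    maxCycles Branch = 3

    -- A conservative bound on the execution time of one basic block.
    blockBound :: [Op] -> Int
    blockBound = sum . map maxCycles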


> With the proof verification system in hand, the "kernel/user space"
> boundary disappears.
Or rather, the user is free to put boundaries where he desires,
and not where the system implementer decided without knowledge
of the problem. The boundaries can be pushed back at the meta-level,
which can be resolved long before runtime.

> 1. Some processes have a maximum execution time
>  which can be mechanically checked.
More generally, some processes only use resources from a predefined pool.
If this pool doesn't contain resource import features, then
you can guarantee that only the initial resources will be used.

> 2. Some processes do not have an easily measured execution time but will
> terminate.
More generally, you can express in the type system
that some computations do terminate.

> 3. A few processes will never actually terminate
> but they can't be distinguished from #2
> without a more fundamental understanding of program execution.
Then you can't run them and still pretend to be secure
(unless you're the administrator).


Cheers,

[ "Faré" | VN: Уng-Vû Bân | Join the TUNES project!   http://www.tunes.org/  ]
[ FR: François-René Rideau | TUNES is a Useful, Nevertheless Expedient System ]
[ Reflection&Cybernethics  | Project for  a Free Reflective  Computing System ]
No, I'm not interested in developing a powerful brain.  All I'm after is
just a mediocre brain, something like the president of American Telephone
and Telegraph Company.
		-- Alan Turing on the possibilities of a thinking
		   machine, 1943.