Programing theory, specifications, ...

Patrick Premont premont@cs.toronto.edu
Thu, 7 Dec 1995 07:48:34 -0500


Hello everyone,

A few weeks ago, I requested the list of recipients of the mailing list,
we were 42. So although it's been very quiet lately, it seems there is still
a fair amount of interest.

I've taken a very interesting course this semester and since it's relevent
to the tunes problem of doing proofs about programs, I thought I'd talk about
it. The course is called Programming Methodology and the professor is
Eric C. H. Hehner (he's doing research in the area so you might be interested
in looking up his papers or his book on the subject).

Although it's called programming methodology, it's not about the informal
stuff of software engineering. It's about giving a formal theory of
programs and specifications of programs to allow the programmer to
prove that his programs are correct.

In tunes, it seems to me we will have to have a formal specification
of the semantics of every programming construct that may appear in programs
so that we can prove the safety of programs, the validity of optimizations,
etc. So we will both have a formalization of the semantics of our language(s)
and a theorem prover to reason about programs. At that point it seems 
reasonable to make the best use of these features by allowing the
programmer/user to program using these formal methods : by giving the
specification of the desired program and by, if the theorem prover is
not smart enough to do it by itself, giving the refinements.
In case you don't know what a refinement is, I'll explain.
I'll then give a tiny example of programming by specification
and go on with why and how it could be used in tunes.

A specification S is refined by another specification P if all
programs the follow specification P also follow specification S. But
what is a specification ? In this theory, it is simply a predicate
that depends on the initial and final states of the desired
computation (let call these states s and s'). For a particular couple
s, s', the value P(s,s') of the predicate is true if ending in s' is
acceptable when we start at s, otherwise the predicate is false.
With this definition, the refinement of specification S by P is simply :
Forall s, Forall s' S(s,s') <= P(s,s').

The process of programming by specification is then to give an initial
formal specification S of the desired behavior and refine it by a
special kind of specification called a program. A program is a
specification which constains only implemented constructs. This is
rarely done in one step.  You first refine S by some S1 which is a bit
simpler to implement, then you refine S1 by S2, ... until you get to a
completely implemented specification P (a program). (Refinenement is
transitive since implication is). 

This process is also a good model for compilation. Where S is the
program in the source language and P is the program in the target
language. It is also a good model for hardware construction where S
is the machine language program and P is the description of the
logic gates that implement it.

To define the language in which the specifications will be writen, we
must give some implemented constructs (for programs) and some more
powerfull but unimplemented constructs (for high-level specification).
Here are a few implement constructs we could have :
x := e  (assignment, x is a varible and e is an implemented expression,
         it is defined formally by (x := e) = (x'=e and y'=y and z'=z) 
         assuming x,y and z are the only variables in the state s and
         x',y',z' are their equivalent in the state s')
        (we must define what an implemented expression is but let's
         say here that it's a expression about natural numbers with + and x)
P1.P2   (sequential composition, P1 and P2 are programs, this means do
         P1 then do P2 and is defined formally by :
         (P1.P2)(s,s') = Exists s'' P1(s,s'') and P2(s'',s'))
if a then P1 then P2
        (a is an implemented boolean expression and P1 and P2 are programs,
         the meaning is given by 
         (if a then P1 then P2) = (a and P1) or (not a and P2))

an exemple of an unimplemented construct could be
Forall v in V, S
but also all boolean expressions on the state vaiables :
x'=y and y'=x and z'=5

Let's say we want to implement the previous specification, a program
that refines it is :
(z := y) . (y := x) . (x := z) . (z := 5)

It swaps x and y and assigns z to 5 as specified but does it using only
implemented constructs so it is a program. We still have to prove that
is refines the initial spec. but it's easy here, just plug in the
definitions of := and . and simplify using the applicable axioms and theorems
(for exemple, x=y and y=z => x=z, etc.)

This was a very short introduction but I hope it help give some idea
of how this could be used in tunes. I think it would be great if we
could manage to use this all the way from high-level specification to
the HLL then to the LLL then to machine language. (I don't think going
to hardware is very usefull because I don't think we'll have a
hardware manufacturing plant available to implement the programs in
logic gates :)

But the theory of programming seen in the course has a well defined
target language (set of implemented constructs) which are at a higher
level than machine language. In fact the target language has
arbitrarily big integers and other stuff which make it unimplementable
in a machine language. So we would have to formalize the CPU
intructions and use that as our set of implemented constructs. There
are many things in the theory that need to be adapted to our "real
world" application. I'm not saying it would be easy to do but I think
we'll have to do this formalization in a form or another if we want
proofs. I suggest this type of theory (which is the only one I
know). Not going down to machine language MAY simplify the task but it
will also allow (and therefore practically assure) the presence of
bugs in the machine language implementation of the LLL. And such bugs
will have the power to hang the whole system. It'd be like prooving
that your C source is correct and then hoping that the C compiler is
correct. So going all the way down has great advantages.

In that light, I recently tried to formalize the 486 machine language
(I've started with only one restricted instruction, MOV between registers).
I'll include below what I've done but I wrote this for myself so it may not
be easy to understand, especially since I'm using parts of the theory
of prgramming which I havn't told you about. I'll try to brief you
on the most important ones right here :

---------------------------------------------------------------------------

The comma operator "," creates a bunch. It's like a set without the braces
around the elements. A collection without an envelope. They are used to
represent types. The order and number of occurences is irrelevent.
5 is the numeral 5 but can also be seen as an elementary bunch.
5,6,7 is the bunch of 5, 6 and 7
Bunches distribute over most operations and fonctions so
(1,2) + (30,60) = (31,32,61,62)

x,..y means the bunch of integers going from x to y (excluding y but
      including x)

nat is the bunch of all natural numbers

There is the operator ":" which DOES NOT sitribute over bunches because
it says that the bunch on the left is included in the one on the right.
Equality doesn't distribute over bunches so you can say Bool=true,flase to say
that Bool is the bunch (type) of true and flase.

So I can say that x is of type y by saying x:y.

There is the obvious type string. Exemples: "hello", ""

The operator -> applies to to bunches. X -> Y mean the bunch of all functions
which apply at least to all elements of X and always map to elements in Y.

So I can say, to specify the type of increment,
increment : nat->nat
and the to define it
increment x = x+1
so you see we've got application of function A to B by writing A B.

The curryfication convention is used so all function take exactly one
arguement. Skip the paragraph if you know what it is. If you want more
you return a function that accepts another single arguement which will
return a function which accepts another ...  until you have all your
arguement and the last function returns the value.  So nat->nat->nat
can be seen as a function of 2 nats into 1 or as a function of 1 nat
into a function of 1 nat into 1 nat.

And finally the opeartor "|" which takes two functions and return another.
(f|g) applied to x gives (f x) if x is in the domain of f and (g x)
otherwise. Since the state of the machine is represented using
a function instead of variables, | is used to build state that differ slightly.
("EAX"->4) | (45->6) | S where S is a state (see the definition below)
will mean the state (function) just like S except register EAX contains 4
and address 45 contains 6.

Oh, yes, the state applied to "t" gives the time in clock ticks. 
So time is seen as a part of the state and is affected by instructions.

And relation or predicate is just a currified function of many
variables returning bool. (Of type nat->nat->nat->Bool for exemple)

I've writen comments as %%

What I've done follows, I'm looking forward to renewed activity on the
mailing-list and comments on my suggestion. Sorry the length of my
message but that's as short as I could make it and still be undrestood
by the majority of the persons on the list (I hope).

Patrick


---------------------------------------------------------------------------

%% 80486 theory

%% 16 Megabytes
Addresses = 0,..2^24

%% Registers
General_registers = "EAX","EBX","ECX","EDX"
Index_registers = "ESI","EDI"
Pointer_registers = "EBP","ESP","EIP"
Segment_registers = "CS","DS","ES","SS","FS","GS"
Flag_register = "EFLAGS"
Control_registers = "CR0","CR1","CR2","CR3"
Debugging_registers = "DR0","DR1","DR2","DR3","DR4","DR5","DR6","DR7"

Registers = General_Registers, Index_Registers, Pointer_Registers, Segment_Registers, Flag_Register

%% Basic integer types
Word : nat->nat
Word n = 0,..2^n

%% The state variable is S.
S : state
state = (Addresses->(Word 8) | Registers->(Word 32) | "t"->xrat)

Eight_registers = General_Registers, Index_Registers, "ESP", "EBP"
Register_code : Eight_Registers->(Word 3)
Register_code = "EAX"->bx000 |
                "ECX"->bx001 |
                "EDX"->bx010 |
                "EBX"->bx011 |
                "ESP"->bx100 |
                "EBP"->bx101 |
                "ESI"->bx110 |
                "EBI"->bx111

%% MOV instruction on registers. The first two conjuncts say that
%% the code at the instruction pointer must be the one for the MOV
%% and the last one says how the state is affected by the instruction.

MOV : Eight_Registers->Eight_Registers->State->State->Bool
MOV r1 r2 S S' = ((S (S "EIP")) = bx10001001) and
                 ((S ((S "EIP") + 1)) = bx11*2^6 +
                                       (Register_Code r2)*2^3 +
                                       (Register_Code r1)) and
                 (S' = ("t"->((S "t") + 1)) | 
                       ("EIP"->((S "EIP") + 2)) |
                       (r1->(S r2)) | S)


%% ===========================================================================
%% Data types
%% ===========================================================================

%% Rep is a relation defining a representation, not a special construct.
%% Many representations are posibles so other similar relations may be 
%% defined and used. But they do not have to depend on a location l
%% and a state S as in this particular case.
%% For example Half x y = (y = x/2) can be seen as a representation
%% for nat using 2*nat or for 2*nat using nat or for rat using rat...

Rep x (Word 8) l S = x:(Word 8) and ((S l) = x)

%% The first 2^8 even numbers :
Rep x 2*(Word 8) l S = x:2*(Word 8) and ((S l) = div x 2)

%% The first 2^8 prime numbers : (assuming we have a function prime(x)
%% returning the xth prime prime(0)=2, prime(1)=3, ...)
Rep x prime(Word 8) l S = Exists y:(Word 8) ((S l) = y and prime(y)=x)

%% Now, some multi-byte data types :
%% big-endian
Rep x (Word n*8) l S =      x:(Word n*8)
                        and b = (n-1)*8
                        and Rep (div x 2^b) (Word 8) l S
                        and Rep (mod x 2^b) (Word b) (l+1) S
%% little-endian
Rep x (Word n*8) l S =      x:(Word n*8)
                        and b = (n-1)*8
                        and Rep (mod x 8) (Word 8) l S
                        and Rep (div x 8) (Word b) (l+1) S
%% For either of the preceeding theorems, the following property can be proved.
%% It should be proved because it says that Rep x (Word n*8) l S places
%% absolutely no restriction on the value of S at "t", at all registers
%% and at all addresses not in (l,..l+n). It will be need to show for
%% exemple : Exists S (Rep 4 (Word 8) 3 S) and (Rep 5 (word 8) 4 S)
Forall n:nat Forall S:state Forall x:(Word n*8) Forall l:(addresses-n+1)
  Exists S'      (Rep x (Word n*8) l S') 
             and Forall l2:(addresses,"t",Registers) (not l2:(l,..l+n)) => (S' l2 = S l2)

%% Array bignums (big-endian style)
Rep x nat l S =     x:nat
                and if x<=2^31 then Rep (mod x 2^31) (Word 32) l S
                               else (    Rep (2^31+(mod x 2^31)) (Word 32) l S
                                     and Rep (div x 2^31) nat (l+1) S)

%% ===========================================================================
%% Operations on data types
%% ===========================================================================

(Add32 lx ly lz S S') <=>
  Exists x,y     (Rep x (Word 32) lx S) 
             and (Rep y (Word 32) ly S)
	     and (Rep (mod x+y 2^32) (Word 32) lz S'))

%% CAREFULL !!! THIS DOES NOT SOLVE THE FRAME PROBLEM
%% It doesn't say that S' is just like S except at lz,..lz+4 !
%% It's a good thing too because who knows what intermediate storage
%% has been used or affected (registers,flags,some memory?).
%% The Add32 expression should be conjoined with further restrictions
%% to says that S' is further related to S.
%% (Add32 and reasonable frame problem solution) can be refined by
%% a load, followed by the ADD machine instruction and a store


%% ===========================================================================
%% Examples
%% ===========================================================================

MOV "EBX" "ECX" S S' . MOV "EAX" "EBX" S S'
= Exists S'' MOV "EBX" "ECX" S S'' and MOV "EAX" "EBX" S'' S'
= Exists S''
                 ((S (S "EIP")) = bx10001001) and
                 ((S ((S "EIP") + 1)) = bx11*2^6 +
                                       (Register_Code "ECX")*2^3 +
                                       (Register_Code "EBX")) and
                 (S'' = ("t"->((S "t") + 1)) | 
                        ("EIP"->((S "EIP") + 2)) |
                        ("EBX"->(S "ECX")) | S)
             and
                 ((S'' (S'' "EIP")) = bx10001001) and
                 ((S'' ((S'' "EIP") + 1)) = bx11*2^6 +
                                           (Register_Code "EBX")*2^3 +
                                           (Register_Code "EAX")) and
                 (S' = ("t"->((S'' "t") + 1)) | 
                       ("EIP"->((S'' "EIP") + 2)) |
                       ("EAX"->(S'' "EBX")) | S'')
= Exists S''
                 ((S (S "EIP")) = bx10001001) and
                 ((S ((S "EIP") + 1)) = bx11*2^6 +
                                        (Register_Code "ECX")*2^3 +
                                        (Register_Code "EBX")) and
                 (S'' = ("t"->((S "t") + 1)) | 
                        ("EIP"->((S "EIP") + 2)) |
                        ("EBX"->(S "ECX")) | S)
             and
                 ((S'' (S'' "EIP")) = bx10001001) and
                 ((S'' ((S'' "EIP") + 1)) = bx11*2^6 +
                                            (Register_Code "EBX")*2^3 +
                                            (Register_Code "EAX")) and
                 (S' = ("t"->((S'' "t") + 1)) | 
                       ("EIP"->((S'' "EIP") + 2)) |
                       ("EAX"->(S'' "EBX")) | S'')
%% (one point law)
=                ((S (S "EIP")) = bx10001001) and
                 ((S ((S "EIP") + 1)) = bx11*2^6 +
                                        (Register_Code "ECX")*2^3 +
                                        (Register_Code "EBX"))
             and
                 (((("t"->((S "t") + 1)) |
                    ("EIP"->((S "EIP") + 2)) |
                    ("EBX"->(S "ECX")) | S)
                    ((("t"->((S "t") + 1)) |
                      ("EIP"->((S "EIP") + 2)) |
                      ("EBX"->(S "ECX")) | S) "EIP")) = bx10001001) and
                 (((("t"->((S "t") + 1)) |
                    ("EIP"->((S "EIP") + 2)) |
                    ("EBX"->(S "ECX")) | S)
                    (((("t"->((S "t") + 1)) |
                       ("EIP"->((S "EIP") + 2)) | 
                       ("EBX"->(S "ECX")) | S) "EIP") + 1)) = bx11*2^6 + (Register_Code "EBX")*2^3 +
                                                                         (Register_Code "EAX")) and
                 (S' = ("t"->(((("t"->((S "t") + 1)) | 
                                ("EIP"->((S "EIP") + 2)) |
                                ("EBX"->(S "ECX")) | S) "t") + 1)) | 
                       ("EIP"->(((("t"->((S "t") + 1)) |
                                  ("EIP"->((S "EIP") + 2)) |
                                  ("EBX"->(S "ECX")) | S) "EIP") + 2)) |
                       ("EAX"->((("t"->((S "t") + 1)) | 
                                 ("EIP"->((S "EIP") + 2)) | 
                                 ("EBX"->(S "ECX")) | S) "EBX")) |
                       (("t"->((S "t") + 1)) |
                        ("EIP"->((S "EIP") + 2)) |
                        ("EBX"->(S "ECX")) | S))

=                ((S (S "EIP")) = bx10001001) and
                 ((S ((S "EIP") + 1)) = bx11*2^6 +
                                        (Register_Code "ECX")*2^3 +
                                        (Register_Code "EBX")) and
                 ((("t"->((S "t") + 1)) |
                   ("EIP"->((S "EIP") + 2)) |
                   ("EBX"->(S "ECX")) | S)
                   ((S "EIP") + 2) = bx10001001) and
                 ((("t"->((S "t") + 1)) |
                   ("EIP"->((S "EIP") + 2)) |
                   ("EBX"->(S "ECX")) | S)
                   ((S "EIP") + 3) = bx11*2^6 + (Register_Code "EBX")*2^3 +
                                                (Register_Code "EAX")) and
                 (S' = ("t"->(S "t") + 2) |
                       ("EIP"->(S "EIP") + 4) |
                       ("EAX"->(S "ECX")) |
                       (("t"->((S "t") + 1)) |
                        ("EIP"->((S "EIP") + 2)) |
                        ("EBX"->(S "ECX")) | S))
%% for the next step, it must be seen that (S "EIP") + 2 : Word 32
=                ((S (S "EIP")) = bx10001001) and
                 ((S ((S "EIP") + 1)) = bx11*2^6 +
                                        (Register_Code "ECX")*2^3 +
                                        (Register_Code "EBX")) and
                 ((S ((S "EIP") + 2) = bx10001001)) and
                 ((S ((S "EIP") + 3) = bx11*2^6 + (Register_Code "EBX")*2^3 +
                                                  (Register_Code "EAX"))) and
                 (S' = ("t"->(S "t") + 2) |
                       ("EIP"->(S "EIP") + 4) |
                       ("EAX"->(S "ECX")) |
                       (("t"->((S "t") + 1)) |
                        ("EIP"->((S "EIP") + 2)) |
                        ("EBX"->(S "ECX")) | S))
%% associativity and shadowing of functions in selective union ("|")
=                ((S ((S "EIP") + 0)) = bx10001001) and
                 ((S ((S "EIP") + 1)) = bx11*2^6 +
                                        (Register_Code "ECX")*2^3 +
                                        (Register_Code "EBX")) and
                 ((S ((S "EIP") + 2) = bx10001001)) and
                 ((S ((S "EIP") + 3) = bx11*2^6 + (Register_Code "EBX")*2^3 +
                                                  (Register_Code "EAX"))) and
                 (S' = ("t"->(S "t") + 2) |
                       ("EIP"->(S "EIP") + 4) |
                       ("EAX"->(S "ECX")) |
                       ("EBX"->(S "ECX")) | S)