Write/read invariance for procedures

Francois-Rene Rideau rideau@nef.ens.fr
27 Apr 1997 21:09:06 +0200

The following message is a courtesy copy of an article
that has been posted as well.

Read/Write invariance for procedures across sessions
is exactly the same as providing a portable code transport format,
and all the problems raised by code migration in general appear.
This is exactly the goal of the Tunes Migration subproject,
and it's also related to the portable VM that the lispOS
project intends to develop for its persistent distributed OS.

lord@emf.emf.net (Tom Lord) writes:
>     It isn't very hard (roughly one page of Scheme code) to
>     maintain write/read invariance for procedures.
> You seem to be referring to EQ?-style invariance where the reading and
> writing are within a single Scheme session.  By "session" I roughly
> mean a single process, although I guess you could have a session that
> spans a single process using migration, unexec, or what have you.
> What if I want to read and write procedures across session boundaries?
> Then I really will need a notation that is more complete than S-expressions.
> I may not be able to preserve EQ?-style invariance, but I ought to be
> able to preserve some kind of invariance that is still useful.
Yeah. I've called that the "object limits" problem,
which is crucial in a persistent/distributed environment
where objects migrate. This is a very difficult problem,
for which I've never seen a complete solution.
I'm looking for one in the Tunes project.

> A notation for reading and writing structures with shared parts is easy.
> The messy hard part is deciding what parts of a structure to write, and
> which to inherit from the session of the reader.   For example, if I
> have the procedure:
> 	(define (plus a b) (+ a b))
> 	(define (meta-plus a b) (plus a b))

Beware the braindead semantics of define in R4RS !!!
In Scheme's global namespace (worse than C!), one could (define + *),
or equivalently (set! + *). Yucky at best.
Scheme clearly lacks any module-friendly semantics
(not that you are forced to follow R4RS' braindeadness in this respect).
More generally, the way Scheme does side-effects just plain sucks.

So I guess you mean something more like
	(define plus (let ((add +)) (lambda (a b) (add a b))))
Or rather even
	(let ((plus (lambda (a b) (+ a b)))) ...)
Lots of Insipid and Stubborn Parentheses?
Well, let's forget the syntax, and focus on semantics...
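
To make the difference between the two readings concrete, here is a
minimal sketch. The names base-add, plus-late and plus-early are
hypothetical; base-add stands in for the global + because some
implementations refuse to rebind + itself.

```scheme
;; base-add is a hypothetical stand-in for the global variable +.
(define base-add +)

;; Late binding: base-add is looked up again at every call.
(define (plus-late a b) (base-add a b))

;; Early binding: the value of base-add is captured once, at definition time.
(define plus-early
  (let ((add base-add))
    (lambda (a b) (add a b))))

;; Somebody rebinds the global afterwards, as in (set! + *).
(set! base-add *)

(plus-late 2 3)   ; => 6, follows the rebinding
(plus-early 2 3)  ; => 5, keeps the captured addition
```

A writer that dumps plus-late and one that dumps plus-early are dumping
different things, even though their source text looks nearly identical.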

> and write it to a file:
> 	(with-output-to-file ",example" (lambda () (magic-write plus)))
> should a definition for "+" be included in the file ",example"?
You could even simplify that to (magic-write +), couldn't you?

Depends what you mean by +.
It might have a very precise behavior on overflow that ought
to be preserved across dump/restore;
It might be a context-dependent addition, used
in such a way that the global program semantics does not depend
on the environmental dependencies that change across dump/restore;
It might be a test of the + instruction of the original processor/environment,
which we began to debug earlier and will keep debugging across dump/restores.
It might be (from declarative context) addition of 26.45 bit integers.
It might be the location to which the global variable + is bound
(Scheme is even worse than C here: it does semantically distinguish
values and locations, but in quite non-orthogonal ways, yuck; ML does better;
but, let's stop bashing Scheme).
It might be a bug. It might be forbidden.
It might really be lots of things.
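
One way to stop guessing is to have the written form say explicitly
which names are to be inherited from the reader's session. The sketch
below is only an illustration under assumptions: the (portable (inherit ...) ...)
layout, lookup and restore-procedure are all hypothetical, and it relies
on an R5RS-style eval with interaction-environment being available.

```scheme
;; Hypothetical written form: the source of the procedure plus the
;; names the reader must supply from its own session.
(define plus-image
  '(portable (inherit +) (lambda (a b) (+ a b))))

;; A session is modelled as an association list of (name . value).
(define (lookup name session)
  (let ((entry (assq name session)))
    (if entry
        (cdr entry)
        (error "name not available in this session:" name))))

;; Rebuild the procedure by wrapping its source in a lambda whose
;; parameters are the inherited names, then applying that lambda to
;; the values the restoring session provides.
(define (restore-procedure image session)
  (let ((names (cdr (cadr image)))
        (source (caddr image)))
    (apply (eval (list 'lambda names source) (interaction-environment))
           (map (lambda (name) (lookup name session)) names))))

(define plus-here  (restore-procedure plus-image (list (cons '+ +))))
(define plus-there (restore-procedure plus-image (list (cons '+ *))))

(plus-here 2 3)   ; => 5
(plus-there 2 3)  ; => 6
```

The same image yields two different procedures, which is exactly why
the writer has to be told, not left to decide, what + means.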

Now, in
	(dump game)
	... play ...
	(restore game)
	(dump game)
	... login to another terminal ...
	(restore game)
you want the game variables to be saved, that is, statically valued,
but you want some session parameters to be dynamically valued
across sessions, taken from whichever session does the restore.
But how do you specify which is which?
And what if the set of variables you want to keep varies from dump to dump?
What if I restore the savegame while the original is still running?
Now what if you're one player in a multiplayer game?
What if that game is, e.g. the stock market?
What if at one time you're using the same software
to both do actual transactions and to do simulations?
What does dumping/restoring mean in that context?
How does the system live with the fact that
I can't dump the state of my bank account, spend money,
and restore my original account?
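
The simplest case, saving some variables and taking the rest from the
restoring session, can be sketched as follows. This is a toy under
assumptions: state is modelled as an association list, and dump-game,
restore-game and the variable names are hypothetical. It answers none
of the harder questions above about concurrency or unique resources.

```scheme
;; Keep only the entries of alist whose name is in names.
(define (keep-names alist names)
  (cond ((null? alist) '())
        ((memq (car (car alist)) names)
         (cons (car alist) (keep-names (cdr alist) names)))
        (else (keep-names (cdr alist) names))))

;; Drop the entries of alist whose name is in names.
(define (drop-names alist names)
  (cond ((null? alist) '())
        ((memq (car (car alist)) names)
         (drop-names (cdr alist) names))
        (else (cons (car alist) (drop-names (cdr alist) names)))))

;; The caller says explicitly which variables are statically valued.
(define (dump-game state saved-names)
  (keep-names state saved-names))

;; Saved variables come from the image; all others are dynamically
;; valued, i.e. taken from the session that performs the restore.
(define (restore-game image session)
  (append image (drop-names session (map car image))))

(define session-1 '((score . 120) (level . 3) (terminal . vt100)))
(define image (dump-game session-1 '(score level)))

(define session-2 '((score . 0) (level . 1) (terminal . xterm)))
(define restored (restore-game image session-2))
;; restored => ((score . 120) (level . 3) (terminal . xterm))
```

Because saved-names is an argument, the set of saved variables can
vary from dump to dump, at the price of the caller having to know it.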

What about a function that depends on some variable,
dumped at some point, and restored later,
after the original function, or other functions sharing the variable,
modified it?
Shall the restored function still use the shared variable?
* If yes, what if this shared variable has disappeared or is not available
  in the new session in which the function is restored,
  or its manager is disconnected from the current part of the network?
* If not, what if some locking is needed?
  What if I write out a function while it holds a lock?
  What if there is an essentially unique resource involved,
  whose duplication would break a software or hardware invariant?
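
The two answers give observably different functions, as this small
sketch shows. It stays within one session and only contrasts copying
a variable's contents with keeping the variable shared; make-shared-cell
and snapshot are hypothetical names.

```scheme
;; Returns a pair (writer . reader) of closures sharing one variable.
(define (make-shared-cell initial)
  (let ((value initial))
    (cons (lambda (new) (set! value new))
          (lambda () value))))

(define cell (make-shared-cell 10))
(define set-cell! (car cell))
(define read-cell (cdr cell))

;; "Dump by value": freeze the variable's current contents into a
;; fresh closure that no longer refers to the original variable.
(define (snapshot reader)
  (let ((frozen (reader)))
    (lambda () frozen)))

(define restored-copy (snapshot read-cell))  ; copy taken while value is 10
(define restored-shared read-cell)           ; still tied to the live variable

;; Another function sharing the variable modifies it after the dump.
(set-cell! 42)

(restored-copy)    ; => 10
(restored-shared)  ; => 42
```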

So basically, we face deep semantic problems:
before we can dump and restore objects,
we need a way to define their precise semantics,
which in itself can be quite elaborate,
and which gets even more complex if the object is not pure.

The only solution is providing people with a way to define and refine
the semantics of their operations: to specify the invariants
that are required to be preserved and the variants that ought to decrease,
and to verify that everything goes well.
That is, a generic portable code migration format requires -- reflection --.

Now, let's imagine we got semantics right,
and could identify the things that are statically or dynamically valued
among dumped objects. What about efficiency in
	(define a (complicated-expression-with-context-dependencies))
	(lots-of-things-using a)
	(define heavily-used-func
		(if (recognized-hardware) optimized-version generic-version))
Will you require systematic recomputation from dynamic variables every time,
because the context might have changed since the previous time?
What if the context changes in the middle of such a recomputation?
What if the "context" on which things depend is an aggregate
of things that change quickly, but a bit at a time,
and of things that change slowly, but a lot at a time?

To me, the solution lies in neat caching/invalidation techniques,
in dynamic code generation/specialization/partial evaluation
together with dynamic code discard/invalidation/trap.
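
A minimal sketch of the caching half of this idea, assuming a single
global version stamp that is bumped whenever the context changes. The
names context-version, set-hardware! and make-cached are hypothetical,
the symbols returned stand in for real specialized code, and nothing
here addresses a context that changes in the middle of a recomputation.

```scheme
(define context-version 0)   ; bumped on every context change
(define hardware 'generic)   ; the context-dependent fact

(define (set-hardware! h)
  (set! hardware h)
  (set! context-version (+ context-version 1)))

;; Wrap a thunk so that it is recomputed only when the context
;; version differs from the one seen at the last computation.
(define (make-cached compute)
  (let ((cached-version -1)
        (cached-value #f))
    (lambda ()
      (if (not (= cached-version context-version))
          (begin
            (set! cached-value (compute))
            (set! cached-version context-version)))
      cached-value)))

(define compute-count 0)     ; counts real recomputations

(define heavily-used-func
  (make-cached
   (lambda ()
     (set! compute-count (+ compute-count 1))
     (if (eq? hardware 'fast) 'optimized-version 'generic-version))))

(heavily-used-func)    ; => generic-version, computed
(heavily-used-func)    ; => generic-version, served from the cache
(set-hardware! 'fast)  ; invalidates the cache
(heavily-used-func)    ; => optimized-version, recomputed
compute-count          ; => 2
```

A single stamp is the crudest possible design: it invalidates
everything on any change, which is exactly what hurts when the context
mixes fast, small changes with slow, large ones.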

> My intuition is that this problem is just
> too hard to solve in Scheme -- but then there are languages like Oblique
> which, as I recall, solves the problem using explicit declarations.
I admit I do not know Oblique. Will have a look someday.
If you're interested in attacking this problem,
maybe you could join forces with either
the lispos project <lispos-request@math.gatech.edu>, or with
my Tunes project <http://www.eleves.ens.fr:8080/home/rideau/Tunes/>.


> -t

PS: BTW, what does ~twaddle become? I like the general idea,
but the implementation seems too secretive and expensive, to me at least.

TUNES is a Useful, Not Expedient System