[gclist] Migration (was: More questions for the FAQ)

Francois-Rene Rideau rideau@ens.fr
Sat, 23 Mar 1996 02:57:11 +0100 (MET)


> So it seems to me that if you're serious about providing fault tolerance
> to *users* (e.g., explicit saves), or if you're serious about fault
> tolerance that's general and robust (i.e., graceful degradation), you
> have to worry about keeping checkpoints at multiple granularities
> of time.
   Exactly.
   However, recovering old checkpoints is particularly tricky
when there are implicit or explicit side effects,
especially in a distributed environment,
and most of all when an old version of some object can be recovered
while a newer version is still running,
and both were meant to use the same unique resource.

   What are the usual policies for taking into account resources that
are outside the scope of the checkpoint?
   For instance, I'd like checkpointed process P to resume
by replaying exactly what was read from input channel C since the checkpoint,
before it continues with what is now available from C.
But in other cases, I may want to just forget what was read from C.
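
The two policies for channel C can be sketched as follows. This is only a
minimal illustration, not an established protocol: the `LoggedChannel`
wrapper, its `checkpoint`/`rollback` methods and the `replay` flag are
hypothetical names, and the sketch assumes that every read from C goes
through the wrapper, which logs it.

```python
from collections import deque


class LoggedChannel:
    """Hypothetical wrapper around input channel C that logs every read."""

    def __init__(self, source):
        self._source = iter(source)   # the real channel, modeled as an iterator
        self._log = []                # items read since the last checkpoint
        self._pending = deque()       # items queued for re-delivery

    def read(self):
        # Serve replayed items first, then fall through to the live channel.
        if self._pending:
            item = self._pending.popleft()
        else:
            item = next(self._source)
        self._log.append(item)
        return item

    def checkpoint(self):
        # Only reads made after this point need to be remembered.
        self._log.clear()

    def rollback(self, replay=True):
        if replay:
            # Policy 1: re-deliver the logged input, in its original order,
            # ahead of anything that was already waiting to be replayed.
            self._pending.extendleft(reversed(self._log))
        # Policy 2 (replay=False): the logged input is simply forgotten.
        self._log.clear()


def run(replay):
    chan = LoggedChannel(["a", "b", "c", "d"])
    chan.read()                    # P consumes "a"
    chan.checkpoint()              # checkpoint of P is taken here
    chan.read()                    # P consumes "b"
    chan.read()                    # P consumes "c", then gets rolled back
    chan.rollback(replay=replay)
    return chan.read()             # first item P sees after recovery
```

With `replay=True` the recovered P sees "b" again; with `replay=False` it
picks up at "d". Neither choice says anything about what the *writer* on
the other end of C should observe, which is the harder question.
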
   Another case is when object O1 has a connection to a unique object O2,
that lies on another machine/resource-space;
you checkpoint object O1 as state O1a, continue to run,
then reach state O1b.
What happens if you want to recover state O1a?
What should O2 see?
   And what happens in a global network when someone wants to recover
an old object?
Surely the whole world will not roll back for every user.
How do we manage "object boundaries" in large distributed systems?

   As for GC, who should keep garbage?
Surely, people wouldn't want to keep garbage indefinitely when *they*
are not responsible for it.
Also, I would prefer not to trust some hosts
to indefinitely keep objects I need,
and have a trusted one keep a copy instead.
   Now, another way to fight memory waste would be to merge
"identical" objects, which would be easier if some repository
managed unique IDs for well-known ones, but would still require
some magic as higher-order fine-grained modules are combined.
This leads me to managing cached partial evaluations of such
combined objects, and to managing the corresponding (local?) weak pointers
in such a way that the process of choosing which cached values to
discard is as independent as possible from the GC algorithm.
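
One way to picture this separation is hash-consing through a weak table,
with a retention policy kept apart from it. This is an assumption made for
illustration, not a mechanism described above: `Node`, `make` and
`KeepRecent` are hypothetical names, and the "keep the N most recent"
rule stands in for whatever policy one would really want.

```python
import weakref
from collections import OrderedDict


class Node:
    """Hypothetical immutable object, e.g. a cached partial evaluation."""

    def __init__(self, op, args):
        self.op = op
        self.args = args


# Weak table: it finds existing objects but never keeps them alive itself.
_table = weakref.WeakValueDictionary()


def make(op, *args):
    """Return the unique Node for (op, args), creating it only if needed."""
    # Child Nodes hash by identity, which is enough once they are unique.
    key = (op, args)
    node = _table.get(key)
    if node is None:
        node = Node(op, args)
        _table[key] = node
    return node


class KeepRecent:
    """Hypothetical retention policy: strong references to the N latest values."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._kept = OrderedDict()

    def touch(self, obj):
        self._kept[id(obj)] = obj
        self._kept.move_to_end(id(obj))
        while len(self._kept) > self.capacity:
            # Drop only the policy's own reference; reclaiming is the GC's job.
            self._kept.popitem(last=False)

    def __len__(self):
        return len(self._kept)
```

Which cached values survive is then decided by what the policy chooses to
reference; the collector merely reclaims whatever nobody references, and
the weak table forgets those entries on its own.
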
   All the costly GC add-ons should of course preferably run when
no user interaction is in progress, so that GC time can be longer than
what human users can bear.

   Surely a full distributed programming language should have options
to customize behavior beyond the defaults.
But this may become monstrous to manage,
and some mechanism to do it orthogonally, lazily, and/or interactively
should be made available,
if only to tune and test everything before it can be called reliable.

   All these questions we will have to be able to answer someday,
before a global distributed system can exist.
But once again, I see the development of
a more expressive modular language,
with high-level formals able to express low-level constraints,
as the first step toward experimenting with such a system.
   Actually, I feel that
Moving GC, Checkpointed Persistence,
Cached Partial Evaluation, and Process Migration
are all particular cases of something more general,
which I have so far called "Migration",
and which consists in being able to dynamically
abstract objects from their context and apply them in another context,
while keeping intact the "meaningful part" of the object.

--    ,        	                                ,           _ v    ~  ^  --
-- Fare -- rideau@clipper.ens.fr -- Francois-Rene Rideau -- +)ang-Vu Ban --
--                                      '                   / .          --
Join the TUNES project for a computing system based on computing freedom !
		   TUNES is a Useful, Not Expedient System
WWW page at URL: "http://www.eleves.ens.fr:8080/home/rideau/Tunes/"