In defense of Prevalence

Francois-Rene Rideau <fare@tunes.org>
Thu Feb 12 07:19:01 2004


This is my take on Prevalence, in response to the disparaging comments
made by RDBMS pundits, as reported by MAD70 on cto.
	http://cliki.tunes.org/Prevalence

Prevalence can indeed be seen as nothing but a cheap, brittle implementation trick
to achieve persistence of application data.
But that's really missing the whole point
and the change in perspective that underlies it all
(be the first to call it a "paradigm shift" and become a pundit).
What is interesting about prevalence is that it makes explicit
a different (and retrospectively obvious) factoring
of the persistence problem and its solution.
(And of course, this factoring is precisely
what I've been working on in my thesis.)

At the heart of any application is
the computational model it tries to implement:
the abstract state space and the (I/O-annotated) transitions between states.
E.g. an abstract state space made of bank account statuses and exchange rates,
and transitions being financial transactions.
E.g. a higher-order code repository,
with user commands (within contained areas) from the console or wherever,
and various auxiliary inputs.
E.g. whatever you can think of, in nested or interacting ways.
This computational model is the very essence of the application,
and anything else is but means to implement this application.
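
To make this concrete, here is a minimal sketch in Java (hypothetical
names, no persistence yet) of such a computational model: the state
space is a map of account balances, and the transitions are the
financial transactions that act on it.

    import java.util.HashMap;
    import java.util.Map;

    // The computational model: abstract state plus transitions.
    final class Bank {
        private final Map<String, Long> balances = new HashMap<>(); // cents

        // A transition of the model: a financial transaction.
        void transfer(String from, String to, long cents) {
            long available = balances.getOrDefault(from, 0L);
            if (available < cents)
                throw new IllegalStateException("insufficient funds");
            balances.put(from, available - cents);
            balances.put(to, balances.getOrDefault(to, 0L) + cents);
        }

        void deposit(String account, long cents) {
            balances.put(account, balances.getOrDefault(account, 0L) + cents);
        }

        long balance(String account) {
            return balances.getOrDefault(account, 0L);
        }
    }

Everything below -- journalling, snapshots, recovery -- is but
machinery to make such a model persist.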

Filesystems and RDBMSes provide low-level or mid-level tools
onto which you are asked to explicitly and manually map
your application's semantics.
Object persistence attempts to provide tools that directly and implicitly
map a subset of the constructs of your programming language,
so that (assuming your language's runtime and compiler were properly hacked)
you can use your usual tools to add persistence to your application.
Well, Prevalence promises persistence tailored directly
to the application's computational model,
without requiring a hacked language implementation.
The programming language, the data schema, the filesystem,
the I/O devices are all tools to achieve the goal
of building a computing system.
Pundits of various domains may want everyone
to express their problems in their domain's algebra.
Prevalence skips these middlemen and focuses
on the essence of the application domain:
the state transitions of its computational model.
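
As an illustration, here is a minimal prevalence sketch in Java -- a
hypothetical toy, not the actual Prevayler or Common Lisp Prevalence
API: each transition is reified as a serializable command that is
journalled before being applied to the in-memory model.

    import java.io.*;

    // A transition of the computational model, reified as a command.
    interface Transaction<M> extends Serializable {
        void executeOn(M model);
    }

    final class Prevalence<M> {
        private final M model;
        private final ObjectOutputStream journal;

        Prevalence(M model, File journalFile) throws IOException {
            this.model = model;
            this.journal = new ObjectOutputStream(
                new FileOutputStream(journalFile, true));
        }

        // Journal first, then apply: after a crash, replaying the
        // journal re-runs the exact same sequence of transitions.
        synchronized void execute(Transaction<M> t) throws IOException {
            journal.writeObject(t);
            journal.flush();
            t.executeOn(model);
        }
    }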

Note how this view of the world is neutral to data representation:
if your application domain is indeed but a data repository
with completely arbitrary unrelated modifications being done on it,
then a data schema will indeed be the best way to model it;
but even then, Prevalence is a good implementation technique
to persist your data robustly.
In this narrow case, a journal of modifications since day one
may be a poor way to encode the state of the system,
but Prevalence doesn't mandate the journal
as the main and only representation --
it only provides a journal since the last full dump as a way
to robustly recover the latest state,
orthogonal to the means of dumping and restoring memory.
And in any case, note that there are
many potential algebras to describe your data;
despite what pundits say, a relational model needn't be the right one
-- few people encode the structure of 3D objects or music samples
in relational tables.
With prevalence, you don't have to fit your schema to a specific algebra
for which robust persistence was implemented,
you choose whichever representation algebra you wish
-- and maybe even several different representations on each mirror,
so as to accommodate different kinds of queries.
And no, it doesn't have to be an "in-memory" representation;
it could be file-based, too.
It just has to be one isolated world, kept coherent
by admitting no meddling except through journalled transactions.
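
Recovery then factors neatly into two orthogonal halves, as in this
sketch (reusing the hypothetical Transaction interface above, and
assuming the model is serializable for the dump): restore the last
full dump, whatever its representation, then replay the journal
written since.

    import java.io.*;

    final class Recovery {
        @SuppressWarnings("unchecked")
        static <M> M recover(File snapshot, File journal, M fresh)
                throws IOException, ClassNotFoundException {
            M model = fresh;
            if (snapshot.exists()) {
                try (ObjectInputStream in = new ObjectInputStream(
                        new FileInputStream(snapshot))) {
                    model = (M) in.readObject(); // last full dump
                }
            }
            if (journal.exists()) {
                try (ObjectInputStream in = new ObjectInputStream(
                        new FileInputStream(journal))) {
                    while (true) {
                        Transaction<M> t = (Transaction<M>) in.readObject();
                        t.executeOn(model); // re-run each journalled transition
                    }
                } catch (EOFException endOfJournal) {
                    // end of journal: the latest state has been restored
                }
            }
            return model;
        }
    }

A real implementation would rotate the journal whenever a new snapshot
is taken; the point is that dumping state and journalling transitions
remain independent concerns.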

But more importantly, Prevalence makes you focus
on the dynamics of the application.
It makes you think about the atomic transformations of your system.
You may have many representations and change representations with time;
prevalence makes you realize that the lingua franca
between all these possible implementations
will be the very natural language of your system
-- the language of its state transitions.
And incidentally, when you make formal proofs
of declarative properties about a system
(as I have done in the past for security invariants),
this is precisely the kind of thing you do:
consider the possible transitions of the system.
Prevalence is all about realizing that the persistence of application data,
like just about everything else in an application,
is better factored around the abstract computation model of the application
than around the low-level operations of data-storage (be it files or tables)
or other computing tricks.

As for the brittleness of using Prevalence through simple libraries
such as Prevayler or Common Lisp Prevalence
that rely on your manually matching your journal structure
to meaningful operations, well,
note that coherence can be enforced with a simple discipline,
and that, most importantly, this discipline can be largely automated
-- which is what object databases do, in a way.
Actually, the essence of Prevalence being but
the factoring of robust persistence through journalling, we can see
all (or at least most) robustly persistent implementations of any model
as using prevalence internally, only in an ad hoc way fit to said model.
Such software infrastructure may relieve developers
from having to manually enforce the coherence of the persistence layer,
but then again, it requires them to manually enforce the coherence
of the mapping of their application to the persistence layer;
this is but a displacement, which might or might not be a net gain,
but at least isn't the clear-cut gain-without-cost
that specific-persistence-method pundits would have us believe.
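
The discipline in question can be sketched as follows (again reusing
the hypothetical Bank and Transaction above): every mutation goes
through a journalled transaction, and each transaction is
deterministic -- nondeterministic inputs such as the clock are
captured as fields when the transaction is created, not read during
execution, so that replaying the journal yields the same state.

    import java.util.Date;

    final class Deposit implements Transaction<Bank> {
        private final String account;
        private final long cents;
        private final Date when; // captured at creation time, replayed as-is

        Deposit(String account, long cents, Date when) {
            this.account = account;
            this.cents = cents;
            this.when = when;
        }

        public void executeOn(Bank bank) {
            bank.deposit(account, cents); // no I/O, no clock reads in here
        }
    }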

As for the applicability of Prevalence ideas to TUNES,
I think that the factoring brought by Prevalence
is an essential conceptual tool, and a nice opportunity to use reflection.
Reflection can be used to automatically enforce coherence
of the journal structure with the application structure,
while making things explicit at one place
and then implicit the rest of the time.
My bet is that Prevalence plus reflection can achieve cheaply
what expensive object databases provide,
in a way that is tailored to an application.
Reflection is a tool that allows one to take advantage
of a good meta-level factoring of software.
Prevalence is part of such a factoring.
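
For instance -- a sketch only, using Java's dynamic proxies as a
stand-in for fuller reflection -- one can derive the journalling layer
mechanically from the application's own interface, so that the journal
structure cannot drift out of sync with the application structure:

    import java.io.*;
    import java.lang.reflect.*;

    final class JournallingProxy {
        // Wrap a model so that every call on its interface is journalled
        // as a (method name, arguments) record before being forwarded.
        @SuppressWarnings("unchecked")
        static <M> M wrap(M model, Class<M> iface,
                          ObjectOutputStream journal) {
            return (M) Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] { iface },
                (proxy, method, args) -> {
                    synchronized (journal) {
                        journal.writeObject(method.getName());
                        journal.writeObject(args); // arguments must be serializable
                        journal.flush();
                    }
                    return method.invoke(model, args); // then apply the transition
                });
        }
    }

Here the journalling is declared once, at the interface, and
maintained implicitly everywhere else -- explicit at one place,
implicit the rest of the time.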

Cheers,

[ François-René ÐVB Rideau | Reflection&Cybernethics | http://fare.tunes.org ]
[  TUNES project for a Free Reflective Computing System  | http://tunes.org  ]
The worst thing that can happen to a good cause is, not to be
skillfully attacked, but to be ineptly defended. -- F. Bastiat