Persist or not persist : comment

shapj@us.ibm.com
Tue, 26 Oct 1999 11:08:33 -0400


Sorry for the delay in replying.  We have some experience to offer that bears
on this discussion.

EROS is a persistent system with explicitly managed storage (i.e. explicit
deletion).  We have found that persistence really does simplify things, and
(surprisingly) that it considerably improves I/O performance.  Those interested
might wish to take a look at the KeyKOS checkpoint paper (at
http://www.cis.upenn.edu/~KeyKOS/) or the latest EROS paper (top paper on the
papers page at the EROS site at www.eros-os.org).

Persistence does not eliminate the need for explicit saves in all cases.  The common
misconception is that the only reason for explicit saves is to store the
information.  This is not so.  Another common case -- perhaps a more important
case -- is serializing data in such a way as to provide a canonical format for
interchange.  This has nothing to do with whether the system as a whole is
persistent -- the EROS "file system" is a user-level application.  Rather, it's
best thought of as a kind of format conversion.  Explicit save is necessary, but
it's an application issue rather than a system issue.
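
To make the "format conversion" point concrete, here is a minimal sketch in
Python.  It is not EROS code; the Note type and the choice of sorted-key JSON
as the canonical form are assumptions made purely for illustration.  The
object itself would simply live on in a persistent system; the export exists
only so that another party can read it.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Note:
    # Hypothetical application object.  In a persistent system it needs no
    # save in order to survive; it just stays where it is.
    title: str
    tags: list
    body: str


def export_canonical(note: Note) -> bytes:
    """Convert the live object to one canonical byte form for interchange."""
    # Sorted keys and fixed separators mean equal objects always produce
    # identical bytes, whichever system does the exporting.
    text = json.dumps(asdict(note), sort_keys=True,
                      separators=(",", ":"), ensure_ascii=True)
    return text.encode("ascii")


def import_canonical(data: bytes) -> Note:
    """Rebuild an application object from the canonical form."""
    return Note(**json.loads(data.decode("ascii")))
```

Note that nothing here touches the storage layer.  Both functions are
ordinary application code, which is the sense in which explicit save is an
application issue.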

As to GC, it is fraught with peril for serious systems.  Timing variance that is
acceptable in applications is not acceptable in systems as a whole. And don't
hassle me about the incremental GC literature; I know about it.  There are
bigger problems: covert channels and accounting.

If I have paid for the storage, the system can certainly help me locate it if I
lose track of it, but it cannot be permitted to allocate it to anyone else.  If
it reallocates the storage, then third parties can observe storage availability
and detect surprising things about the behavior of my program.  If GC is done,
then, it must ensure that recovered storage is recovered to the proper
allocation pool.  Most languages providing for GC'd memory don't yet incorporate
the notion of distinct logical storage pools.
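
A rough sketch of what "recovered to the proper allocation pool" could look
like, again in Python and again purely illustrative.  StoragePool, collect,
and the page-count model are hypothetical names of my own, not anything taken
from EROS or from an existing collector.  The point is only that reclaimed
storage goes back to the pool that paid for it, so one owner's garbage never
shows up as free space for another.

```python
class StoragePool:
    """Hypothetical per-owner pool: a fixed number of pages that were paid for."""

    def __init__(self, owner, pages):
        self.owner = owner
        self.free = pages      # pages still available to this owner only
        self.live = set()      # handles currently allocated from this pool
        self._next_id = 0

    def allocate(self):
        if self.free == 0:
            # Exhaustion is reported to the owner; we never borrow elsewhere.
            raise MemoryError(f"pool of {self.owner} is exhausted")
        self.free -= 1
        self._next_id += 1
        handle = (self.owner, self._next_id)
        self.live.add(handle)
        return handle

    def reclaim(self, handle):
        """Return a page to this pool, and only to this pool."""
        if handle not in self.live:
            raise ValueError("handle does not belong to this pool")
        self.live.remove(handle)
        self.free += 1


def collect(pools, reachable):
    """Toy collector: reclaim unreachable handles into their owning pool."""
    for pool in pools.values():
        for handle in list(pool.live):
            if handle not in reachable:
                pool.reclaim(handle)
```

With two pools, collecting one owner's garbage changes that owner's free
count and nobody else's, so a third party watching its own pool learns
nothing about it.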

Finally, at the risk of getting pissed on for being too low level, there are
portions of a system that should not rely on GC.  Somewhere down toward the
bottom the rubber has to meet the road, and the system must either acknowledge
that it operates in a world of bounded resources or risk deadlock.
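
One way to picture "acknowledging bounded resources" is a sketch like the
following, where an operation reserves everything it needs up front and is
refused immediately if the budget cannot cover it, instead of blocking
halfway through while holding part of what it needs.  BoundedBudget and
run_operation are hypothetical names used only for this illustration.

```python
class BoundedBudget:
    """Hypothetical fixed budget of some resource (pages, buffers, ...)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.reserved = 0

    def try_reserve(self, amount):
        """Reserve all of `amount` or nothing; never wait."""
        if amount < 0:
            raise ValueError("amount must be non-negative")
        if self.reserved + amount > self.capacity:
            return False
        self.reserved += amount
        return True

    def release(self, amount):
        if amount < 0 or amount > self.reserved:
            raise ValueError("releasing more than was reserved")
        self.reserved -= amount


def run_operation(budget, needed, work):
    """Run `work` only if its whole requirement can be reserved first."""
    if not budget.try_reserve(needed):
        return None  # refused up front; the caller decides what to do next
    try:
        return work()
    finally:
        budget.release(needed)
```

Because a refused operation holds nothing, it cannot become one link in a
cycle of waiters.  The cost is that the caller has to know its requirement in
advance, which is exactly the kind of low-level bookkeeping at issue here.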

Fundamentally, my objection to the "that's too low level" meme is that it
effectively prevents the Tunes discussions from ever getting grounded.  It's
important to think about higher level abstractions and improving the state of
the programming world.  Those abstractions won't be useful if they can't be
implemented, and in the end that requires dealing with all of the low-level
grunge.  It appears to me that there is a deep-seated denial of this reality
among certain members of this list.

Jonathan S. Shapiro, Ph. D.
IBM T.J. Watson Research Center
Email: shapj@us.ibm.com
Phone: +1 914 784 7085  (Tieline: 863)
Fax: +1 914 784 7595