[gclist] Garbage collection FAQ: distributed systems

Marc Shapiro shapiro@prof.inria.fr
Fri, 1 Mar 1996 17:26:05 +0100


(I am CC'ing to Chase because my posts don't seem to make it to the mailing
list.)

Here is an outline for the part of the FAQ on collecting distributed systems,
file systems, databases, and caching stores.

DISTRIBUTED SYSTEMS

What is different about distributed systems?
 - messages are costly
 - asynchrony: slow messages, slow processors, no single global order
 - failures 
    * messages are lost, duplicated, delivered out-of-order 
    * processes fail
    * can't reliably detect failures
 - consistency

Standard distributed systems algorithms (snapshots, transactions): costly,
  don't scale, needlessly general.

Distributed reference counting
 - the naive approach: unsafe, costly
 - weighted reference count
 - reference listing
 - pro: scales; con: cyclic garbage, no root identity

Distributed mark and sweep
 - the naive approach: costly
 - Hugues algorithm: use timestamps (rather than mark bits); GVT
 - Ladin and Liskov: centralized GC
 - pro: fault tolerant; con: doesn't scale

Partitioned GC (the hybrid approach)
 - trace each subset part, count (or list) across boundaries
 - Network Objects
 - SSP Chains: fault tolerance
 - Lang, Queinnec, Piquer (unnecessarily complicated): collecting cycles
 - con: no root identity, no persistence

DSM, cached distributed stores (see also: GC of persistent stores)
 - what is different: inconsistent data, variable caching
 - key insights:
    * a subset is not necessarily a process or a set of processes
    * it is OK to reorder events, as long as creation events are ordered
      before destruction events that (may) causally depend on the former. 
 - Larchant

GARBAGE COLLECTING DATABASES, PERSISTENT STORES, AND FILE SYSTEMS

What is different:
 - lots of data on disk, I/O is costly
 - inconsistent data
 - transaction rollback
 - clustering

Reference counting and listing, mark and sweep: see distributed systems

Partitioned GC
 - see distributed systems
 - selecting the best partition to trace
 - Naughton
 - Sprite LFS

Tolerating transaction rollback and inconsistent data
 - note the identity between the two problems
 - Amsaleg
 - Larchant

Clustering
 - EOS

-- 
						Marc Shapiro

M. Shapiro, INRIA Rocquencourt, BP 105, 78153 Le Chesnay Cedex, France.
Tel.: +33 (1) 39-63-53-25.  Fax: +33 (1) 39-63-53-30.
e-mail: marc.shapiro@inria.fr.  http://www-sor.inria.fr/SOR/