[gclist] Garbage collection FAQ: distributed systems
Charles Fiterman
cef@geode.geodesic.com
Fri, 1 Mar 1996 11:45:49 -0600
> Here is an outline for the part of the FAQ on collecting distributed systems,
> file systems, databases, and caching stores.
>
> DISTRIBUTED SYSTEMS
>
> What is different about distributed systems?
> - messages are costly
> - asynchrony: slow messages, slow processors, no single global order
> - failures
> * messages are lost, duplicated, delivered out-of-order
> * processes fail
> * can't reliably detect failures
Missing here is that processors may be unavalable. However you can
allow leaks here if you simply do this. When processor A asks processor
B for a pointer to an object processor B remembers that it exists.
> - consistency
>
> Standard distributed systems algorithms (snapshots, transactions): costly,
> don't scale, needlessly general.
>
> Distributed reference counting
> - the naive approach: unsafe, costly
> - weighted reference count
> - reference listing
> - pro: scales; con: cyclic garbage, no root identity
>
> Distributed mark and sweep
> - the naive approach: costly
> - Hugues algorithm: use timestamps (rather than mark bits); GVT
> - Ladin and Liskov: centralized GC
> - pro: fault tolerant; con: doesn't scale
Distributed Treadmill with synchronization.
- Independent processors do their own gc treating externaly held
pointers as roots. At synchronization time unavalable processors
are regarded as keeping all their known pointers.
>
> Partitioned GC (the hybrid approach)
> - trace each subset part, count (or list) across boundaries
> - Network Objects
> - SSP Chains: fault tolerance
> - Lang, Queinnec, Piquer (unnecessarily complicated): collecting cycles
> - con: no root identity, no persistence
>
> DSM, cached distributed stores (see also: GC of persistent stores)
> - what is different: inconsistent data, variable caching
> - key insights:
> * a subset is not necessarily a process or a set of processes
> * it is OK to reorder events, as long as creation events are ordered
> before destruction events that (may) causally depend on the former.
> - Larchant
>
> GARBAGE COLLECTING DATABASES, PERSISTENT STORES, AND FILE SYSTEMS
>
> What is different:
> - lots of data on disk, I/O is costly
> - inconsistent data
> - transaction rollback
> - clustering
>
> Reference counting and listing, mark and sweep: see distributed systems
>
> Partitioned GC
> - see distributed systems
> - selecting the best partition to trace
> - Naughton
> - Sprite LFS
>
> Tolerating transaction rollback and inconsistent data
> - note the identity between the two problems
> - Amsaleg
> - Larchant
>
> Clustering
> - EOS
>
> --
> Marc Shapiro
>
> M. Shapiro, INRIA Rocquencourt, BP 105, 78153 Le Chesnay Cedex, France.
> Tel.: +33 (1) 39-63-53-25. Fax: +33 (1) 39-63-53-30.
> e-mail: marc.shapiro@inria.fr. http://www-sor.inria.fr/SOR/
>
>