[gclist] More questions for the FAQ

Henry G. Baker hbaker@netcom.com
Wed, 20 Mar 1996 15:42:01 -0800 (PST)


> >From: David Chase <chase@centerline.com>
> >I have yet to digest all this discussion about finalization.
> 
> Yeah, me too... I've lost it completely.  I hope you do a great job.  :-)
> 
> >However, I do have some more proposed questions for the FAQ, but 
> >lack complete answers.  Most of these are for the evangelism
> >section.
> >
> >Q: What languages include/require/provide-by-default garbage collection?
> >A: Lisp, Smalltalk, ML, most of the scripting languages, Java, 
> >   Eiffel, Modula-3, Sather, Python, ... and how far should this 
> >   list go?  There's lots of GC'd languages that nobody has ever
> >   heard of.  I have some personal favorites, but that doesn't necessarily 
> >   count as a good reason to put them in the FAQ.
> 
> I'd say it's good to go with popular stuff.  For example, Perl uses
> reference counting if I'm not mistaken.  And Mathematica, I'm pretty
> sure.

I seem to recall that Wolfram never really warmed up to GC.  I think that
Mathematica at one point used refcounts.  Why don't you ask them?

BTW, Maple _does_ use something like hash-consing to 'cache' mathematical
answers.

> One of Henry's favorite examples is PostScript.  (Level II only, or
> Level I, too?  I'm not sure.)  It's nice for people to know they
> often use GC whether they're told about it or not.

PS Level 2 requires GC.  Peter Deutsch is the expert on at least one
implementation of this (Ghostscript).  PS Level 1 used stack allocation
only (save/restore, i.e., mark/release in the Forth style).
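
The mark/release discipline mentioned above can be sketched as a toy
arena.  This is only an illustration under assumed names (Arena, alloc,
mark, release are hypothetical), not how any PostScript interpreter is
actually written.  Everything allocated after a mark is discarded in
one step by the matching release, with no tracing at all.

```python
class Arena:
    """Stack-style allocator: release(m) frees everything allocated since m."""

    def __init__(self):
        self._cells = []

    def alloc(self, value):
        """Store value and return its handle (an index into the arena)."""
        self._cells.append(value)
        return len(self._cells) - 1

    def mark(self):
        """Remember the current top of the arena."""
        return len(self._cells)

    def release(self, mark):
        """Discard every cell allocated after the given mark."""
        del self._cells[mark:]

    def get(self, handle):
        return self._cells[handle]

    def live(self):
        return len(self._cells)


arena = Arena()
keep = arena.alloc("page setup")
m = arena.mark()
arena.alloc("temporary path")
arena.alloc("temporary string")
arena.release(m)                     # both temporaries vanish together
```

The price of this simplicity is that a handle obtained after the mark
dangles once the release has happened, which is exactly the kind of
error a tracing collector rules out.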

> I believe many popular apps have GC's hidden in them, like various
> editors, spreadsheets and word processors, but I don't know which ones. 
> The more popular, the better.  (Emacs of course, but that's a sore point
> because it does it badly.)  Anybody have good examples?

Joel Bartlett knows of at least one CAD system at DEC that uses GC,
because he put it there.  :-)  You should ask Dan Bricklin & Bob Frankston
whether the original VisiCalc spreadsheet did anything that they would
consider to be 'GC' to recover memory.  It's quite clear that it does
dynamic memory allocation, though.

I know that M$Word does something akin to GC, because if you save a
file in its 'native/default' format, it saves a lot of garbage & the
file is a lot bigger (upwards of 10X bigger) than if you force a GC by
doing a 'save as'.  Furthermore, if you do some editing, delete some
paragraphs, save the file, and then look at it with a hex/octal
dump, the deleted paragraphs are still 'there' in the file,
although they probably aren't live.  This is a classic security hole,
because people routinely attach M$Word files to email, not realizing
that the paragraphs which were deleted or toned down for the final
version are still in the attached document!!

(BTW, this is a serious security problem with GC in general, which is
that it is _too lazy_ about actually overwriting bits after deletion.
This is why I advocate 'linear' types in addition to GC, so that people
can force prompt deletion for certain important types.)

> File systems usually use reference counting, which is good for people
> to know, too.  Makes the general idea of GC seem less flaky.

File systems use GC all the time.  It's called 'recovery after a crash'.
In Unix, it's called fsck, and it does a full tracing GC.
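
To make the analogy concrete, here is a toy version of such a checker.
It is only loosely modelled on what a real file system checker does,
and the data layout (a dict of directories and a dict of stored link
counts) is an assumption made for the example.  It traces from the root
directory, recounts the links it actually finds, and reports both
unreachable inodes and stored reference counts that disagree with the
trace.

```python
def check(root, dirs, link_count):
    """Trace from root and compare what is found with the stored counts.

    dirs:       inode -> {name: child inode}, for directory inodes only
    link_count: inode -> link count stored in the inode table
    Returns (orphans, wrong), where wrong maps inode -> (stored, found).
    """
    found = {}                       # inode -> links seen during the trace
    visited = {root}
    stack = [root]
    while stack:                     # plain depth-first marking
        d = stack.pop()
        for child in dirs.get(d, {}).values():
            found[child] = found.get(child, 0) + 1
            if child not in visited:
                visited.add(child)
                stack.append(child)
    orphans = sorted(set(link_count) - visited)
    wrong = {ino: (link_count.get(ino), found[ino])
             for ino in visited
             if ino != root and link_count.get(ino) != found[ino]}
    return orphans, wrong


# Inode 2 is linked from two directories but its stored count says 1;
# inode 4 is allocated but not reachable from the root.
dirs = {1: {"a": 2, "b": 3}, 3: {"c": 2}}
link_count = {1: 1, 2: 1, 3: 1, 4: 1}
orphans, wrong = check(1, dirs, link_count)
```

In this example the trace reports inode 4 as garbage and inode 2 as
having a stale count, which is the sense in which reference counting in
normal operation is backed up by a tracing pass after a crash.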

Every DB system I've ever seen, whether originally designed that
way or not, eventually has to have a full tracing 'consistency
checker'.  Sometimes it's called DB 'reorganization', in which case it
is often a _copying_ GC.  After my RTGC CACM paper was published, I got
this nasty letter from some DB type person who said that it was his idea
to do incremental DB 'reorganization' to allow 'real time' transactions
to continue.  So even though you won't find very many _published_ references
in the DB literature acknowledging that they ever heard of GC, it is clear
that some idea transfer has taken place.
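
The sense in which a 'reorganization' is a copying GC can be sketched
as follows.  This is a hedged toy, not any particular DB product's
algorithm: the record layout (a payload plus a list of record numbers)
and the name reorganize are assumptions made for the example.  Live
records are copied into a fresh store in the order they are reached,
and a forwarding table rewrites the references; dead records are simply
never copied.

```python
def reorganize(store, roots):
    """Copy the records reachable from roots into a new, compact store.

    store: list of (payload, refs), where refs are indexes into store
    roots: list of indexes of records known to be live
    Returns (new_store, new_roots) with all references renumbered.
    """
    forward = {}                     # old index -> new index
    new_store = []

    def copy(old):
        if old not in forward:       # first visit: reserve a new slot
            forward[old] = len(new_store)
            new_store.append(store[old])   # refs are fixed up when scanned
        return forward[old]

    new_roots = [copy(r) for r in roots]
    scan = 0
    while scan < len(new_store):     # Cheney-style scan of copied records
        payload, refs = new_store[scan]
        new_store[scan] = (payload, [copy(r) for r in refs])
        scan += 1
    return new_store, new_roots


old = [("A", [2]), ("dead", [0]), ("B", [2, 0]), ("dead2", [])]
new, new_roots = reorganize(old, [0])
```

This version stops the world while it copies; the incremental variant
discussed above would interleave the scan loop with ongoing
transactions, which this sketch does not attempt.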

-- 
Henry Baker
www/ftp directory:
ftp.netcom.com:/pub/hb/hbaker/home.html