[gclist] What does a garbage collector do about

David Chase chase@world.std.com
Sun, 28 Jul 2002 23:16:06 -0400

At 3:11 PM -0400 7/28/02, Greg Hudson wrote:
>On Sun, 2002-07-28 at 14:28, David Chase wrote:
>>  I am looking at this from the POV of someone providing libraries
>>  to clients.  If someone doesn't explicitly close the file, whatever
>>  might have gone bad on the close, is still out there.  The close is
>>  just the bearer of the bad news, and the problem was not avoided by
>>  not doing the close.
>Yes, but you can recover from it gracefully, e.g. not report that a
>piece of mail was successfully enqueued for delivery.

I'm not following this at all.  If finalizers don't do closes, then
nothing gets reported at all.  If the finalizers do closes, and
if the client code leaves a file unclosed, and there is an error
on the close, then the finalizer can either not report the error,
or try to report the error without much context.

I think I see a difference of assumptions here.  If I am writing a
library, I may wish to proclaim that correct client code should
take care of closes, but it is risky and unfriendly for the library
to not reclaim unclosed file descriptors in finalizers anyway.  The
alternative is to leak file descriptors, either to the OS (and by
what rule is it that the OS reclaims the file descriptor?) or
permanently, if there is no OS.  I absolutely agree that whenever
possible, and especially in code that is important, the client code
should close files.  However, the library does not have the option
of leaking, and cannot assume that all client code is correct.
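
A minimal sketch of that safety net, in Python rather than any particular
library's API.  The names LibraryFile and _close_fd, and the log list
standing in for whatever reporting channel the library has, are all
hypothetical.  It assumes a runtime with weakref.finalize; the point is only
that an explicit close() reports errors with full context, while the
finalizer reclaims the descriptor and can say little more than which
descriptor it was.

```python
import os
import weakref


def _close_fd(fd, log):
    """Last-resort close; all it can report on failure is the fd and errno."""
    try:
        os.close(fd)
        log.append(("closed-by-finalizer", fd))
    except OSError as err:
        log.append(("close-failed-in-finalizer", fd, err.errno))


class LibraryFile:
    """Hypothetical library handle: clients are expected to call close()."""

    def __init__(self, path, log):
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
        # The callback must not reference self, or the object never dies.
        self._finalizer = weakref.finalize(self, _close_fd, self.fd, log)

    def close(self):
        # Explicit close: any OSError propagates to the caller, who still
        # knows which file this was and what it was being used for.
        if self._finalizer.detach() is not None:
            os.close(self.fd)
```

A client that closes its own files never triggers the finalizer path; a
client that forgets still gets the descriptor back, just without a useful
error report.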

>>  In the case that you describe, I think that the AFS file objects
>>  should contain enough state to get new tokens and get the job
>>  done, if this is important.
>Sorry to argue based on an esoteric example, but this makes no sense.
>AFS tokens are life-limited for security reasons; you would need the
>user's password to get new ones.  Code which doesn't check close() error
>codes runs the risk of not knowing when a file couldn't be written, and
>perhaps acting on a bad assumption as a result.

"State" includes enough state to ask the user for a password.
I would not be a happy user in this case, but the other options
available (given that the file descriptor was leaked) are:

a) don't do the close (no close in the finalizer)
b) don't do the close (close in finalizer failed)

These are not better.  You seem to be arguing that if I remove the
close in the finalizer, then the leak will go away (and that this
will be a better inducement to programmers than telling them that they
should close their file descriptors, and providing them tools to help).
If finalizers don't close file descriptors, then (in a world of
fallible but well-intentioned programmers) the net result will
be more failures to close files -- exceptional cases won't get
properly tested and will slip through.

>>  I think it is the best of the bad models currently available.  It replaces
>>  atrocities like signal handlers preempting the current thread, wherever
>>  it happens to be.
>Eh?  Synchronous signal delivery doesn't require a multithreaded
>programming model; it is just as compatible with (for instance) an
>event-handling model.

Signal handling is not generally synchronous, and the actions that
can occur in a signal handler are quite limited.  I think we are talking
about different things.  Can you give some examples of synchronous
signal delivery?  Asynchronous examples (from C) include timer
expiration, user-defined signals, keyboard interrupts, i/o completion,
and (sometimes) floating point problems.

If you have a GC, and if it has weak references in the style supported
by Java, or if it has finalizers, you generally have threads.  The
finalizers really don't have other options for places to run.  Invoking
them at the GC site (which is not well-defined; it could be a load from
a field) runs the risk of running them in a peculiar synchronization
context.  Leaving
a queue of work to be done when the client code can poll it is about the
only possibility I can see that doesn't use threads, and that sounds
tricky to me.
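
For concreteness, here is a rough sketch of what such a polled queue might
look like, again in Python.  FinalizationQueue, register, and poll are
hypothetical names, not an existing API.  The weak-reference callback runs
wherever the collector happens to notice the death, so it does nothing but
enqueue; the real cleanup runs only when the client calls poll(), in a
context the client chose.

```python
import collections
import weakref


class FinalizationQueue:
    """Hypothetical helper: deaths are queued, cleanup runs only on poll()."""

    def __init__(self):
        self._pending = collections.deque()
        self._refs = {}  # keeps the weak references themselves alive

    def register(self, obj, action):
        """Arrange for action() to be queued once obj is collected.

        The action must not hold a reference to obj.
        """
        def on_death(ref):
            # Runs at an unpredictable point, so it only records the work.
            self._refs.pop(id(ref), None)
            self._pending.append(action)

        ref = weakref.ref(obj, on_death)
        self._refs[id(ref)] = ref

    def poll(self):
        """Run queued actions in the caller's context; return how many ran."""
        ran = 0
        while self._pending:
            self._pending.popleft()()
            ran += 1
        return ran
```

The tricky part is visible even here: nothing is cleaned up unless the
client remembers to poll, and it has to pick points where running arbitrary
cleanup code is safe.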

If you are trying to write something like a server (just for example)
it is an enormous simplification to use a thread-(or two)-
per-transaction model.  I would not like to think about or write one of
these in a single-threaded system, especially if I hoped to get some
performance gains from a multiprocessor.  Now, this does require a
threading system that can support a lot of threads, but since I have
personally written one, I think that such a thing is an option.
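
As an illustration of why the thread-per-transaction shape is simple, here
is a small Python sketch.  handle_transaction and serve are hypothetical
names, the upper-casing is a stand-in for real work, and the connections
are assumed to be already-accepted sockets (in a real server they would
come from accept()).  It shows the structure only: with the usual CPython
global interpreter lock, these threads would not by themselves give
multiprocessor speedups.

```python
import socket
import threading


def handle_transaction(conn):
    """One thread per transaction: plain blocking code, no event loop."""
    with conn:
        request = conn.recv(1024)
        conn.sendall(request.upper())  # stand-in for real work


def serve(connections):
    """Give each connected socket its own thread, then wait for them all."""
    workers = [
        threading.Thread(target=handle_transaction, args=(conn,))
        for conn in connections
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
```

Each handler reads top to bottom as ordinary sequential code; the state of
a transaction lives on its thread's stack rather than in a hand-built state
machine.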