[gclist] Re: gclist-digest V2 #110

Arch Robison robison@kai.com
Thu, 22 Jan 1998 16:56:32 -0600 (CST)

> It is *very* difficult to accurately determine the probabilities on
> something like this. The analysis can be much more complex than the code
> it would replace!

Yes, this is certainly true.  In the situation I was in, the analysis
was fortunately quite simple.  In GC systems, accurate estimates
would seem to be hopeless.  My complaint was really with the "1 in 1000"
quote, as it implied an order of magnitude without data to back it up.
> Another factor to consider is the cost of *debugging* something like
> this, if it does arise. I've seen similar reasoning be flawed, and the
> resulting problems were *not* easy to track down..

True.  But if the failure can be automatically and decisively reported, 
this is not a problem.  Yes, I certainly would not use a probabilistic
algorithm that could not report when it failed.

> Yes, your complicated code may have a greater than 1 in 10^n chance of
> being wrong. However, if it is, you have a much better chance of being
> able to test for the defect and fixing it. Once fully tested and fixed,
> the probability of a defect drop to zero. 

This is only true if you have no code updates and ideal machines.
Every line of code written adds to the overall size of the project,
and complicates updates.  The more complicated an update is, the more
likely that a patching error will occur.  The compiler may fail. 
(Alas, I'm sometimes guilty there!)  Furthermore, when very low 
probabilities (e.g. 1 atom per mole) are involved, the "perfect" solution
may make so many more accesses to the network or file system, or use more 
memory, that it raises the chance of hardware failure well above the failure
rate of the "imperfect" solution.

> A good coverage tool, like VisualCoverage from tracepoint.com is 
> invaluable here!

Coverage tools are invaluable, but not perfect.  The complete state space
of any non-trivial program is impossible to cover.  And most coverage
tools only deal with single executables.  Large networks of interacting
processes add another dimension.

> As a general rule, I think that with good testing techniques, the risk
> of bugs can usually be reduced to *below* the risk of mis-estimating the
> probabilities, and the cost can be reduced to a reasonable level.
> But in cases where the downside risk is small (needing to retry an
> operation, say) and the cost of the alternative high, there can be merit
> to this.

Yes.  Probabilistic techniques are a last resort.  And the risk estimates
are critical.
> But I wouldn't want my life-support system coded to this standard.

I'm not sure.  Given two options, a probabilistic algorithm with a short
(alleged) proof of correctness and chance of failure of 1 in 10^23 years,
and a "perfect" algorithm with a very long (alleged) proof of correctness,
I'd want to consider both very carefully.  And probably demand a short 
perfect algorithm, or better yet, a computerless life-support system :-)

Just my opinions on an issue with many conflicting answers.  

Arch D. Robison			    Kuck & Associates Inc.
robison@kai.com 		    1906 Fox Drive
217-356-2288 ext. 56		    Champaign IL 61820
Lead Developer for KAI C++	    http://www.kai.com/C_plus_plus/index.html