[gclist] Question from a newbie

David Chase chase@world.std.com
Sun, 11 May 2003 17:42:33 -0400


At 01:24 PM 5/11/2003 -0700, Bakul Shah wrote:
>As for any optimization impact, I don't have an answer;
>merely an observation that aggressive optimizations use
>assumptions that _break_ the so-called `contract' as soon as
>you bring in debugging.

The contract is the language specification.  If you
can write a program that demonstrates failure to follow
the specification, then you have an optimizer bug.  The
specification, unfortunately, is a little vague in places,
but I can tell you that in practice real programs don't
appear to depend on these various optimizations not being
done.  There are also some cases where the specification
said one thing, and everyone who finally understood the
implications agreed that it should say something else
(this is the case for the memory model, which currently
has the consequence that "reads kill").

>Kind of ironic that people are still optimizing for
>performance and not debuggability even when we have
>progressed from 100Mhz to 2+Ghz processors in less than 10
>years and when systems are increasingly more complex and
>harder to debug.

I personally agree, but people still seem to look at
minor differences in performance when choosing systems.
Never mind how fast processors have become -- any
"industry-standard" benchmark becomes the object of
benchmark cheats in relatively short order.  (I direct
your attention to http://world.std.com/~chase/skeptical.html,
written for NaturalBridge, detailing a number of such
cheats or potential cheats.)  So, sure, it's ironic, 
but it's also a fact of life, and it is counterproductive
to ignore it.

There is also more than one kind of optimization.

Supposing, for example, you wanted to run 750 chat rooms
in the Volano benchmark.  20 users per room, 2 threads
per user, 30000 threads.  In a 2 gigabyte address space
(typical on a 32-bit machine), if you give each of those
threads a 70kbyte stack, you have completely consumed
your address space with thread stacks.  In practice each
thread needs somewhat less than 70kbytes, but you are now
definitely in a world where storage use matters.  It makes
sense to:

- apply tail-call elimination where possible.
- "register"-allocate the memory cells in the
  activation record.
- avoid extraneous per-thread garbage.
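The stack arithmetic above is easy to check; here is a quick
sketch using the numbers from the scenario (the 70-kbyte
per-thread stack is the figure used above):

```python
# Back-of-the-envelope check of the Volano thread-stack arithmetic.
rooms = 750
users_per_room = 20
threads_per_user = 2
threads = rooms * users_per_room * threads_per_user   # 30000 threads

stack_bytes = 70 * 1024            # 70 kbyte stack per thread
total = threads * stack_bytes      # bytes consumed by stacks alone

address_space = 2 * 1024**3        # ~2 Gbytes usable on a 32-bit machine

print(threads)                     # 30000
print(total / address_space)       # ~1.0: stacks alone fill the space
```

At 70 kbytes per stack, the stacks come to almost exactly the
whole 2-gigabyte budget, which is the point of the example.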

I've seen 750 rooms run at 4000 messages per second (*) on
a 2-processor, 0.8 GHz Pentium III with 1.5 Gbytes of memory.
This benchmark turns the generational heuristic completely
on its head, since the "root set" is now roughly 100 Mbytes
in size.

(*) Server and client on same machine.

Java implementations would be somewhat different if they
were really optimized for debuggability (*).  Just as it is
possible to note where all the actually-live pointers are,
it is also possible for a compiler to emit, for use in
exception handling, a map that describes the names and
types of all the variables live at the exception site.
The backtrace could include that information.  I
understand that Python does this, and I am told by a friend
who developed applications in Python that it makes for
excellent bug reports from the field.
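Python's standard library can in fact produce such reports:
`traceback.TracebackException` (Python 3.5+) accepts a
`capture_locals` flag that records the value of every local
variable in each frame of the backtrace.  A minimal sketch
(the function and variable names here are invented for
illustration):

```python
import traceback

def handler_demo():
    # Hypothetical failing code; the locals below are examples only.
    user_id = 42
    room = "lobby"
    try:
        raise ValueError("no such room")
    except ValueError as e:
        # capture_locals=True records a repr of each local variable
        # in every frame of the traceback.
        tb = traceback.TracebackException.from_exception(
            e, capture_locals=True)
        return "".join(tb.format())

report = handler_demo()
print(report)  # backtrace lines include "user_id = 42", "room = 'lobby'"
```

This is exactly the sort of map-from-exception-site-to-variables
described above, done by the runtime rather than the compiler.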

However, if this were done to Java, programmers would need
to get over the slipshod use of exceptions -- one significant
VM optimization (nowadays) is to make note of exceptions
that never use (print) their stack trace, so that the VM
need not even assemble it.  Adding variable values would
only make exceptions more costly.
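The cost asymmetry is easy to see in Python, the comparison
point above; this is a rough micro-benchmark sketch, not a
VM measurement:

```python
import timeit
import traceback

def raise_and_catch():
    try:
        raise ValueError("boom")
    except ValueError:
        pass  # exception never inspected; trace never rendered

def raise_and_format():
    try:
        raise ValueError("boom")
    except ValueError:
        traceback.format_exc()  # assemble and render the full backtrace

n = 20000
bare = timeit.timeit(raise_and_catch, number=n)
fmt = timeit.timeit(raise_and_format, number=n)
print(f"catch only: {bare:.3f}s, catch+format: {fmt:.3f}s")
```

Rendering the backtrace costs considerably more than raising
and catching alone, and capturing variable values on top of
that would only widen the gap.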

(*) If we really cared about reducing the number of errors,
the language would also have changed the fallthru-by-default
for switch cases to break-by-default, and methods overriding
base-class methods would require some "overrides" noise
just like the one that Modula-3 acquired in its second
iteration.  These are the two examples I can remember of 
language changes that reduce the error rate; there are
others, but I cannot remember them.

David Chase