Proposals

Lynn H. Maxson lmaxson@pacbell.net
Fri, 14 Jul 2000 09:26:50 -0700 (PDT)


Billy wrote:
"The fact that it's impossible to examine all of 
the possibilities renders such an optimiser 
impossible; and that should have been my clue that 
you were talking about something else."

Just when I think we are finally on track it would 
seem that we understand optimizers differently.  I 
understand them as "specialties", belonging to a 
group of IT professionals focused on no other 
activity.  That, at least, is the method that IBM 
has chosen in translating all language source into 
an intermediate language (closer to symbolic 
assembler but still machine-independent) and then 
subjecting it to an optimizing code generation 
process.  What is included in that intermediate 
language I think appears in what is designated as  
SYSDATA, another optional output available to the 
compiler user.

The point is that those responsible for optimizing 
code generators are in effect "machine code freaks" 
in the class of the best assembly language 
programmers you have in mind.  I think you will 
agree that they have only two methods of discovery 
of optimal code sequences, one by chance and the 
other based on the logic of their experience 
(heuristic).

If by chance, they have no means in their lifetime 
or in a week's salary of discovering what a 200MHz 
Pentium machine can discover in that same week.  In fact 
what the machine may discover is the "optimum of 
the week" to change a previous "optimum".  If you 
insist that a best assembly language programmer has 
a better discovery process, one that is neither 
heuristically based nor thus transferable to a 
machine, be my guest.

If it is not by chance, then it is by heuristic and 
thus specifiable.  Whatever is the optimum 
determined by any heuristic is a member of a 
(finite) class of optimums determined by the set of 
all "best" assembly language programmers.  If you 
want to claim that this set of choices is infinite, 
I will assert that you need an infinite number of 
best assembly language programmers to produce it. 
Anything less is finite and anything finite is 
certainly well within the capacity of my 200MHz 
machine to examine this week or next week or 
however many weeks until the human effort is 
exhausted.
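
To make the "finite and examinable" point concrete, 
here is a minimal sketch, in Python and over a toy 
three-instruction machine invented purely for 
illustration (not any real instruction set), of the 
brute-force examination a machine can perform: 
enumerate every sequence up to a fixed length and 
keep the first (shortest) one producing the desired 
result.

  from itertools import product

  # Toy instruction set: each instruction maps an accumulator
  # value to a new value.  Invented for the example only.
  INSTRUCTIONS = {
      'inc': lambda acc: acc + 1,
      'dbl': lambda acc: acc * 2,
      'neg': lambda acc: -acc,
  }

  def run(sequence, start):
      acc = start
      for name in sequence:
          acc = INSTRUCTIONS[name](acc)
      return acc

  def exhaustive_best(start, goal, max_len=6):
      """Examine every sequence up to max_len and return the
      shortest one that turns start into goal."""
      for length in range(max_len + 1):
          for seq in product(INSTRUCTIONS, repeat=length):
              if run(seq, start) == goal:
                  return seq        # first hit is the shortest
      return None

  # e.g. the shortest way to turn 3 into 16
  print(exhaustive_best(3, 16))     # ('inc', 'dbl', 'dbl')

With three instructions and sequences up to length 
six that is at most 1093 candidates; a real 
instruction set only makes the class larger, never 
infinite.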

In either case the best is the best (even if only 
temporary) and whatever is the best is specifiable 
in machine logic which must follow the rules of 
formal logic.  Now the machine (or the software 
executing in the machine) once encoded is always 
running at its best.  That is something normally 
not possible in any human being: to always be at 
their best.  So in an exhaustive showdown in which 
one is always producing its best a million times 
faster than someone who isn't, chances are (perhaps 
a million to one) that the former will win out over 
the latter.<g>

The point is that once a human has completed a 
discovery process he develops a "short circuit", a 
heuristic, that is logic (rule-based).  In his 
lifetime if he is any good at all, he will discover 
many hundreds and thousands (again not infinite) of 
such heuristics as well as heuristics (rules) which 
govern the choice of heuristics.  He doesn't go 
through all this to suffer from what he knows is 
"human fallacy" ("to err is human").  Thus he 
entrusts it to where no human fallacy occurs, to 
software that runs on a machine.

I will assert to you that the number of such 
programmers is not infinite nor is their output.  
All of them together cannot discover by chance what 
a single 200MHz machine can do within the same 
fixed interval.  If their "improved" heuristics 
get translated into software, they cannot 
consistently match its ability to produce the best 
every time.

Thus if the software is written to pursue any 
chance discovery process used by humans or to 
pursue any heuristic developed by humans, there is 
no way in hell that any human can match it in 
either accuracy (producing the best each time) or volume.  
Moreover there is no way anyone can develop a 
"secret" heuristic and keep it that way in 
practice.  Once it exists in machine code the 
secret is out: it becomes another among a host of 
heuristics.

Now can certain assembly language programmers 
produce code that executes faster than that 
produced by a specific compiler?  Yes, but only if 
that compiler's code generation process has not 
taken advantage of the previous discussion.  The 
problem lies in seeing that every compiler gets the 
best code generation possible.  Giving some 
thought to the matter you will (I think) come to 
the conclusion that it is not a technical problem 
(for certainly such code generation exists 
somewhere) but a political and economic one due to 
the competition among compiler vendors.

Now let's get to the matter of code optimization.  
The first level of optimization lies in "expression 
reduction", along the same lines that we have 
learned in our mathematical education.  The second 
level of optimization lies in the execution order 
of the reduced expression from the first level.  To 
make this clearer, all such optimizations can be 
expressed as a directed graph whose inner nodes are 
operators and whose end nodes are operands.  To 
make it even more interesting, and to allow for the 
aggregate expressions of APL, PL/I, and LISP, we 
will allow the operands to be either aggregates 
(arrays, structures, lists) or elements.
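
As a rough sketch of that directed-graph form (the 
class names and the Python representation are mine, 
invented just for illustration, not any particular 
compiler's internals):

  # A directed expression graph: inner nodes are operators,
  # end nodes are operands.  Operands may be scalar elements
  # or aggregates (a list stands in for arrays/structures).
  class Operand:
      def __init__(self, value):
          self.value = value          # element or aggregate

  class Operator:
      def __init__(self, op, left, right):
          self.op = op                # e.g. '+', '*'
          self.left = left            # Operand or Operator
          self.right = right

  # (a + b) * c, with b an aggregate in the APL/PL/I style
  expr = Operator('*',
                  Operator('+', Operand(2), Operand([1, 2, 3])),
                  Operand(10))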

Now to cover everything we could add the 
additional control structures of iteration (loops) 
and decision to our sequence-only directed graph.  
All that does is increase the number of possible 
paths.  Whatever that number is, it is always 
finite.  It does not make a difference how many 
times a path is repeated; each repeated path 
(sequence) is optimized.
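
A small sketch of why the path count stays finite, 
using an invented control-flow graph with one 
decision and one loop back edge; treating a repeated 
traversal of the loop as the same path, the 
enumeration below terminates with exactly two paths:

  # Invented control-flow graph: entry -> decision, which either
  # enters a loop body (returning to the decision) or exits.
  GRAPH = {
      'entry':     ['decide'],
      'decide':    ['loop_body', 'exit'],
      'loop_body': ['decide'],          # back edge of the loop
      'exit':      [],
  }

  def finite_paths(node, target, path=None, used_edges=frozenset()):
      """Enumerate paths that never reuse an edge; treating a
      repeated loop traversal as the same path keeps the count
      finite no matter how many times the loop runs."""
      path = (path or ()) + (node,)
      if node == target:
          yield path
          return
      for nxt in GRAPH[node]:
          edge = (node, nxt)
          if edge not in used_edges:
              yield from finite_paths(nxt, target, path,
                                      used_edges | {edge})

  for p in finite_paths('entry', 'exit'):
      print(' -> '.join(p))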

This means we don't give a damn about Rice's 
Theorem as we are not trying to determine when a 
program will terminate.  We are only trying to 
determine that it will terminate under an "expected 
mix" of input in less time than it would 
"unoptimized".  If the mix differs, i.e. outside 
our expectations, then we will react based on the 
number of such instances which in turn changes our 
expectations.  In which case we repeat the 
optimization process.

Now you may believe that you have some human being 
that can beat a machine at this, including making 
the changes as the expected input mix changes, but 
I doubt it.<g>

Just to make it clear, you will also note that all 
iterative, decision, and operator (sequence) inner 
nodes involve "expression evaluation".  Quite 
frankly, Billy, there isn't any other kind.  It's 
not that I had an arbitrary choice in the 
matter.<g>

What does get interesting is that regardless of the 
number of sub-expressions appearing within an 
expression, eventually we get to the end nodes, 
each of which is an 
(<operator><operand><operand>) triad, a set of 
three members whose order (prefix, infix, or 
postfix operator) is unimportant.  On evaluating 
it, it becomes in turn a single member in yet 
another triad, and so on until it results in a 
single triad which, when evaluated, produces our 
result.
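
A minimal sketch of that bottom-up evaluation, 
again in Python and with a triad representation I 
made up for the example:

  import operator

  OPS = {'+': operator.add, '-': operator.sub,
         '*': operator.mul, '/': operator.truediv}

  def evaluate(triad):
      """A triad is (operator, operand, operand); an operand is
      either a number or another triad.  Each inner triad is
      evaluated first and becomes a single member of the triad
      containing it, until one value remains."""
      if not isinstance(triad, tuple):
          return triad                    # an end-node operand
      op, left, right = triad
      return OPS[op](evaluate(left), evaluate(right))

  # (a + b) * (c - d) with a=2, b=3, c=10, d=4
  print(evaluate(('*', ('+', 2, 3), ('-', 10, 4))))   # 30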

Regardless of the complications (the number of 
paths) introduced by our control structures any 
program has only a finite number of paths extending 
from its root node to any end node.  Though 
programmers have this tendency to copy a Pacific 
island native's counting system of one, two, three, 
infinity, infinity in fact does not occur.  If it 
did, then compilers could not provide us with an 
analysis of the internal structure of a program 
that we can use within a performance and tuning 
exercise to further ensure optimality against an 
expected input mix.

Now programs have finite, not infinite, paths.  
Paths have finite, not infinite, nodes.  
Expressions (triads) have finite, not infinite, 
possibilities in terms of machine code set by the 
application of rules.  We can start at the lowest 
triad in any path, optimize it, then consider it as 
a member in the triad containing it, again 
performing the optimization for this path segment.  
We can continue this for all sub-paths in the 
program, reoptimizing all that we have done from 
this point to all end nodes within this sub-path 
until we reach the root node where we again 
reoptimize on the basis of "new" information.
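
Here is a minimal sketch of that bottom-up 
re-optimization, reusing the same invented triad 
form; the rules it carries (constant folding and a 
couple of algebraic identities) merely stand in for 
whatever heuristics a real optimizer would encode:

  import operator

  OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul}

  def optimize(node):
      """Bottom-up: optimize the lowest triads first, then treat
      each result as a member of the containing triad and
      re-optimize at that level, all the way up to the root."""
      if not isinstance(node, tuple):
          return node
      op, left, right = node
      left, right = optimize(left), optimize(right)   # sub-paths first
      # Rule 1: fold triads whose operands are both constants.
      if isinstance(left, (int, float)) and isinstance(right, (int, float)):
          return OPS[op](left, right)
      # Rule 2: algebraic identities discovered "heuristically".
      if op == '*' and right == 1:
          return left
      if op == '+' and right == 0:
          return left
      return (op, left, right)

  # ((x * 1) + (2 * 3)) collapses to ('+', 'x', 6)
  print(optimize(('+', ('*', 'x', 1), ('*', 2, 3))))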

I go through this tedious description to point out 
that, regardless of the finite additional time this 
takes software to do, I challenge you to find a 
single assembly language programmer who would ever 
attempt this practice on a fixed-term contract.<g>  
While there may be an extremely large number of 
nodes to evaluate, that large number in no way 
approaches infinity.  Moreover the larger the 
number of nodes (thus the larger the program), the 
less likely an assembly language programmer can 
apply the same thoroughness (and keep his job<g>).

Just to make this interesting, eliminate the 
current compilation scope limit of a single 
external procedure so that you can apply the same 
optimizing process to the set of inter-related 
(data-connected) programs comprising an application 
system (or an operating system).  Now the software 
tool is unconcerned about size and will perform 
with the same level of accuracy on a thousand (or a 
million) modules as it does on one.  Nowhere 
have you ever seen a single programmer attempt such 
an achievement, including full optimization.  Given 
the state of operating systems, whether the 
superlative one of MVS or the obscene<g> one of 
Windows 2000, either lies beyond the possibility of 
any single assembly language (or C) programmer.  
Note that we are still not dealing with an infinity 
of choices.

You can do things your way, creating an infinity of 
possibilities and thus guaranteeing failure.  I am 
not so inclined and thus prefer doing it a 
different way.  That is the way I intended my 
descriptions to be interpreted, not the 
interpretation that you have given them.  Yours is 
certainly valid and well thought out, but it isn't 
mine.<g>

You can throw in Rice's Theorem or anybody else's 
for that matter.  In the real world we program (and 
thus optimize) based upon a set of assumptions, 
including the input mix.  During its software life 
cycle we frequently have to re-optimize based on 
the dynamics of its environment, including changes in 
the input mix.  Given the rate of change and the 
volume of global changes necessary to keep pace 
with them (which means we had best have a capacity 
much larger than the "average" pace requires) do we 
leave it up to people to institute the myriad of 
changes throughout or do we leave it up to a tool 
undaunted by the volume?

The tool I propose, the Developer's Assistant, and 
the universal specification language, SL/I, can 
accept as a single unit of work an entire 
application system (or multiples of such) or an 
entire operating system.  Its unit of work is 
dependent upon the scope of its input 
specifications, not an arbitrary limit set by the 
author(s) of a compiler.  The input units, 
specification statements or specification 
assemblies (predicates) can appear in any order, 
i.e. unordered.  Thus you have reusability to the 
statement level, far greater than in any 
implementation of any other programming language.

Thus in my system the writer (the developer) and 
his assistant (the Developer's Assistant) are all 
that is necessary to process a set of 
specifications regardless of their scope.  That 
does not occur in any other system (presently).  
All the rules of optimization exist as 
specifications.  Thus changes in optimization 
occur exactly like changes to anything else.  It's 
all done through specifications.

Now if others are not included in this discussion, 
more's the pity.  I don't set the rules for the 
Tunes distribution list.  In other responses I have 
deliberately kept them private at the responder's 
request.  

Where a major software vendor like Microsoft cannot 
produce a new version of its operating system in 
less than a year and is faced with a 
methodology that allows a new version daily (if 
desired), it changes the nature of the game, making 
it absolutely unnecessary for any external agent to 
intervene to "protect" the marketplace from 
monopolistic practices.  

In the system I propose such practices are 
impossible.  In the system I propose no need for a 
standards effort exists.  The only requirement is 
that it conform to the syntax.  That also says that 
it is impossible to introduce an incompatibility.  
In the system I propose no need for an FSF or a 
"free" or "open" software license exists.  No 
protections nor restrictions of any kind are 
necessary.  Moreover full (and unrestricted) 
ownership of all source--the tool, the language, 
and the user applications--transfers to the user.  
No software use licenses apply.  The user owns the 
source and the means of converting it to an 
executable, not the vendor.

I am not trying to wear you down as much as I am 
patiently trying to explain why infinity does not 
fit.  I don't have your hangups on this.  
Optimization, particularly of the iterative kind, 
deals with system dynamics quite well.  Programs do 
not have infinite paths no matter the number of 
loops (iterations) they contain.  Certainly with 
the same set of 
heuristics I would trust the machine far more in 
the large than I would the world's best assembly 
language programmer.<g>