Mon, 10 Jul 2000 21:03:15 -0700

From: Lynn H. Maxson []

>Response to Jason Marshall.

>"1) Exhaustive searches (searching all possible 
>outcomes) lead to optimal answers, occasionally in 
>suboptimal time."

>True if we change the clause "searching all 
>possible outcomes" to "as if testing all possible 
>variations of the input".

Unconditionally true, actually.

>Remember this is not my 
>idea, but the one used in logic programming of 
>which Prolog is an example.  This has been working 
>20+ years with an approach that began with 
>rule-based AI systems and has remained the same 

You've said this before, and we've disagreed before.  I think you need to
clarify the statement you're attempting to make.

To help you, let me rephrase this to what it sounds like to me.  I'll put
square braces around my paraphrase.

[Prolog and SQL generate code and/or answers to queries by generating all
possible valid code/answers, and then selecting the most valid

This is clearly false: SQL doesn't generate ANY data, but rather searches
through already-generated data.  Prolog does generate code, but it generates
only one function.  Now, Prolog functions _do_ generate multiple possible
solutions to any given problem, so if that's what you're trying to say,
you're on the right path.

HOWEVER, if Prolog were ever to attempt the solution to a problem of the
magnitude you're dealing with, it would take forever (assuming infinite
resources).  There are just too many possible solutions!

For example, consider the program stated in English: "add 3 and 4, and
display the result".  We'll assume that we only know about console output;
in other words, our input is a blank screen with a cursor in the top left
corner, and our output is a screen with a "7" followed by a cursor.  (If we
don't make that assumption we get more complicated.)

Okay, we start with 3.  How do we get a 3?  Shall we use a literal?  What
type of literal (char, signed, int, hugeint, bitpacked)?  Will the loader
help us?  If so, should we use the loader's help?  Wait, does the loader
leave a "3" bitpattern anywhere in the machine registers?  How about "4"?
Repeat all of these questions for it -- we can't assume that the correct
solution will do the same for both "3" and "4".  Remember to generate code
for every possible situation -- you don't know in advance which one's best.
Don't forget to try them in every register (up to identity; we're being
exhaustive, not stupid ;-).  Shall I mention that to be complete, you DO
have to consider using a -1 literal and building the number you're looking
for using floating point FFTs.

Okay, now, how do we add?  There are several add instructions; and what
about adc (add with carry)?  xor will work here, as will or.  And since the
inputs are known and addition is a pure function, we could just generate the
answer -- so consider repeating the above paragraph for a "7" (but don't
jump to conclusions yet; it might not be the best solution).

Next we add the constraint that the result has to be displayed.  How?  Well,
there's an OS routine which prints the amount of time in milliseconds it
takes to exit -- we could call that, wait 7 milliseconds, then exit.  Or we
can do the "obvious", and print the result of the previous computation.

Wait.  What about NOPs?  We've generated all of these programs; let's insert
NOPs into them.  By your rules we're supposed to generate all possible
programs; however, by my rules that's ridiculous; you MUST use a heuristic
to insert NOPs.  So we'll compromise -- we'll generate all of the
appropriate programs without NOPs, and then insert all possible combinations
of NOPs until the programs reach the next IP page fault boundary.  This
seems reasonable for this program, since speed is your first worry.  It
won't be reasonable for all programs, since the page fault we introduce
might be in a rarely used place, while the pipeline conflict we avoid could
be in the most common place.

Great!  We've now generated all possible programs.  (Sort of.)  Care to
count?  Don't bother, because there are an infinite number of them.  Let's
ignore that, though, and we'll only consider the ones I've implied here.
It's a lot, isn't it?

Now we get to the fun part: choosing which one to use.  Our first criterion
is speed, so we have to decide which program or programs are the fastest.
Are you familiar with Rice's Theorem?  It says that the only way to do that
is to run ALL of the programs with ALL possible inputs.  If our program had
more than one possible input, the answer could be indeterminate -- we might
discover that program A was faster for x=1, but program B was faster for
x=2.  Anyhow, we don't have that problem here, because there's only one
possible input (nothing).  So we merely have to run all of the programs
we've generated, and time them.


It turns out for a program consisting of two instructions this process is
possible iff we ignore possibly useful transformations.  What happens with a
program an assembly language programmer would take three instructions to
write?  Care to imagine?

The assembly language programmer definitely finished before the computer,
BTW.  And it gets worse -- and the computers "optimal" answers get less
satisfactory -- as the problem gets more complex.  Pretty soon, the computer
will blatantly miss "obvious" solutions because in order to arrive at the
solutions it would have to first expand the code using some "forbidden"
technique, then compact it.