Proposals
Lynn H. Maxson
lmaxson@pacbell.net
Tue, 11 Jul 2000 10:46:22 -0700 (PDT)
Billy wrote:
"You've said this before, and we've disagreed
before. I think you need to clarify the statement
you're attempting to make."
At issue is what Prolog and SQL, as specification
languages based on logic programming, do.
What they do I do not invent. Both employ a
two-stage proof mechanism, one of completeness and
the other of exhaustive true/false testing. That
two-stage proof mechanism lies at the heart of
every rule-based AI expert system regardless of the
particular language involved.
I did not claim that SQL created its own data. As
Billy says SQL uses existing data, in fact data
whose "source" is specified within the SQL
statement. That source is not infinite, however
large it may be. SQL provides as a result all rows
of the data satisfying the search criteria, an
exhaustive true/false proof process. It need not
actually "touch" each input row, but "behaves" as
if it had.
So we can hopefully set SQL aside, allowing it to
do what it does regardless of any claim or
disclaimer made.<g> I say it is an exhaustive
true/false
search process because it leaves no "true" instance
out. It can only do that if it "logically"
considers all instances whatever their number.
Now that we have established that SQL does not
generate data (and I made no claim that it did),
but only performs an exhaustive true/false test on
the
data it is given, we can similarly lay all issues
relative to Prolog to rest. Here again Billy and I
are in agreement and thus both on the right
path.<g> Prolog will produce all true instances
from its input domain.
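
To make this concrete, here is a minimal Prolog
sketch (the facts and names are mine, purely for
illustration) of the behavior both languages
share. Like a SQL SELECT, the query leaves no true
instance out of its finite input domain:

    % A small finite input domain of facts.
    part(bolt,   steel).
    part(screw,  brass).
    part(washer, steel).

    % Exhaustive true/false test: every fact
    % satisfying the search criteria is produced.
    % ?- part(Name, steel).
    %    Name = bolt ;
    %    Name = washer.

    % findall/3 collects all true instances at
    % once, much as a SELECT returns all
    % qualifying rows.
    steel_parts(Names) :-
        findall(N, part(N, steel), Names).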
At this point I would hope it clear to everyone
that we are in basic agreement on what occurs. Now
where we disagree lies in whether an input domain
is finite or infinite. I see finite. Billy sees
infinite. Clearly we must resolve the differences
in what we see in the same thing.<g> Here we will
use his example of "add 3 and 4, and display the
result".
I might want to restate this as "display (3 + 4);"
or in PL/I as "put list (3 + 4);". In this
instance let's use PL/I, which has been my
programming language of choice for the last 35
years.<g> As PL/I has rules, let us assume them in
this effort.
In this instance both 3 and 4 are literal values.
In fact both are fixed decimal (1,0), i.e. decimal
integers. Their result is fixed decimal (2,0)
under the rules of PL/I. The value is "displayed"
as 7 instead of 07 because PL/I employs "right
adjustment with left zero suppression".
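
Since the result-attribute rule is itself just
arithmetic, it can be stated as a specification.
Here is a minimal Prolog sketch of the addition
rule (my own encoding, and simplified: the full
PL/I rule also caps the precision at the
implementation maximum):

    % add_result(+P1,+Q1, +P2,+Q2, -P,-Q)
    % Result attributes of FIXED DECIMAL addition:
    %   q = max(q1,q2), p = 1 + max(p1-q1,p2-q2) + q
    add_result(P1,Q1, P2,Q2, P,Q) :-
        Q is max(Q1,Q2),
        P is 1 + max(P1-Q1, P2-Q2) + Q.

    % ?- add_result(1,0, 1,0, P,Q).
    %    P = 2, Q = 0.      % i.e. fixed decimal (2,0)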
None of the other "confusion" in Billy's
description exists within PL/I's rules. In fact
PL/I will allow string literals in an arithmetic
expression as well as a mixed-type expression, e.g.
"put list ('3' + 4);". In this example the '3' is
converted from a 'char(1)' implied declaration to a
fixed decimal (1) temporary variable so as to match
the types of the two operands. You can find all
these rules regarding arithmetic operators and
operands in the PL/I Language Reference Manual.
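
The conversion rule yields to the same treatment.
Here is a toy Prolog evaluator (mine, not PL/I's
semantics in full) that, like PL/I, converts a
character operand to a decimal temporary before
adding:

    % eval(+Expr, -Value)
    eval(A + B, V) :-
        to_decimal(A, Da),
        to_decimal(B, Db),
        V is Da + Db.

    % A number is already decimal; a character
    % operand such as '3' is converted to its
    % decimal value first.
    to_decimal(X, X) :- number(X), !.
    to_decimal(X, V) :- atom(X), atom_number(X, V).

    % ?- eval('3' + 4, V).
    %    V = 7.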
Now PL/I somehow did not make the Tunes language
list though it still enjoys far more commercial
success than most of the entries there. PL/I
employs "sensible" strong typing in that it allows
mixed types within an arithmetic expression,
including bit string and character string operands.
So there is none of the "awkwardness" of pure
strong typing languages, although PL/I also offers
the means for the writer (programmer) to explicitly
control the conversion from one data type into
another.
The point is that like all programming languages
PL/I is rule-based. That it is probably more
flexible in terms of mixed data types and operands
(boolean, string, arithmetic) than any other
language readers of this have encountered speaks to
the resolve of the data processing community in the
early 60's who basically dictated the terms of the
language. Had the DOD (Department of Defense)
dictated it, rather than COBOL, in the same 60's,
we would be enjoying a far better world today.<g>
I regret to inform Billy that at least this part of
his description has been laid to rest as we have
well-defined rules for expression evaluation. Now
to get to the remaining parts.
I would hope that we would agree that the
"remaining parts" deal with automated code
generation, first with generating a list of all
possible sequences and second with selecting the
best from this list. This is where I see finite
and Billy sees infinite. I hope that we can settle
it here.
First off we do not generate anything except by
following rules. We have already employed rules
regarding expression evaluation. What we didn't
discuss are the rules regarding the ordering of the
operators and operands of the evaluation. Let's
assume RPN or postfix operator designation. Thus
the ordering of our example expression becomes (3 4
+).
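
Producing that ordering is itself a small rule
set. A minimal Prolog sketch (names mine) that
flattens an expression into its postfix token
list:

    % postfix(+Expr, -Tokens)
    postfix(N, [N]) :- number(N).
    postfix(A + B, Tokens) :-
        postfix(A, Ta),
        postfix(B, Tb),
        append(Ta, Tb, T0),
        append(T0, [+], Tokens).

    % ?- postfix(3 + 4, T).
    %    T = [3, 4, +].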
Now relative to any computer architecture we have
different choices when it comes to translating this
into machine code. Let's understand that we have
existing rules for generating actual code from
symbolic code. In fact for any given machine
architecture we have sets of different rules for
the same expression conversion. Regardless of
claims to the contrary they are quite finite sets.
The issue is, for any instance, selecting an
optimum choice: one equal to or better than any
other.
"By your rules we're supposed to generate all
possible programs; however, by my rules that's
ridiculous;"
In a sense we are in agreement here. For one, by my
rules we are not supposed to generate all possible
programs. The issue is all possible instruction
sequences satisfying "3 4 +" in a target machine.
Given the attributes of both 3 and 4 (already
determined by our earlier rules), this sets limits
on what + choice applies. In any machine
architecture given the attributes of the operands
the operator is a relatively small finite choice,
for the most part one.<g>
You see a program is larger than an expression. An
expression is larger than a sub-expression. A
sub-expression is larger than its components. It
is for these components, after multiple levels of
optimization have occurred, that we desire "all
possible instruction sequences", from which we
select an optimal one.
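
As a sketch of this generate-and-select step, here
is a toy code generator for (3 4 +) on an
imaginary machine. The instructions, costs, and
rules are all invented for illustration, but they
show why the candidate set is finite: findall/3
terminates because each rule set is finite.

    % Two ways to materialize a constant:
    % load-immediate (cost 1) or a load from a
    % literal pool (cost 2).
    code(N, [li(N)],   1) :- number(N).
    code(N, [load(N)], 2) :- number(N).

    % Addition appends the code for its operands.
    code(A + B, Seq, Cost) :-
        code(A, Sa, Ca),
        code(B, Sb, Cb),
        append(Sa, Sb, S0),
        append(S0, [add], Seq),
        Cost is Ca + Cb + 1.

    % Enumerate every candidate sequence, then
    % select an optimum: one equal to or better
    % than any other.
    best(Expr, Seq, Cost) :-
        findall(C-S, code(Expr, S, C), Cands),
        keysort(Cands, [Cost-Seq|_]).

    % ?- best(3 + 4, Seq, Cost).
    %    Seq = [li(3), li(4), add], Cost = 3.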
Now Billy introduces factors other than our
optimal instruction sequences that impact
performance in different machine architectures,
e.g. page faults and pipelines. That means that
after we have done the one, we must consider
revising the result according to another set of
rules that accounts for the other. We have other
evidence in another response that it is best to
allow the software to implement such rules rather
than have people (assembly language programmers)
try to do it on their own. That is already an
admission that the software provides a better
solution.
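
In the toy framework above, that other set of
rules is simply a second cost predicate applied to
the same finite candidate list (again the numbers
are invented; a real model would charge for actual
pipeline stalls or page faults):

    % Pretend each load from memory stalls the
    % pipeline.
    pipeline_cost(Seq, Cost) :-
        findall(1, member(load(_), Seq), Stalls),
        length(Stalls, N),
        length(Seq, L),
        Cost is L + 2*N.

    % Re-rank the same finite candidates under the
    % new model, reusing code/3 from the sketch
    % above.
    best_pipelined(Expr, Seq, Cost) :-
        findall(C-S,
                ( code(Expr, S, _),
                  pipeline_cost(S, C) ),
                Cands),
        keysort(Cands, [Cost-Seq|_]).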
"Are you familiar with Rice's Theorem? It says
that the only way to do that is to run ALL of the
programs with ALL possible inputs. If our program
had more than one possible input, the answer could
be indeterminate -- we might discover that program
A was faster for x=1, but program B was faster for
x=2. Anyhow, we don't have that problem here,
because there's only one possible input (nothing).
So we merely have to run all of the programs
we've generated, and time them."
Fortunately I hadn't heard of Rice's Theorem, nor
do I have to conform to it. The goal here, if
you can think back that far, is to produce code
equal to or better than the best generated by an
assembly language programmer. Already Rice and
others have made that task absurdly difficult for
programmers operating on their own energy.<g> We
are agreed that software allows us to have more
optimal results more consistently. Such software
is rule-based and as such capable of being
embedded within a two-stage logic engine.
If we set all this other "confusion" aside, about
what was and was not said, it gets down to the
practical matter that we can use software to
generate code as efficient as (or more efficient
than) an assembly language programmer can.
Something else then must account for the
performance difference between a HLL and a LLL.
Whatever it is, it is not coding efficiency.
That's the point!!!
If we then choose to have our HLL match the
performance efficiency of a LLL, we can certainly
do so. The truth is that we have deliberately
chosen otherwise. If we imposed our choice, the
source of our inefficiency, on the assembly
language programmer, his code would run no faster
than ours.<g>
When we read then that something is "50% of an
optimized C++", understand that it is the best of
a deliberately poor choice.<g> For whatever
reason--
esoteric, exotic, practical, or otherwise--we have
chosen the lesser performance path. In short we
are not forced by anything other than our own
initiative to travel it.
That means that if we choose (and we have a
choice) not to follow it, we can enjoy a
performance "boost" not actually obtainable by a
pure assembly language programmer, certainly not
within anything near the same time and cost
framework.
If anyone here continues to believe that an
"operand operand operator" postfix expression where
the attributes of all three components are known
will generate an infinite number of possible
results, then your view of the universe differs
from mine. Even if you combine this generic form
into all possible expressions which either have or
could appear within a program, you will still not
generate an infinite number. You cannot because
the combinations of non-repeatable sequences will
remain finite. It doesn't make any difference to
software that they number in the thousands or
hundreds of thousands.
If such an infinity were going to appear anywhere,
it would appear in PL/I, as PL/I allows a greater
variety of mixed expressions (data types and
operators) within a single expression than any
other programming language. And in thirty-five
years of world-wide experience it has never
happened. The rules dictate otherwise.
So you cannot argue that something that works
doesn't. You can argue that when the number of
results (or the time interval) exceeds a certain
threshold we should abort the process and
respecify it iteratively until the overrun no
longer occurs. That's the purpose of rules and
their expression in formal logic specifications.
When they don't work as desired you change them.
That's what we have done since writing the first
program.