Real, live examples, yay! (Tril gets detailed)

Tue, 20 Oct 1998 23:23:42 -0700 (PDT)

No doubt this will raise more questions than it answers.
Then again, all good answers do.

On Thu, 15 Oct 1998, David Jeske wrote:

> You are again missing the point I'm trying to make...
> let's just assume that there is some nice development environment
> which will help the user input the high-level description of the
> program. So _somehow_ we arrive at some high-level description which
> can be in any format we choose.
> I'm asking you to provide:
> 1) the high level description format
> 2) a description of the algorithm in the high-level format
> 3) an example translation from this high-level description down to
>    machine code using your 'holistic compilation idea'
> 4) an example optimization which can be performed by this holistic compilation
>    system on the code while it's being translated, and the resultant
>    machine code...
> Go ahead and choose any target processor you want... or a simplified
> stack based syntax.

Here's an example from IRC.

<hcf> Tril: iv got a question
<Tril> ok
<hcf> herein, speed of the routine is important, the faster the better.
  prog-1, gets one key input from the user, choices ['a' thru 'z'].
<hcf> each choice leads to some separate mode of prog-1. this would be a
  select getkey case 'a'-'z' thing. after a month's use of prog-1,
<hcf> its realized that the user almost never chooses 'd', often chooses
  'g' and rarely chooses 'k' or 'r'. so the select case is re-written
<hcf> to resolve 'g' 1st, 'k' and 'r' later, and 'd' last. if prog-1 did
  this itself, would it be "reflection"?
<Tril> yes
<hcf> i thot of this awhile ago, but never had the correct word for it
<Tril> The behavior you suggest would be more useful if it were made general.
<Tril> So take your idea, separate it from the specific program; then any
  program can use it
<Tril> Also the user could then set that feature to be automatic: applied
  by default to every program.

> 1) the high level description format

This is the typesystem.  The typesystem == TUNES.  If it's in TUNES, it's
in the typesystem, and vice versa.  Given that, what is the TUNES
typesystem?

The TUNES typesystem is a reflective implementation of higher-order set
theory.  It supports (and heavily relies on) recursively-defined sets and
functions on sets.  (I believe sets are intuitive for users and scalable,
as well as being isomorphic with other mathematical models.)  By sets I
mean an abstract container, with or without redundant elements, ordered or
not.  (Technically these two properties define four different types: sets,
tuples, bags, and lists, but I am using 'set' generally to refer to any of
those).  I use 'object' to mean a specific type, 'constraint' to mean a
specific statement about a set, and 'property' to mean a type that defines
properties of other objects and allows properties to be assigned or
deleted by moving objects in and out of a set associated with the property
(The idea of properties as sets comes from the predicate calculus, where
properties can be written as p(x), which means both 'object x has property
p', and 'p is a function that returns true when x is the argument'). 
'Context' means 'the sum of all constraints that are left implicit by the
expression under consideration, but explicitly defined at the meta-level.'
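The properties-as-sets idea can be sketched in a few lines of Python (the names here are my own illustration, not TUNES syntax): a property is just a set, and p(x) is set membership.

```python
# A property is modeled as a set; an object "has" the property
# exactly when it is a member of that set.
red = set()

def p(prop, x):
    """p(x) from the predicate calculus: true iff x has property prop."""
    return x in prop

# Assigning and deleting a property is moving the object in and out
# of the set associated with the property.
red.add("apple")
assert p(red, "apple")
red.discard("apple")
assert not p(red, "apple")
```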

How do I use it?
You define new types by combining statements about sets.  

In our example, we are using the alphabet for our data type.  The alphabet
is a subset of characters (Characters are another set, defined over a
subset of integers, with additional string operations.  Integers are
recursively defined in terms of the increment operator.  The increment
operator requires a platform-specific definition.  Platform-specific
definitions are written in terms of platform-specific types, like
registers, opcodes, and so on, which are defined using sets as well.  This
is where reflection comes into play.  But I won't go into that, because
we're not talking about bootstrapping.  Suffice it to say that the
system will work this way after it's bootstrapped!).
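The layering described above can be sketched concretely (a toy, assuming Python; `inc` stands in for the platform-specific primitive, and only a finite slice of the integers is built):

```python
# Platform-specific primitive: increment.  Everything above is built on it.
def inc(n):
    return n + 1  # stands in for the machine-level definition

# Integers (a finite slice, for illustration) defined by repeated increment.
def upto(limit):
    n, out = 0, []
    while n <= limit:
        out.append(n)
        n = inc(n)
    return out

# Characters: a subset of integers with string operations (here, chr/ord).
# The alphabet: a subset of characters.
alphabet = {chr(c) for c in upto(ord('z')) if c >= ord('a')}

assert len(alphabet) == 26 and 'g' in alphabet
```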

> 2) a description of the algorithm in the high-level format

Before we can define the algorithm, we need to define the other type(s) 
that it operates on.  Namely, the keyboard.  The keyboard buffer is a
queue (which can be defined as a special case of a set, actually hopefully
everything can be defined as a special case of a set :) of events.  The
full version of the keyboard buffer records full scancodes (press and
release) and the time they were received down to the millisecond.  [note:
a possible optimization is to specialize the actual keyboard buffer driver
object for a particular program.] Queues are ordered sets with operations
to read from one end and write to the other.  Our keyboard buffer queue is
a hardware object: From the perspective of other objects (which includes
the user and program objects), events just appear in the keyboard buffer
object, because its implementation is defined in terms of the low-level
architecture (which includes the ability to receive interrupts!).  I also
don't want to discuss how the keyboard object gets implemented as a
hardware object, because that deals with reflection again and it can be
done in different ways.  Just assume that it works, and move on.  We'll
cross that bridge when we come to it in implementation.  (READ: DON'T ASK. 
If you do, post to TUNES-LLL so I can ignore it) 

The algorithm: Read an alphabetic key from the buffer and dispatch on it.

The algorithm can be described by a simple flow graph from the keyboard
buffer, through the set of alphabetic characters (lower-case, even), onto
a list of functions.

More detail:
Stages of program (in "reverse" order):
1. dispatch (given the function list by the program, requires an index
into the list.  This operation idles until the second argument is
supplied.)
2. convert from the lowercase alphabet to an index (this is required to be
an explicit operation to be compiled, even if the programmer leaves it
implicit.  This is the kind of function that TUNES can automatically
generate to fill in the gaps in program logic.)  The reason I say index
and not integer is because we really do want an index into the list of
functions, which should be an abstract index type, because we don't care
if it's implemented in terms of integers.
3. resume after a single item (this also must be explicit eventually, even
if the compiler makes it explicit itself.  It could be implied from the
context, but maybe not.  Why should the system assume that we are only
dispatching a single character?  Why not continue spawning multiple
functions in parallel as long as the user keeps hitting keys?).  This is
sort of a meta-programming directive already.  The program is making
statements about its own operation.  This 'resume' operation is a
constraint to the evaluator to cut the chain of evaluation.  In fact, this
directive could be placed either here, or between 1 and 2.  [another
possible optimization: consider moving this stage after the conversion
to an index, depending on pipelines and such]
4. filter lowercase alpha (throw out anything that's not lowercase alpha,
but send through anything that is).
5. keyboard buffer

Flow notation:
keyboard --> filter lowercase alpha --> stop after one item --> convert
alpha to index --> dispatch

Functional notation: 

dispatch(fnlist,chartoindex(one(filter(key()))))

Note: steps the programmer left implicit would be inserted by the system.  The
program may have originally said dispatch(fnlist,filter(key)), if the
context had implicit disabling of parallel evaluation (to imply the one()
function), as well as implicit conversion.  (of course saying
filter(one(key)) is a bug because you don't want to stop after one key,
you want to stop after one lowercase alpha key, so apply the filter first) 
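The flow above can be sketched as ordinary function composition (function names follow the expression in the text; the keyboard is faked as a scripted list, which is my own test harness, not the real hardware object):

```python
def key(buffer):
    # Yields raw keystrokes from a fake keyboard buffer (a list).
    while buffer:
        yield buffer.pop(0)

def filter_alpha(keys):
    # Pass through only lowercase alphabetic keys.
    return (k for k in keys if 'a' <= k <= 'z')

def one(keys):
    # Stop after a single item: cut the chain of evaluation.
    return next(keys)

def chartoindex(c):
    # Convert a lowercase letter to an abstract index 0..25.
    return ord(c) - ord('a')

def dispatch(fnlist, i):
    return fnlist[i]()

fnlist = [lambda c=c: f"mode-{c}" for c in "abcdefghijklmnopqrstuvwxyz"]
buffer = ['3', 'G', 'g']          # digits and uppercase are filtered out
result = dispatch(fnlist, chartoindex(one(filter_alpha(key(buffer)))))
assert result == "mode-g"
```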

> 3) an example translation from this high-level description down to
>    machine code using your 'holistic compilation idea'

Assume we start with the expression
dispatch(fnlist,chartoindex(one(filter(key())))), which is an abstract
syntax tree of the flow of our program.  In the development context, it is
just that: an abstract syntax tree.  It has no meaning (although the
programmer intends meaning to be eventually associated to it).  It's just
a tree; pure syntax.  In the development context, you play around with
expressions to form your programs.  This can be done in graphics or text. 
You can import symbols (possibly names, or graphical icons) into the
development context, where they become free of their meaning (to the
system, but you remember them).  So we could have obtained the 'dispatch'
'fnlist' 'chartoindex', etc symbols from other contexts (the modules in
which they exist), or they could be symbols we made up for this program,
which we will worry about finding a meaning for later.  (the development
context here means just some area where the expressions in question are
NOT evaluated automatically) 

Now, in TUNES you don't just take the expression and say 'compile', take
the output list of binary instructions, and save it to run whenever you
want to use it.
You COULD do this, but let's assume we're not, for the sake of the
example, in order to demonstrate how it usually works.

We don't really want to get a static expression consisting of machine
level instructions.  All we really want is to evaluate the expression.
We know that the system must be executing machine code somewhere in order
to accomplish the evaluation, but we don't CARE much, and frankly we'd
like to forget about it as much as possible.

We can, however, trace the evaluation of an expression all the way down to
the low level where it becomes machine instructions.  This, then, is
exactly what we want to do.

It is important to note that in TUNES, anyone can do what we are about to
do.  They will be able to trace through, and interact with, the workings
of the system at all levels. 


Now, you said we want to get this to "low level".  Well, there are TUNES
bindings for all hardware, including CPU, memory, etc.  So each of the
hardware components appears as TUNES objects.  To get something translated
into a CPU form, you migrate (think 'drag-n-drop') it to the CPU object.
The specification for the translation operation consists of the source
object (the abstract syntax tree of our program) and the destination
object (the CPU object as provided by the tunes system).  You specify a
"goal" when you initiate the translation.  The system then must find a way
to translate the object selected into the context selected.  Given a goal,
the system can either succeed, by finding an appropriate translation rule,
and applying it, or it can fail to find a rule, and prompt the user to
provide hints.  If the user has no hints, the user tells the system to
give up.  Otherwise, she can attempt to build a translation rule.

A translation rule can contain subgoals, which invoke further
(lower-level) translation rules.  The translation rule is itself an
expression which must be evaluated.  Translation rules are a
generalization of compiling and converting, and are regular programs.
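One way to picture translation rules with subgoals is a tiny goal-directed rewriter (entirely my own sketch; TUNES specifies no such API): each rule either translates an expression directly or recurses on subgoals, and a missing rule falls back to asking the user for hints.

```python
# A rule maps (source type, target type) to a translation function,
# which may itself invoke further (lower-level) rules as subgoals.
rules = {}

def rule(src, dst):
    def register(fn):
        rules[(src, dst)] = fn
        return fn
    return register

def translate(expr, src, dst):
    fn = rules.get((src, dst))
    if fn is None:
        raise LookupError(f"no rule {src}->{dst}: prompt the user for hints")
    return fn(expr)

@rule("flow", "loop")
def flow_to_loop(expr):
    # Subgoal: translate each stage down one level.
    return [translate(s, "stage", "op") for s in expr]

@rule("stage", "op")
def stage_to_op(stage):
    return f"op:{stage}"

assert translate(["filter", "hash"], "flow", "loop") == ["op:filter", "op:hash"]
```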

Now, really, translation from this expression to the "CPU object" is
implicit.  That's what evaluation is.  So it seems we have gotten no
farther: we have a translation rule from a list of functions to one
function.  But there's a path to follow to translate.

There are many possible evaluation rules for abstract syntax trees. 
Lambda-calculus has two main rules.  Lazy evaluation: dispatch() is
evaluated first, passing a function as its second argument
"chartoindex(one(filter(key())))".  Strict evaluation:  key() is evaluated
first, its result substituted and passed to one(), and so on. 
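The two orders can be demonstrated directly (Python itself is strict, so laziness is simulated with a thunk; the trace list is my own instrumentation, not part of the example program):

```python
trace = []

def key():
    trace.append("key")
    return "g"

def one(x):
    trace.append("one")
    return x

# Strict: key() is evaluated first, its result substituted and passed on.
trace.clear()
one(key())
assert trace == ["key", "one"]

# Lazy: one() receives an unevaluated thunk and forces it only when needed.
def one_lazy(thunk):
    trace.append("one")
    return thunk()

trace.clear()
one_lazy(lambda: key())
assert trace == ["one", "key"]
```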

But we don't really want strict OR lazy evaluation.  We want holistic
evaluation.  [I'm not sure what I mean here, maybe parallel evaluation of
each part of the expression simultaneously, as much as possible.]

The routine needs to be fast (part of the specification).  Its goal is to
dispatch a function.  Let's assume the dispatch is done by passing the
function to the evaluator.  Since we don't know how this will be done,
let's disregard this part of the program and simply trace its evaluation
until it chooses the index into the dispatch table.  [At the low level, we
don't know how evaluation will be done.  Actually, it will probably be
done by inlining the evaluator into the generated code at the needed
section, in which case we shouldn't proceed to translate the evaluator to
low-level, since it already will have been.]

[The following is how the logic of the "compiler" would work:]
So, between the keyboard and the index, we need to optimize for speed. 
The speediest thing would be to use a specialized version of the
keyboard queue. In the general case of the keyboard queue, as mentioned
above, it outputs millisecond-precision time stamps for keys along with
two scancodes, one for press and one for release.  This behavior is not
necessary in this case.  So, we specialize the keyboard queue by altering
its specifications temporarily.  In fact, we can specify precisely how
long we need the altered keyboard queue:  Until we get one lowercase
alphabetic character.

After the alphabetic character is received, it needs to be converted
into an index into a specific list of functions.  [Let's assume one
function for each key, for simplicity.]  This is the exact situation where
hashes are supposed to be used, so we'll use one.

Now we can look up the semantics for the low-level keyboard.  The new
logic looks like this:

1. Tight polling loop of keyboard, discarding everything except 'a'-'z'.
2. Hash on the character.
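The specialized logic reduces to a few lines (a sketch, assuming Python: the fake `poll()` replaces the hardware keyboard port, and a dict stands in for the generated hash):

```python
def make_poll(script):
    it = iter(script)
    def poll():
        return next(it)   # stands in for reading the hardware port
    return poll

handlers = {c: (lambda c=c: f"mode-{c}") for c in "abcdefghijklmnopqrstuvwxyz"}

def run(poll):
    # 1. Tight polling loop, discarding everything except 'a'-'z'.
    while True:
        c = poll()
        if 'a' <= c <= 'z':
            break
    # 2. Hash on the character.
    return handlers[c]()

assert run(make_poll(['#', 'Q', 'k'])) == "mode-k"
```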

Now we can go into the final phase, machine code:

The compiler finds a description of how to build machine code to do tight
polling loops (general case).  It is specialized with the exit condition
of the range given, 'a'-'z'.  The compiler finds rules to build machine
code to do hashing.  The machine code is generated, then sent thru the
instruction-level optimizer (for processor-dependent pipeline
optimizations etc).  Then we're done.

Installing this code in place temporarily for it to be used, and removing
it again, restoring the keyboard to the system, is beyond the scope of this
example.
> 4) an example optimization which can be performed by this holistic compilation
>    system on the code while it's being translated, and the resultant
>    machine code...

The optimization given is to record which keys are used most often and to
put them first in the hash table.

This is part of the specifications for hash tables and will be applied in
the general case.  I.e. part of the hashing code records the values hashed
on.  The generic hash specifications state that the system will
periodically, during idle time, revisit every hash code implementation and
reorder the hashes according to the frequency of each value.
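A sketch of that generic optimization (all names are my own; a linear probe stands in for a real hash, whose reorder would touch bucket layout instead): the dispatch code counts the values it hashes on, and an idle-time pass reorders the probes by frequency.

```python
from collections import Counter

counts = Counter()
probe_order = list("abcdefghijklmnopqrstuvwxyz")  # initial order

def dispatch_counted(c):
    counts[c] += 1            # the hashing code records the values hashed on
    for candidate in probe_order:
        if candidate == c:    # linear probe stands in for the real lookup
            return f"mode-{c}"

def idle_time_reorder():
    # Periodically reorder so the most frequent keys are compared first.
    probe_order.sort(key=lambda c: -counts[c])

for c in "ggggkrd" + "g" * 5:
    dispatch_counted(c)
idle_time_reorder()
assert probe_order[0] == 'g'   # 'g' chosen most often, so resolved first
```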

You can disable all that if the optimization doesn't pay off for you. 
Then you would be specializing the hash functions to remove the
usage-recording code.
Anywhere where I have said specializing I mean any derivation of an
object, by adding, removing, altering, or any expressible change.  This is
superior to inheritance which can only add methods.  All you need to
specialize is the source for the object and some operation to perform on
that source.  Performing operations on source == metaprogramming.  Anyone
or anything in TUNES can do it.  You should too :) 
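Specialization in this broad sense can be sketched as an operation applied to an object's source (my own toy model: "source" is a dict of specifications, and the operation may add, remove, or alter anything, not merely add methods as inheritance does). The keyboard-queue specialization from earlier is the worked case:

```python
def specialize(source, operation):
    # Any derivation: the operation may add, remove, or alter entries.
    derived = dict(source)
    operation(derived)
    return derived

# The general keyboard queue: timestamps plus press+release scancodes.
queue_src = {"timestamps": True, "scancodes": "press+release"}

# Specialize it for our program: strip what the dispatch doesn't need.
fast_queue = specialize(queue_src, lambda s: s.update(
    timestamps=False, scancodes="press"))

assert fast_queue == {"timestamps": False, "scancodes": "press"}
assert queue_src["timestamps"] is True   # original source untouched
```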

David Manifold <>