Tunes compiler

Francois-Rene Rideau rideau@ens.fr
Mon, 18 Mar 1996 04:54:09 +0100 (MET)


   Sorry about my long silence:
I've been sick for a week,
and my modem hates that britannian phoneline anyway.
Down with the France Telecom monopoly !
Now I have to cope up with a 400-message long mailbox.
   And my bureaucratic problems are becoming *serious*.
Ouch. My own fault there )))-:


   So we were a-talking about the features of the HLL- that
we'd implement as a Scheme dialect...


[about possibly symmetric rewrite rules as a way to program,
that encompasses traditional OO techniques,
for expressive enough patterns]

>>    First-class patterns with pattern-patterns and first-class contexts
>> allow neat reflective programming
>> (you can << unify (pattern1) (pattern2) in context (c1)>>)
>> and thus dynamically-defined partial evaluation.
>
> What does the context define here ?
   It defines variable bindings as used/yielded by pattern matching.

> Are the patterns allowed to reference Scheme variables ?
> Does the context you mention here refer only to pattern-variables ?
   Patterns sure should allow to reference some variables,
that can then be referenced in usual computations.
This is needed to define rules.
   So indeed, patterns should be allowed to reference any usual variable,
and referring "only" to pattern variables means referring to any variable.
Hence, contexts cannot be more abstract than patterns,
and first-class patterns mean first-class contexts.


> If patterns are objects which, like all others, result
> from the application of eval with an environnement (context) then all
> variable references should have now been done and replaced by their value.
   You mean that patterns be not first-class objects.
That would be a possibility.
But then, reflectivity will have to be done syntactically,
instead of semantically.
Thus, inverting rules, for instance, could not be done by usual means.
   As usual, restricted use of objects leads to more efficient implementation;
but I would prefer this rather be a feature of the compiler,
that can easily detect such restricted use,
than a limitation of the language,
that couldn't express in another way what first-class patterns allow.
If patterns are to express typing, as we intend,
then first-class patterns is a pre-requisite for first-class classes,
which we'd like to have.


> How do we formalise patterns and rules ?
   A pattern could be considered as an abstract object obtained from
a simple pattern grammar that would be a super-set of the grammar
already used in Scheme define statements:
       (define var value)
       (define (keyword var1 var2) (stuff-with-var1-and-var2))

So a simple pattern could be either
       var
or     (keyword subpattern1 subpattern2)
because patterns are first-class,
keywords could be replaced by any expression that yields a pattern;
subpatterns would be patterns to match components of a complex pattern
(for instance (cons a b) would match any cons cell where a matches
Pattern matching would be done head first, arguments in order.
   A variable symbol would match anything and bind the variable of given
symbol in the current context (note: a new context is generally created
whenever pattern matching happens, but some pattern keywords may change
the context in which matching takes place). Multiple appearance of a symbol
mean that all matched values must match with each other.

   Patterns would exist for all existing Scheme types.
   Exact patterns match exact object: (quote a).
Quasi-exact patterns may be used, too: (quasi-quote (a (unquote b))).
   Atomic patterns could correspond to generative type constructors,
as in
(define-generative-pattern (my-obj (pattern Type) (Type Value)))
that would create a new pattern my-obj, exclusive from any previous pattern,
that could be constructed from a pattern and a value that would match
the pattern.
   Special patterns could be described in terms of a lambda functions
to call when either constructing or matching an object of the given pattern.
Some special exception will be called when pattern matching fails,
(or a bad argument is given), which corresponds to a call to some
error-handling continuation.

   A rule is the combination of a deconstructing pattern
that binds variables by matching input objects,
and a constructing pattern that creates an output object
by constructing a pattern from variables bound by the previous pattern.
For instance, the rule
   (list a b '()) => (+ a b)
would match a list of exactly two elements a and b, and return the sum
(assuming list returns the rest of the elements in last subpattern).
   Inverting a rule consists in trying to swap the deconstructing and
constructing patterns, which in the general case,
is very difficult to do efficiently, when even possible,
but is straightforward to do when only injective patterns are used
with the same set of bound variables.


> Can we mix Scheme in them ? How ?
   I am not sure what you mean by "mixing Scheme and patterns".
If you mean one of the below, sure it should be possible, as described above:
using patterns from the Scheme core (explicit/implicit calls to "unify"),
binding Scheme variables from patterns (unify in some environment),
matching Scheme objects with patterns (quote patterns),
calling Scheme-defined functions as matching function (special patterns).


> Take the rule assemble the add of two registers for exemple :
> Do we write
> (list 'ADD x y) (list #x37 (f x) (g y))
> or
> ('ADD x y) (#x37 (f x) (g y))
It'd rather be something like
    (x86-instruction-token 'ADD x y) (x86-instruction-bytes #x37 (f x) (g y))
Of course, the "x86-instruction-*" could come implicitly from
a macro-definition in the context, instead of being written hundreds of times.

> And what are f and g ?
They are patterns themselves,
used as subpatterns to the x86-instruction-bytes pattern.

> Can they be Scheme closures or must they be special
> functions (our bijections for exemple) so that the types or x and y may
> be infered ? 
I'm not sure what you mean.

> Which side of the rule is/are evaluted by Scheme ? When ?
When the evaluation of the above rule is tried,
first, the argument is matched with the left-hand-side;
if it was a match, then the right-hand-side is constructed;
if no mismatch from the arguments type happens, the
rule was successfully evaluated;
if the rule wasn't successfully evaluated,
then it is a mismatch, and next rule, if any, should be tried.

> Before the pattern variables have been bound ? After ? Both ?
Basically, the left-hand-side binds variables, that the right-hand-side uses.
But each side may also have additional internal variables,
and may use any variable already bound.

> Does that mean we can't match a function application since it will do
> the call instead of interpreting it as a "program" object to be matched.
I'm not sure what you mean. What is "matching a function application" ?
Testing if there is a match between a function and its arguments
without evaluating the function ? Well, in the general case, it sure
is impossible to determine if a function matches an argument without
trying it (perhaps in a protected environment); but in most usual
cases, the compiler should be able to determine enough of it;
also, some construct shall be used to express that if it be reached,
then the rule was successful; perhaps we shall require that if the
left-hand-side matches, the right-hand-side shall match, too,
or have a middle-and-side that includes guard-tests, and
a right-hand-side for actual computations.

> There are a whole lot of possibilities. None of them has impressed me
> so far.
When theory does not priviledge any solution,
only practice may show us which is the most handy.



[About implementing generic functions with a hackish
"message" second optional argument]
>>> This would allow
>>> us to implement high-order and introspective functions on the ruled
>>> based functions. For exemple passing 'invert as the second arguement
>>> could return the inverted function.  Passing '(add-rule somerule)
>>> could add a rule. Then we'd hide these hacks in normal functions
>>> "invert" and "add-rule" which operate on rule based functions.
>>    I'm not sure why such hack should be used.
>
> So that bijections can be both Scheme functions which may be applied
> normally and structures which contain rules.
   Hum. I'd rather have pure functions that transform lists or sets of
rules into functions, and perhaps have standard implementations
for functions that fetch their rule list from a variable at every call
(aka dynamically bound functions).
   This would be much more orthogonal and expressive.


>> Why not have clean first-class patterns,
>> that we can use to match objects,
>> yielding bindings in first-class contexts (dictionaries) ?
>
> My hack is independent of that.
Yes (and I independently dislike it).
> I kind-of agree with what you suggest.


> Except that I'm not you what you mean by a first-class pattern (if any
> object can be a pattern then why do we need to distinguish patterns ?
   Hell. Any Scheme object can be a function,
but all Scheme objects are not functions !
There are ways to distinguish between functions and other objects;
so will patterns be distinguished from non-patterns;
yet patterns are first-class objects,
can be taken as parameter or returned by functions,
matched or built by other patterns.

> But maybe that's the problem. Maybe we would need to define a precise
> pattern language which excludes Scheme but allows _quoted_ reference
> to it (so that it isn't confused with pattern function applications
> etc...) and allows normal pattern-variables references and pattern
> function applications.
   See above for a precise pattern language, that allows to quote
Scheme objects. Nevertheless, I wouldn't like to build two separate
languages, one for pattern programming, the other for usual programming.
Patterns should integrate seemlessly in the usual programming language.



>> Then it would be fairly easy to have first-class rules,
>> first-class generic functions,
>> and write a generic (invert) function to (try to) invert generic functions,
> I'm all for that.
   The inverting function will have to manipulate bindings of the
argument function to invert, which again means first-class bindings.


>> while manipulating rules based would be a matter of cleanly building
>> new generic functions by selecting rules, which would be quite cleaner
>> and more powerful than any 'add-rule hack.
>
> The add-rule function is not a hack. It simply modifies (side-effect)
> a rule collection.
   IMHO, side effects are bad hacks when not necessary.
In this particular case, I don't think they are necessary,
and you give them limited expressivity
(rules can only be added, not reordered, rewritten, or anything).
We could achieve more things cleanlier
by having rules-to-function functions,
that we could successively call with a growing list of rules,
or with rules that have endured any number of transformations,
so as to achieve anykind of rule collection,
not just the kind which only grows chaotically at one end.


> I think that rules should be normal applicable functions also. At least
> eventually. We could have a simple operator "selective union" which
> takes two functions and returns a function whose value at x is the
> value of the first at x if x is in its domain and otherwise is the
> value of the second at x.
   Yup. The "selective union" could be expressed as
(try fun1
 with (mismatch _) => fun2)
   Where "try" is a caml-like exception-catching form,
which means an exception mechanism to mismatch,
and therefore that we need to decide the semantics of exceptions, etc.


> At first, only rules or the result of
> previous unions could be passed to the selective union operator
> because we wouldn't know the domains of other functions but in late
> versions, all HLL functions should have a domain. As such there may
> be no distinction between generic function and functions. However,
> there may be a need to preserve this distinction. That is if we want
> to be able to modify (side-effects) generic-functions by adding
> "methods" (really functions) for exemple and not want functions to
> be mutable.
   I'm not sure there should be any reserved or distinct semantics
for generic functions.
Why not make all functions (generic or not) equal,
and let people have "dynamic generic functions"
lookup a dynamically dispatched set of rules,
and *-generic functions lookup some *-defined set of rule ?


>> Of course, all this needn't be available
>> in the first-step bootstrap compiler.
>
> True of course. But it would be great if the initial step extended
> nicely to the general case.
Yup: we needn't implemented such a sophisticated system all in
first-class Scheme to use it in the initial compiler,
but that initial compiler sure should provide it all to further steps
in the metacompiling bootstrap sequence.
[Let me reintroduce the idea of an interpreter as the original element of
the bootstrap sequence].


[Macros]
> Ok. But they are really equivalent to the macro system in my meta-circular
> interpreter (that I wrote, not the logical Scheme I just sent you).
[Note to all others: it is available near Patrick's page
http://www.cs.utoronto.ca/~premont/tunes/source]

These macros look fine to me.
Adding a (representation x) special form
instead of (quote x) should allow easy hygienic macros.
They also look the same as what is used in elk,
my currently favorite scheme interpreter
(small, portable, complete, robust, extensible, well-documented;
but just an interpreter, not a super-optimizing compiler).


[Patterns that match lambda's]
>>> You want a pattern that will match
>>> any lambda's source like :
>>> (lambda (var args) (var body))
>>> But that is not a property of the pattern matching system.
>>> It is simply a particular pattern. Do I understand you ?
>>    I want the pattern matcher to be aware that in the body,
>> binding were extended with the variables in the lambda,
>> so I can achieve consistent matching of lambda's.
>> This just isn't trivial, and again, I see first-class context as
>> an easy way to achieve just this in the general case.
>
> Could you give an exemple ? I don't even see why you'd want to match
> a lambda and "be aware" of the lambda's argument names in its body.
> When does the problem arise and what is it ?

Imagine my pattern tries to see if some variable foo is used inside
the object to be matched; then it will have to be able in some
explicit or implicit way to distinguish between uses of the object "foo",
and uses of variables named "foo" but bound to anything else by the text;

(define-pattern (anything-that-cannot-possibly-use (atom foo))
   (pattern-union
      (any (atom a) where (neq a foo))
      (any (bound-variable v) where
           ((anything-that-cannot-possibly-use forbidden) (binding-value v)))
      (any (unbound-variable v) where
           ((escape somehow-I-know-that-it-cannot-possibly-use foo) v))
      (any (base-level-construction constructor args) where
           ((anything-that-cannot-possibly-use foo) constructor)
           ((map-pattern
	      (anything-that-cannot-possibly-use foo)) args))
      (any (lambda abstracted-variables body) where
              (abstract abstracted-variables in
	         (anything-that-cannot-possibly-use foo) body))))

   Well, this is just an example to have the idea;
the important point is that whether you have explicit contexts or not,
you must be able to determine whether or not a given variable is bound
in a context, and to explicitly bind new variables in the context.
   The same holds whenever you want to inspect a function to determine
if some feature is used. Of course, the more you can see of a function,
the more it restricts the way you can optimize the implementation,
and conversely. However, I think that this problem can be circumvented
by having multiple versions (source, unoptimized, optimized, etc)
of functions, and discarding those that prove out-of-date or no-more-useful.


>>>> patterns to match patterns, taking contextual effects into account, etc.
>>>> [This is why, again, it seems to me that first-class contexts are of
>>>> foremost importance: to allow to easily construct such patterns].
>>>
>>> I don't see where first-class contexts come into this.
>>    See above: if I want to consistently match and rewrite stuff inside
>> bodies of lambdas, I'd better have first-class contexts.
>
> See above. Please pinpoint the problem.
   Again, the problem is with the fact that bindings that are dynamically
created inside matched objects must be taken into account.
   If the matched text is (let ((a b)) a), the second reference to a
must be properly traced back to its value in matched text.

> And one more question, why do
> we need first-class contexts to solve it ? Or how does it solve it ?
   I need be able to determine or use bindings in a matched object.
Imagine that I'm trying to determine whether some bound variables or
keywords were used in a construct; how will I do ?
I need to maintain a list of bindings.
   Either we manipulate (match/construct) source objects,
in which case we have to manually alpha-convert,
and need maintain binding lists (and hence first-class contexts);
or we manipulate some predigested data (e.g. graphs)
that synthetically represent data and bindings,
and still we need ways to know whether a knot was used
and manipulate bindings.
   The latter case is interesting; surely it could be a good
representation for what is going on internally (graph reduction),
and most of the time, limited use of first-class contexts would
reduce to record field selection in a graph representation.
Perhaps some named-lambda-calculus can remove any need for first-class
contexts.


> Of course there will be contexts in the implementation but do we
> really have to give them to the user and be force to consider and
> implement tons of other uses ?
   Because the system reflective, the user must have access to those
contexts. Actually, such syntactical contexts or dictionaries or
environments, or whatever you call them, may be
what replace the unix "file directory" structure:
they are some all-purpose name-based storage/retrieval facility.


> The Scheme environements and
> pattern-matching environements are not necessarily the same thing and
> there are plenty of other thing the word context could mean.
   Let's say we where talking about those environments that would
be the same thing.

> By saying
> that you want to provide first-class contexts, do you consider all
> these possibilities ?
> I still think that contexts are not well
> defined.  They seem to change meaning with the context :)
   Yeah. Clarification is needed.

> In
> MIT-Scheme there are syntax-tables, environements and storage. Those
> are all contexts yet are very different. All they have in common is
> that they are functions.  So I think using the word context is a sure
> way to blur everything and trying to implement first-class contexts as
> meaningless as first-class "functions" (in the abstract mathematical
> sense). It is so abstract that there is nothing to implement. Could
> you try to use the more meaningful terms like interpreter,
> environement, extent, substitution list, storage, logical structure,
> ...  whichever is appropriate in the particular situation ?
    I do not know MIT-Scheme well enough (still couldn't grab its docs);
but let us say that up to now, I've been talking about syntactic environments,
that bind symbols to values. These values can themselves be actual
atoms (numbers, etc), locations, records, record-specifiers, symbols, etc.

> This is something which has hindered my understanding of what you
> wrote for a while. Note that the definition of contexts in the
> Glossary is also different. It leans towards the logical structure,
> which is clearly not specified simply by an environement.
   I haven't put a definition for contexts in the Glossary. Sure I should !
Surely you mean the HLL/Semantics page, in which I did not give a
definition for context. Surely something in the MOCK style
(described in a previous message) was meant there.
   Yes, there is some confusion there between different kinds of contexts.
Again, up to now, I've been mostly talking about
*syntactical environment*, that associate values to names/symbols.


> Note that I'm focusing on problems here. We agree on many crucial points.
> We should keep that in mind to put the fact that we keep arguing in
> the proper context :)
   Let that not prevent us from working: I have enough real-life problems,
not to add useless Tunesy fake problems.

--    ,        	                                ,           _ v    ~  ^  --
-- Fare -- rideau@clipper.ens.fr -- Francois-Rene Rideau -- +)ang-Vu Ban --
--                                      '                   / .          --
Join the TUNES project for a computing system based on computing freedom !
		   TUNES is a Useful, Not Expedient System
WWW page at URL: "http://www.eleves.ens.fr:8080/home/rideau/Tunes/"