From kyle@arcavia.com Thu, 06 Apr 2000 07:52:12 -0400
Date: Thu, 06 Apr 2000 07:52:12 -0400
From: Kyle Lahnakoski kyle@arcavia.com
Subject: Object Code problem


I have object code compiling to *.class files.  I am sure there are
bugs, but the test suite works.  I will be making an interface (simplest
possible IDE) to the compiler so I can generate executable code on the
fly.  I am writing to the list to keep everyone up to date with what I am
doing, and to make sure it still works.

Object Code, the semantics, is very close to its pre-compiled
representation.  (There are three states code can be in: source code,
pre-compiled code (like object files), and machine code.)  Pre-compiled
code, I hope, will be the Prism model for Object Code.

I currently use unnamed variables to indicate inline expansion of code;
this allows the Prism model to appear much like assembly for easy
compiling.  Here is an example:

Object Code:
	result.SetValue(Add_Integer(Value1=4, Value2=5));

Pre-Compiled Code:
	<noname1>=Add_Integer(Value1=4, Value2=5);
	result.SetValue(<noname1>);

Yes, my example is inefficient; a second-level optimizer can clean that
up quite easily (if I ever choose to make one).
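
A minimal sketch of this kind of inline expansion, written in Python rather
than in the compiler's own code.  The Call class, the flatten and precompile
functions, and the (keyword, value) argument layout are all hypothetical
names chosen for illustration; only the <nonameN> naming and the example
statement come from the message above.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Call:
    """A call expression (hypothetical model, not the real compiler's)."""
    name: str   # e.g. "Add_Integer" or "result.SetValue"
    args: list  # list of (keyword or None, value) pairs


def render_arg(keyword, text):
    """Render one argument, with its keyword if it has one."""
    return f"{keyword}={text}" if keyword else text


def flatten(call, out, temps):
    """Hoist nested calls into fresh <nonameN> variables.

    Statements for the nested calls are appended to `out`; the text of
    the now-flat call itself is returned.
    """
    parts = []
    for keyword, value in call.args:
        if isinstance(value, Call):
            inner = flatten(value, out, temps)
            temp = f"<noname{next(temps)}>"
            out.append(f"{temp}={inner};")
            text = temp
        else:
            text = str(value)
        parts.append(render_arg(keyword, text))
    return f"{call.name}({', '.join(parts)})"


def precompile(statement):
    """Return the flat statement list for one nested statement."""
    out = []
    temps = count(1)
    final = flatten(statement, out, temps)
    out.append(final + ";")
    return out


statement = Call("result.SetValue",
                 [(None, Call("Add_Integer", [("Value1", 4), ("Value2", 5)]))])
for line in precompile(statement):
    print(line)
```

Each nested call is emitted before the statement that uses it, which is
what makes the flat form read like assembly.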

Exception modeling is quite different.  To make exceptions semantically
clean, I want to specify blocks of code that the exception handling code
applies to:

	make Exception return ErrorException.NewInstance();
		result.SetValue(Add_Integer(Value1=4, Value2=5));
	stop Exception;		//optional, exceptions have variable-like scope

But my compiler expands this, making a branch at every instruction to
catch possible exceptions.  So the above expands to:

	<noname1>=Add_Integer(Value1=4, Value2=5);
	if (<noname1> instanceof Exception) <goto exception handler>;
	result.SetValue(<noname1>);
	if (<noname1> instanceof Exception) <goto exception handler>;

Again, the optimization is at a lower level.  
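
A minimal sketch of that expansion step, in Python for illustration only.
It assumes the block between "make Exception" and "stop Exception" has
already been flattened into statements, and that each statement comes
paired with the name whose value should be tested afterwards.  The function
name add_exception_checks and the (text, tested name) pair layout are
hypothetical.

```python
def add_exception_checks(instructions, handler="<goto exception handler>"):
    """Insert an exception test after every instruction in one block.

    `instructions` is a list of (statement_text, name_to_test) pairs.
    """
    expanded = []
    for text, tested in instructions:
        expanded.append(text)
        expanded.append(f"if ({tested} instanceof Exception) {handler};")
    return expanded


block = [
    ("<noname1>=Add_Integer(Value1=4, Value2=5);", "<noname1>"),
    ("result.SetValue(<noname1>);", "<noname1>"),
]
for line in add_exception_checks(block):
    print(line)
```

The output is deliberately naive: one branch per instruction, leaving any
merging of redundant tests to a later pass.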

You can see that the semantic model for exceptions is not the same as
the proposed Prism model for ObjectCode because of the exceptions.  This
is bad because I will have to recreate the code from the Prism Model
soon.

Anyone with ideas?  How is variable scope handled?



----------------------------------------------------------------------
Kyle Lahnakoski                                  Arcavia Software Ltd.
(416) 892-7784                                 http://www.arcavia.com



From jiml@inconnect.com Thu, 06 Apr 2000 22:40:51 -0600
Date: Thu, 06 Apr 2000 22:40:51 -0600
From: Jim Little jiml@inconnect.com
Subject: Object Code problem

Kyle Lahnakoski wrote:

> I have object code compiling to *.class files.  

Cool!  Congratulations.

> Object Code, the semantics, is very close to its pre-compiled
> representation.  (There are three states code can be in: source code,
> pre-compiled code (like object files), and machine code.)  Pre-compiled
> code, I hope, will be the Prism model for Object Code.

Although you are of course free to do anything you want, my vision of
the way Prism should be used is to have each state exist as a Prism
code.  This is how we achieve maximum reuse and flexibility.  Compilers
would operate on Prism models and would use parsers and emitters as
"bookends" at the beginning and end of the process.

I.e., compilation would proceed through the following stages:
Start: Programmer creates source code.
Stage 1: Parser (e.g., "parse-oc") parses source code into
Prism/ObjectCode model.
Stage 2: Compiler ("precomp-oc") compiles Prism/Object Code model into
Prism/Precompiled Object Code model.
Stage 3: Compiler ("comp-poc") compiles Prism/Precompiled Object Code
into Prism/JavaMachine model.
Stage 4: Emitter ("emit-jm") takes Prism/JavaMachine model(s) and
outputs .class file.

The compilation command-line/script would look something like this:

       $ parse-oc myclass.oc | precomp-oc | comp-poc | emit-jm myclass.class

I imagine the above stages of compilation are similar to what you
already have in your current compiler.  The difference is that Prism
allows you to expose the internals of your compiler to the outside world
in a consistent way.  That allows someone else to come along and use any
part of the pipeline that they would find useful.  Or they can create
new pieces, such as optimizers, and insert them into the pipeline.
Etc.  You've heard the sales pitch.  ;)
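
The pipeline idea can be sketched in a few lines of Python.  This is only an
illustration under loud assumptions: the stage names mirror the ones above,
but the dictionary used as a "model", the optimize_poc stage, and
run_pipeline are hypothetical placeholders, not Prism's actual format or
tools, and emit_jm here returns placeholder bytes rather than a real .class
file.

```python
from functools import reduce


def _retag(model, expected, produced):
    """Check the incoming model's code, then hand on a retagged copy."""
    if model["code"] != expected:
        raise ValueError(f"expected a {expected} model, got {model['code']}")
    return {"code": produced, "body": model["body"]}


def parse_oc(source_text):
    # Stage 1: source text in, Prism/ObjectCode model out.
    return {"code": "Prism/ObjectCode", "body": source_text}


def precomp_oc(model):
    # Stage 2: Object Code model to precompiled Object Code model.
    return _retag(model, "Prism/ObjectCode", "Prism/PrecompiledObjectCode")


def optimize_poc(model):
    # Optional extra stage: same model code in and out, so it can be
    # inserted or removed without touching its neighbours.
    return _retag(model, "Prism/PrecompiledObjectCode",
                  "Prism/PrecompiledObjectCode")


def comp_poc(model):
    # Stage 3: precompiled Object Code model to JavaMachine model.
    return _retag(model, "Prism/PrecompiledObjectCode", "Prism/JavaMachine")


def emit_jm(model):
    # Stage 4: JavaMachine model to output bytes (placeholder only).
    if model["code"] != "Prism/JavaMachine":
        raise ValueError(f"expected a Prism/JavaMachine model, got {model['code']}")
    return model["body"].encode("utf-8")


def run_pipeline(value, stages):
    """Feed `value` through each stage in order, like a shell pipe."""
    return reduce(lambda acc, stage: stage(acc), stages, value)


plain = run_pipeline("source", [parse_oc, precomp_oc, comp_poc, emit_jm])
tuned = run_pipeline("source",
                     [parse_oc, precomp_oc, optimize_poc, comp_poc, emit_jm])
```

Because every stage declares which model it accepts and which it produces,
a stage placed in the wrong position fails immediately instead of silently
producing garbage.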

So what I'm saying here is that I think you should create three Prism
codes.  One for each state of Object Code.

[...]
> You can see that the semantic model for exceptions is not the same as
> the proposed Prism model for ObjectCode because of the exceptions.  This
> is bad because I will have to recreate the code from the Prism Model
> soon.
> 
> Anyone with ideas?  How is variable scope handled?

I think the easiest way to handle this is to make a Prism code for the
full-blown Object Code, not just for precompiled Object Code.  That way
you don't have to write a complex reverse compiler that goes from
precompiled OC to regular OC.  It also makes recreating the source code
from the Prism model a piece of cake.  I'd be happy to help you with
this.  Your page on Object Code semantics is a very good start.  It
would probably only require a few modifications to become a formal Prism
metamodel.

The Prism metacode is actually designed specifically so you can capture
the semantics of high-level languages.  The idea is that you write just
one parser that strips out all the syntactic cruft, such as parentheses
and infix expressions, and stores just the core semantic information
that was encoded in the source file.  The premise is that once your
parser has done that, it's much easier for other programs to manipulate
the parsed program.


As for variable scope, I dealt with that problem in the C-like code, but
I wasn't particularly happy with my solution.  Maybe you can come up
with a better one.

In the C-like code, every block can have its own local variables, and
those variables are scoped for that block and every block it contains. 
Each block has a _List that lists all of the variables declared for that
block.  Blocks are also nested within blocks.  So the variable reference
subcode looks mostly like this:

_List: (Local Variable Reference)
+---+------------------------------+
| 1 | Integer: * (block offset)    |
| 1 | Integer: * (variable offset) |
+---+------------------------------+

The "block offset" is a number, zero or more, that refers to the block
in which the variable is declared.  Zero is "this block."  One is "this
block's parent."  Two is "this block's parent's parent."  Etc.  Once
you've found the right block, the "variable offset" integer refers to
the correct variable by its position in the block's local variable
_List.  One is the first variable in the list, two is the second
variable, etc.

This approach resolves the scoping issue by using nested blocks and a
referencing subcode that only allows parent blocks to be referenced.  It
simply isn't possible to reference a variable in a block that's out of
scope in the C-like code.
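
A minimal sketch of how such a reference could be resolved, in Python for
illustration.  The Block class and the resolve function are hypothetical
names; the only things taken from the description above are the zero-based
block offset (counting outward through parents) and the one-based variable
offset into the block's variable list.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Block:
    """A block with its own local variable list and an optional parent."""
    variables: list = field(default_factory=list)
    parent: Optional["Block"] = None


def resolve(block, block_offset, variable_offset):
    """Follow a (block offset, variable offset) reference from `block`."""
    target = block
    # Offset 0 is this block, 1 is its parent, 2 its parent's parent, ...
    for _ in range(block_offset):
        if target.parent is None:
            raise LookupError("block offset walks past the outermost block")
        target = target.parent
    # Variable offsets are one-based positions in the block's list.
    if not 1 <= variable_offset <= len(target.variables):
        raise LookupError("variable offset outside the block's variable list")
    return target.variables[variable_offset - 1]


outer = Block(variables=["x", "y"])
inner = Block(variables=["i"], parent=outer)

print(resolve(inner, 0, 1))  # the first variable of the inner block
print(resolve(inner, 1, 2))  # the second variable of its parent
```

Since the walk only ever moves from a block to its parent, a sibling or
child block can never be reached, which is the scoping guarantee described
above.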

Jim



From kyle@arcavia.com Fri, 07 Apr 2000 08:06:21 -0400
Date: Fri, 07 Apr 2000 08:06:21 -0400
From: Kyle Lahnakoski kyle@arcavia.com
Subject: Object Code problem



Jim Little wrote:
> 
> Kyle Lahnakoski wrote:
> 
> > I have object code compiling to *.class files.
> 
> Cool!  Congratulations.
> 
> > Object Code, the semantics, is very close to its pre-compiled
> > representation.  (There are three states code can be in: source code,
> > pre-compiled code (like object files), and machine code.)  Pre-compiled
> > code, I hope, will be the Prism model for Object Code.
> 
> Although you are of course free to do anything you want, my vision of
> the way Prism should be used is to have each state exist as a Prism
> code.  This is how we achieve maximum reuse and flexibility.  Compilers
> would operate on Prism models and would use parsers and emitters as
> "bookends" at the beginning and end of the process.
> 
> I.e., compilation would proceed through the following stages:
> Start: Programmer creates source code.
> Stage 1: Parser (e.g., "parse-oc") parses source code into
> Prism/ObjectCode model.
> Stage 2: Compiler ("precomp-oc") compiles Prism/Object Code model into
> Prism/Precompiled Object Code model.
> Stage 3: Compiler ("comp-poc") compiles Prism/Precompiled Object Code
> into Prism/JavaMachine model.
> Stage 4: Emitter ("emit-jm") takes Prism/JavaMachine model(s) and
> outputs .class file.

Never thought of making a Prism Java Byte Code model!  Actually I have
the structure of Java class files in the DBOS, but have not used them in
the compiler.  I am reluctant to write anything I don't have to in
non-ObjectCode (called Hard Code).  I can revisit this compiler and
build it faster in ObjectCode than in Java.



> The Prism metacode is actually designed specifically so you can capture
> the semantics of high-level languages.  The idea is that you write just
> one parser that strips out all the syntactic cruft, such as parentheses
> and infix expressions, and stores just the core semantic information
> that was encoded in the source file.  The premise is that once your
> parser has done that, it's much easier for other programs to manipulate
> the parsed program.

What is really needed is a language just to make these specifications
easy.  It would be close to Yacc/Bison.  The language would specify the
data structures used in the Prism representation and parse the code to
that representation.  I would like to do that, but I may not be the best
person for it.  Anyway, I do not think I will get to that for another
year.



> As for variable scope, I dealt with that problem in the C-like code, but
> I wasn't particularly happy with my solution.  Maybe you can come up
> with a better one.
> 
> In the C-like code, every block can have its own local variables, and
> those variables are scoped for that block and every block it contains.
> Each block has a _List that lists all of the variables declared for that
> block.  Blocks are also nested within blocks.  So the variable reference
> subcode looks mostly like this:
> 
> _List: (Local Variable Reference)
> +---+------------------------------+
> | 1 | Integer: * (block offset)    |
> | 1 | Integer: * (variable offset) |
> +---+------------------------------+
> 
> The "block offset" is a number, zero or more, that refers to the block
> in which the variable is declared.  Zero is "this block."  One is "this
> block's parent."  Two is "this block's parent's parent."  Etc.  Once
> you've found the right block, the "variable offset" integer refers to
> the correct variable by its position in the block's local variable
> _List.  One is the first variable in the list, two is the second
> variable, etc.

Each block should have a list of the variables used, in the order used. 
Then a variable is no more than a place holder.  Moving variables from
block to block and adding variables requires very little update.  OOPS! 
Now you can refer to invalid (out-of-scope) variables!  Oh well, I
tried.

One bad thing about exceptions in ObjectCode is that the exception blocks
can overlap.  I wonder how much I lose if I force them to be strictly
hierarchical.
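
To make the distinction concrete, here is a small Python sketch that tests
whether a set of exception blocks is strictly hierarchical.  It assumes,
purely for illustration, that each block is a half-open (start, stop) range
of instruction positions; the function name is_hierarchical is hypothetical.

```python
from itertools import combinations


def is_hierarchical(blocks):
    """Return True if every pair of blocks is nested or disjoint.

    `blocks` is a list of half-open (start, stop) instruction ranges.
    """
    # After sorting, the first block of each pair never starts later
    # than the second one.
    for (a_start, a_stop), (b_start, b_stop) in combinations(sorted(blocks), 2):
        # Partial overlap: b starts strictly inside a and ends outside it.
        if a_start < b_start < a_stop < b_stop:
            return False
    return True


nested = [(0, 10), (2, 5), (5, 8)]
overlapping = [(0, 5), (3, 8)]

print(is_hierarchical(nested))
print(is_hierarchical(overlapping))
```

Blocks that merely touch, such as (0, 5) and (5, 8), count as disjoint
under the half-open convention.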

I will have to put implementing a Prism ObjectCode model on long term
suspension; maybe resume a year from now.  I thought Prism was close
enough to my ObjectCode sub project to implement.  Unfortunately, it
requires a non-trivial amount of work that has no immediate benefit to
my DBC project; I am a believer in just-in-time coding.  Please forgive
me for any time I may have wasted on your part.  

I will continue to be around for discussion.  Thanks for all your help.





From jiml@inconnect.com Fri, 07 Apr 2000 19:50:18 -0600
Date: Fri, 07 Apr 2000 19:50:18 -0600
From: Jim Little jiml@inconnect.com
Subject: Object Code problem

Kyle Lahnakoski wrote:
 
> What is really needed is a language just to make these specifications
> easy.  It would be close to Yacc/Bison.  The language would specify the
> data structures used in the Prism representation and parse the code to
> that representation.  I would like to do that, but I may not be the best
> person for it.  Anyway, I do not think I will get to that for another
> year.

I've considered making such a language -- it would be fairly trivial for
syntactically clean languages -- but I have so little free time I
haven't gone past the conceptual stage.

> One bad thing about exceptions in ObjectCode is that the exception blocks
> can overlap.  I wonder how much I lose if I force them to be strictly
> hierarchical.

I'm not particularly fond of the C brand of exception handling because
it disrupts the flow of code.  Maybe a complete redesign of the way
exceptional situations are handled would be a good idea.

> I will have to put implementing a Prism ObjectCode model on long term
> suspension; maybe resume a year from now.  I thought Prism was close
> enough to my ObjectCode sub project to implement.  Unfortunately, it
> requires a non-trivial amount of work that has no immediate benefit to
> my DBC project; I am a believer in just-in-time coding.  Please forgive
> me for any time I may have wasted on your part.

Don't worry, you haven't wasted any of my time.  But I am curious --
what exactly did you think Prism would provide for you?  I thought I was
fairly up front about its goals and status.

Jim



From kyle@arcavia.com Fri, 07 Apr 2000 22:32:41 -0400
Date: Fri, 07 Apr 2000 22:32:41 -0400
From: Kyle Lahnakoski kyle@arcavia.com
Subject: Object Code problem



Jim Little wrote:

> > One bad thing about exceptions in ObjectCode is that the exception blocks
> > can overlap.  I wonder how much I lose if I force them to be strictly
> > hierarchical.
> 
> I'm not particularly fond of the C brand of exception handling because
> it disrupts the flow of code.  Maybe a complete redesign of the way
> exceptional situations are handled would be a good idea.

I have given it considerable thought, and I have come to the conclusion
that the C-brand exception handling is the best way to go.  I have
written a bit on it at:
	http://www.arcavia.com/rd/dbc-html/DBOS.html
The links: 'Exception Handling' and 'Modeling Returns' are the ones
where I consider exceptions.  Maybe you have some suggestions.



> > I will have to put implementing a Prism ObjectCode model on long term
> > suspension; maybe resume a year from now.  I thought Prism was close
> > enough to my ObjectCode sub project to implement.  Unfortunately, it
> > requires a non-trivial amount of work that has no immediate benefit to
> > my DBC project; I am a believer in just-in-time coding.  Please forgive
> > me for any time I may have wasted on your part.
> 
> Don't worry, you haven't wasted any of my time.  But I am curious --
> what exactly did you think Prism would provide for you?  I thought I was
> fairly up front about its goals and status.

I was making a comment on my unclear thought process, not about Prism.

