Compiler & VM (was: Steps)

Matthew Tuck matty@box.net.au
Sat, 03 Apr 1999 20:33:24 +0930


Hans-Dieter.Dreier@materna.de wrote:

>> Yes, the tree would operate in a similar sort of way, with code for the
>> node put in the node itself.
> You mean that you would include information on how to generate code
> within the definition of each node, right? If so, there's a little
> problem: Code generation may depend on the compiler as well as on the
> node. Different compilers may produce different code. You got a
> 2-dimensional array of code generator infos.

Compiler?  I thought we were talking about the interpreter?

>>>> This could be complicated by exceptions, which might force the AST
>>>> VM to have some sort of explicit stack structure.
>>> If the stack is implemented by an Ultra object (subject to GC), which
>>> IMO is a must, then exceptions certainly have to be caught by VM to
>>> adjust that stack.
>> What I meant was that we could have implemented something like:
>> class plus extends expression
>>    function calculate : integer = left.calculate + right.calculate
>> ...
> What would "calculate" do? I'm asking because there seem to be recursive calls.

Yes, that's the simplest way to handle trees - calculate gives you the
result of the expression, and the way to do that in the case of addition
is to add the result of the left and right side.  The recursion travels
down until it hits the leaves.
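
A minimal sketch of that recursive scheme, written in Python rather than
Ultra. The class names Expression, Literal and Plus are illustrative
assumptions, not part of any agreed Ultra design:

```python
class Expression:
    """Base class for tree nodes; each node knows how to evaluate itself."""

    def calculate(self) -> int:
        raise NotImplementedError


class Literal(Expression):
    """A leaf: the recursion stops here."""

    def __init__(self, value: int):
        self.value = value

    def calculate(self) -> int:
        return self.value


class Plus(Expression):
    """An inner node: evaluates both children, then adds the results."""

    def __init__(self, left: Expression, right: Expression):
        self.left = left
        self.right = right

    def calculate(self) -> int:
        return self.left.calculate() + self.right.calculate()


# 1 + (2 + 3)
tree = Plus(Literal(1), Plus(Literal(2), Literal(3)))
result = tree.calculate()
```

Note that the intermediate results live only on the host language's call
stack here, which is exactly the GC concern raised in the reply below.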
 
> - GC. Objects for intermediate results need to be reachable or they might get
> collected. That would not be good... GC would have to examine the C stack
> otherwise. But the C stack isn't GC friendly. It would involve a lot of
> extra work.

Well there are C collectors out there that do this ...  But an explicit
stack straight away is probably the best thing to do, at least to handle
exceptions.
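
As a rough illustration of the explicit-stack alternative, here is a
Python sketch that evaluates the same kind of tree without recursion.
The tuple encoding of nodes ("lit", "plus") and the function name
evaluate are assumptions made for this example only:

```python
def evaluate(root):
    """Evaluate a tree of ("lit", n) and ("plus", left, right) tuples."""
    operands = []           # explicit operand stack: intermediates stay reachable
    work = [(root, False)]  # explicit work stack replaces the host call stack
    while work:
        node, children_done = work.pop()
        kind = node[0]
        if kind == "lit":
            operands.append(node[1])
        elif kind == "plus":
            if children_done:
                # Both children have been evaluated; combine their results.
                right = operands.pop()
                left = operands.pop()
                operands.append(left + right)
            else:
                # Revisit this node after its children (left is evaluated first).
                work.append((node, True))
                work.append((node[2], False))
                work.append((node[1], False))
        else:
            raise ValueError("unknown node kind: %r" % (kind,))
    return operands.pop()


tree = ("plus", ("lit", 1), ("plus", ("lit", 2), ("lit", 3)))
total = evaluate(tree)
```

Because both stacks are ordinary objects, a collector can scan them
directly, and an exception handler can unwind them by truncating them.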

>> You can't really have destructors in a GC language which uses objects
>> consistently.  Ideally, as a matter of correctness, you would not close
>> files etc. in a finaliser if possible, but have a close method; when
>> an exception got raised, the method would catch it, close the file and
>> propagate the exception up the call stack.
> I referred to the treatment of exceptions in the (initial) implementation
> of the runtime system which will (most likely) be in C(++). Ultra won't
> have a delete facility; no way to corrupt pointers. Maybe we need
> finalisers. Maybe we even include a function that finds all the references
> to a given object and changes them to NULL.

Ahh I see.  Because there are no destructors we don't need to destroy
objects immediately, but rather delay to finalisation.  No problem.
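
For reference, the explicit-close pattern quoted above (catch, close,
propagate) might look like this in Python. The Resource class and the
process function are hypothetical names used only for the sketch:

```python
class Resource:
    """Stand-in for a file or similar handle with an explicit close method."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def process(resource, action):
    """Run action on resource, closing it explicitly on every path."""
    try:
        result = action(resource)
    except Exception:
        # Close right away instead of waiting for a finaliser,
        # then propagate the exception up the call stack.
        resource.close()
        raise
    resource.close()
    return result


def failing_action(resource):
    raise RuntimeError("simulated failure while using the resource")


res = Resource()
try:
    process(res, failing_action)
    propagated = False
except RuntimeError:
    propagated = True
```

After this runs, the resource has been closed and the exception has
still reached the caller.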

>> In one sense, but we have to handle allocations and deallocations of
>> local variables on the stack, although they could probably be done using
>> an implicit stack as well.
> Methods of the stack class will take care of that. In my memory layout
> proposal there's an example of a stack class (exactly: a stack frame class).
> Ideally, each stack frame should only be allocated once, with sufficient
> space. The compiler would calculate this. Normally no reallocation should
> be necessary. If we really find that VM is too slow, we still can inline
> the stack functionality.

Yeah I know, just saying that it's something that needs to be handled.
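
A small Python sketch of the stack frame idea quoted above, where each
frame is allocated once with a slot count the compiler would have
computed. The names StackFrame, Stack, enter and leave are assumptions
for illustration and are not taken from the memory layout proposal:

```python
class StackFrame:
    """Fixed-size frame; slot_count is assumed to come from the compiler."""

    def __init__(self, slot_count):
        self.slots = [None] * slot_count  # allocated once, never resized

    def _check(self, index):
        if not 0 <= index < len(self.slots):
            raise IndexError("slot %d outside frame" % index)

    def store(self, index, value):
        self._check(index)
        self.slots[index] = value

    def load(self, index):
        self._check(index)
        return self.slots[index]


class Stack:
    """Explicit stack of frames; locals are released when a frame is left."""

    def __init__(self):
        self.frames = []

    def enter(self, slot_count):
        frame = StackFrame(slot_count)
        self.frames.append(frame)
        return frame

    def leave(self):
        return self.frames.pop()


stack = Stack()
frame = stack.enter(2)  # a routine with two local variables
frame.store(0, 10)
frame.store(1, 32)
total = frame.load(0) + frame.load(1)
stack.leave()
```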

>> The compiler can probably have a fairly minimal syntax at first too.
> Certainly it will have to. But that still will be a lot larger than a simple assembler.

Yes but wouldn't the compiler be easier to use to generate test cases
for the VM?
 
>> Then it can be expanded later.  Essentially the same as what your
>> assembler is, except existing work can be built upon.
> Most of the assembler's ingredients can be reused if we keep that in mind.

Like what?

>> But isn't this work better spent starting with a minimal compiler and
>> working your way up?
> I don't think so. Maybe you want to do some estimate: Just sketch your
> minimal syntax and see what it takes. If you post it here, we can talk
> it over. I wouldn't be surprised if what you call a minimal compiler
> turns out to be very similar to what I call a simple assembler.

Maybe.  I wish I could post one in this message, but time escapes me at
the moment.  I've only just managed to catch up on messages, so the
language is my number one priority.  I'll work out a medium-term
language and a short-term list of features to add.  I will get this done
imminently.

And once that's done, we have to start coding, or we never will.  If I
get a weekend I could probably get a working system up and we wouldn't
have to talk about it.  You can do the same if you wish.

> But even if you mean logical fields, I think it's no good idea.
> A number should be an atom; everything else is overkill.
> You'd have to handle cases like 1.a or .5 or a.5 or 1..0 - not
> me! I'd even include the minus sign.

All I'm saying is that the grain of control can vary from sub-real right
up to text.  And it's an important issue for intelligent editing.  But
that's why I want plug-in views.

>> Even if you wanted to parameterise by table, how are you going to change
>> the table?  Since direct parse table manipulation is difficult, you'd
>> want to set up a parser table generator anyway.  It'd be quicker than a
>> normal compiler, but we'd have to spend time writing it.
> We'd write a parser table generator in due time (rather sooner than
> later). That's a bit of effort, right, but it keeps us flexible and
> independent from the C compile chain. I see manual table entry
> only in the beginning.

I've never tried that, but I'd think it would be next to impossible.  By
the same logic I would use a manual parser until the table generator was
ready.
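
To give a feel for how small a manual parser can be, here is a
hand-written Python sketch for a toy grammar of whitespace-separated
integer sums (number ("+" number)*). The grammar, the function name
parse_sum and the tuple tree encoding are assumptions for this example,
not a proposal for Ultra's syntax:

```python
def parse_sum(text):
    """Parse e.g. "1 + 2 + 3" into nested ("plus", left, right) tuples."""
    tokens = text.split()
    pos = 0

    def number():
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdecimal():
            raise SyntaxError("expected a number at token %d" % pos)
        value = int(tokens[pos])
        pos += 1
        return ("lit", value)

    tree = number()
    # Left-associative: each "+" wraps the tree built so far.
    while pos < len(tokens) and tokens[pos] == "+":
        pos += 1
        tree = ("plus", tree, number())
    if pos != len(tokens):
        raise SyntaxError("unexpected token %r" % tokens[pos])
    return tree


parsed = parse_sum("1 + 2 + 3")
```

The grammar is encoded directly in the control flow, so there is no
table to enter by hand; a table generator could replace this later.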

> Would you please explain what you mean by "parser framework"?

Beats me.  I was referring to actually linking subclasses into a parser
framework (abstract class), but when I think about it, I have no idea
how that might work.  All I can think of is a table parser.  So forget
that.  =)

> We need to do most of that for a hardcoded one as well or
> it will remain an isolated affair, not integrated into the
> rest of the environment, and we will need to ship a C
> compiler and yacc or whatever it takes along with our code.

Well yacc generates that parser in C, then the code is linked in with
the rest of the C, so you wouldn't have to do this.

> In the impl, it looks like this if it's not inherited from the interface:
> ...

I really don't want to require any new language features here, which
this might entail.  It can be done via inheritance (and this allows more
than one use of the code).

The other option I can think of is using detached elements.  These can
have different access control, so the main element would be locked down,
and the detached elements would be free to be accessible.  In a view
where the detached elements are added, this might appear as you describe
here, with only detached elements accessible.

This made me wonder whether detached elements could be eliminated since
they have much of the same utility as inheritance.  At least one
difference is that detached elements get to access private members of
surrounding objects.  If we make a member C++-style "protected" then not
only can it be accessed by subimpls, but also by subsubimpls and so on.

-- 
     Matthew Tuck - Software Developer & All-Round Nice Guy
             mailto:matty@box.net.au (ICQ #8125618)
       Check out the Ultra programming language project!
              http://www.box.net.au/~matty/ultra/