Steps
Hans-Dieter.Dreier@materna.de
Thu, 11 Mar 1999 15:20:56 +0100
Matthew Tuck wrote:
>> At first, write MCs in C and use them as components (or VM instructions,
>> or functions, if you like), which are called by VM that executes a
>> "script" (an object containing VM-interpretable contents) loaded by
>> object loader. In this phase it is mostly thought of as a means to test
>> the MM, the VM, the MCs, to play around with alternatives and to whet our
>> appetite for more. Similar to the Forth approach AFAIK.
>Yes I think the stack implementation is basically like Forth.
That's true, but I rather meant the component approach - AFAIK every
keyword is implemented by some function. There is hardly any syntax. They
extend the language by adding more functions, thus more keywords. Lisp is
similar in this respect, too.
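A minimal sketch of that component approach in C, under stated assumptions: each keyword is one C function (an "MC") working on a shared stack, and a "script" is just a string of blank-separated words. The names `Vm`, `mc_table`, `mc_lookup` and `vm_run`, and the three keywords, are invented for illustration and are not existing Ultra code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: every keyword is one C function (an "MC"). */
typedef struct { long data[64]; int top; } Vm;

void vm_push(Vm *vm, long v) { assert(vm->top < 64); vm->data[vm->top++] = v; }
long vm_pop(Vm *vm)          { assert(vm->top > 0);  return vm->data[--vm->top]; }

void mc_add(Vm *vm) { long b = vm_pop(vm), a = vm_pop(vm); vm_push(vm, a + b); }
void mc_mul(Vm *vm) { long b = vm_pop(vm), a = vm_pop(vm); vm_push(vm, a * b); }
void mc_dup(Vm *vm) { long a = vm_pop(vm); vm_push(vm, a); vm_push(vm, a); }

/* Extending the "language" means adding a row to this table. */
typedef struct { const char *name; void (*fn)(Vm *); } Mc;
static const Mc mc_table[] = { {"add", mc_add}, {"mul", mc_mul}, {"dup", mc_dup} };

const Mc *mc_lookup(const char *name) {
    for (size_t i = 0; i < sizeof mc_table / sizeof mc_table[0]; i++)
        if (strcmp(mc_table[i].name, name) == 0) return &mc_table[i];
    return NULL;
}

/* Run a script: known words call their MC, anything else must be a number.
   Returns 0 on success, -1 on a word that is neither keyword nor number. */
int vm_run(Vm *vm, const char *script) {
    char buf[256];
    if (strlen(script) >= sizeof buf) return -1;
    strcpy(buf, script);
    for (char *w = strtok(buf, " \t\n"); w != NULL; w = strtok(NULL, " \t\n")) {
        const Mc *mc = mc_lookup(w);
        if (mc != NULL) { mc->fn(vm); continue; }
        char *end;
        long v = strtol(w, &end, 10);
        if (*end != '\0') return -1;
        vm_push(vm, v);
    }
    return 0;
}
```

Running the script "3 dup mul 4 add" on an empty stack leaves the single value 13. There is no grammar beyond "split on blanks", which is the point being made about Forth.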
>Where you say the VM is simple and you put the logic in the MCs I would
>say this is pretty much the same as my approach except that I would have
>some VM code, and all the logic in the node classes, which would be
>statically linked.
>
>This could be complicated by exceptions, which might force the AST VM to
>have some sort of explicit stack structure.
If the stack is implemented by an Ultra object (subject to GC), which IMO
is a must, then exceptions certainly have to be caught by the VM so that
it can adjust that stack.
>>> I think we should keep the VM. We need this both for when you want
>>> quick compile times, and especially for debugging. It doesn't
>>> necessarily become redundant.
>> I'm not so sure anymore whether we need a VM at all. I was surprised
>> when I tried to figure out what a VM executing a tree (instead of linear
>> code) would look like. I simply found nothing that a VM (defined as a
>> "main" program which controls execution) could do that a function
>> called from the operators could not do better.
>
>Things like executing statements, computing expressions and routine
>calls are really easy - things like exceptions and declarations could be
>a little harder.
At least in C, exceptions (meaning catch & throw) are pretty easy.
Apparently in C++ it's much more complicated because that has to take care
of destructor calls as well - but if we simply use the try-catch feature
of C++, it's all done for us. We just have to store the info necessary to
unwind the stack in case of an exception and do that when an exception
occurs.
Declarations are not dealt with by the VM.
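A rough sketch of catch & throw in plain C, using the standard `setjmp`/`longjmp` pair. It assumes a hypothetical VM whose stack is a simple array (a real Ultra stack would be a GC-managed object), and the names `Handler`, `vm_try` and `vm_throw` are made up here. The "info necessary to unwind the stack" is just the stack depth recorded when the handler is installed.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical handler record: where to resume and how deep the stack was. */
typedef struct Handler {
    jmp_buf env;            /* where to continue after a throw */
    int saved_top;          /* VM stack depth when the handler was installed */
    struct Handler *outer;  /* enclosing handler, if any */
} Handler;

typedef struct { long data[64]; int top; Handler *handler; int error; } Vm;

void vm_push(Vm *vm, long v) { assert(vm->top < 64); vm->data[vm->top++] = v; }

/* "throw": cut the VM stack back to the recorded depth, then jump. */
void vm_throw(Vm *vm, int code) {
    Handler *h = vm->handler;
    assert(h != NULL && code != 0);
    vm->top = h->saved_top;
    vm->handler = h->outer;
    vm->error = code;
    longjmp(h->env, 1);
}

/* "catch": run op under a handler; returns 0, or the thrown error code. */
int vm_try(Vm *vm, void (*op)(Vm *)) {
    Handler h;
    h.saved_top = vm->top;
    h.outer = vm->handler;
    if (setjmp(h.env) == 0) {
        vm->handler = &h;
        op(vm);
        vm->handler = h.outer;  /* normal exit: uninstall the handler */
        return 0;
    }
    return vm->error;           /* reached via longjmp from vm_throw */
}

/* Two sample operations: one succeeds, one leaves junk behind and fails. */
void ok_op(Vm *vm)      { vm_push(vm, 7); }
void failing_op(Vm *vm) { vm_push(vm, 111); vm_push(vm, 222); vm_throw(vm, 42); }
```

After `vm_try(&vm, failing_op)` the two junk values are gone and the stack is as deep as before the call. In C++ the same bookkeeping could sit in a catch block, with destructors handled by the language.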
>>> Why do you feel the implementation using a stack VM would be better
>>> than an AST VM? I feel we should get a VM up as fast as possible. You
>>> would have to design an entirely new utility and language with your
>>> proposal.
>> Utility = VM? Well, as I mentioned, a simple VM could be done in a lazy
>Utility = Object Assembler, i.e. a way of generating input.
A compiler is a new utility as well and much more complicated. It would
take much longer to implement than an assembler. Just look at the syntax:
An assembler like the one I imagine really has a tiny syntax that can
easily be built in without using parser generators.
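To give a feeling for how tiny such a syntax can be, here is a sketch of a one-line-at-a-time assembler written by hand with `sscanf`. The mnemonics, opcode numbers and the `;` comment character are assumptions made for this example only; nothing here is the actual OL syntax.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical opcodes; mnemonics and numbering are made up for this sketch. */
enum { OP_PUSH = 1, OP_ADD = 2, OP_PRINT = 3 };

typedef struct { const char *name; int opcode; int has_operand; } Mnemonic;

static const Mnemonic mnemonics[] = {
    { "push",  OP_PUSH,  1 },
    { "add",   OP_ADD,   0 },
    { "print", OP_PRINT, 0 },
};

/* Assemble one line such as "push 3" into out[]. A ';' starts a comment.
   Returns the number of cells written (0 for a blank line), or -1 on error.
   Anything after a valid operand is ignored in this sketch. */
int assemble_line(const char *line, long *out) {
    char text[128], word[32];
    long operand = 0;
    assert(line != NULL && out != NULL);

    size_t len = strcspn(line, ";\n");       /* cut off comment and newline */
    if (len >= sizeof text) return -1;
    memcpy(text, line, len);
    text[len] = '\0';

    int fields = sscanf(text, "%31s %ld", word, &operand);
    if (fields < 1) return 0;                /* blank or comment-only line */

    for (size_t i = 0; i < sizeof mnemonics / sizeof mnemonics[0]; i++) {
        if (strcmp(word, mnemonics[i].name) != 0) continue;
        if (mnemonics[i].has_operand != (fields == 2)) return -1;
        out[0] = mnemonics[i].opcode;
        if (fields == 2) out[1] = operand;
        return fields;
    }
    return -1;                               /* unknown mnemonic */
}
```

The whole "grammar" is one table plus one `sscanf` format, so changing the instruction set means editing the table rather than regenerating a parser.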
>> Well firstly, if we provide a lisp like syntax for object loader, it can
>> produce an AST as easily as flat code. It's no big deal to parse a
>> parenthesized list. Secondly, I would prefer an assembler, since the
>> runtime library, for example, will be tested using simple examples
>> anyway.
>
>I wouldn't necessarily say this. In my regression testing I use such
>things as large random arrays that are often tricky to set up but work
>great for ferreting out bugs.
One could write an OL program to generate objects for test cases. Or
write a C program or shell script to generate a large OL program that *is*
the test case. These could be kept for a while (and adapted if necessary)
for regression testing. IMO OL won't change as often as a compiler would,
so this option is more realistic for OL than for a compiler.
>I understand that a compiler might be a little harder than an object
>assembler to write, but I would use an AST dumper instead to test my
>compiler. And AST output is pretty easy. Also, the scanner of a
>compiler is usually tested before the parser is written.
We would need a general object dumper as well. That would be used to test
the OL. It could be refined to produce a nice dump for an AST. Later it
could be part of the editor.
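A sketch of what a first, very plain object dumper might look like. It assumes a hypothetical generic object layout (`Obj` with a tag, an optional value and child pointers), which is only a stand-in for whatever the real Ultra object format turns out to be. The dump goes into a caller-supplied buffer so that it can be compared against expected text in a test.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical generic object: a tag, an optional value, and child objects. */
typedef struct Obj {
    const char *tag;            /* e.g. "add", "int" */
    long value;                 /* only meaningful when nkids == 0 */
    int nkids;
    struct Obj *kids[4];
} Obj;

/* Append an indented dump of obj to buf (capacity cap, already NUL-terminated).
   Returns 0 on success, -1 if the buffer is too small. */
int dump_obj(const Obj *obj, int depth, char *buf, size_t cap) {
    assert(obj != NULL && buf != NULL);
    size_t used = strlen(buf);
    assert(used < cap);

    int n;
    if (obj->nkids == 0)        /* leaf: print tag and value */
        n = snprintf(buf + used, cap - used, "%*s%s %ld\n",
                     depth * 2, "", obj->tag, obj->value);
    else                        /* inner node: print tag only */
        n = snprintf(buf + used, cap - used, "%*s%s\n",
                     depth * 2, "", obj->tag);
    if (n < 0 || (size_t)n >= cap - used) return -1;

    for (int i = 0; i < obj->nkids; i++)
        if (dump_obj(obj->kids[i], depth + 1, buf, cap) != 0) return -1;
    return 0;
}
```

Dumping an "add" object with two "int" children gives one line per object, indented by two blanks per level. The same routine works for an AST or for any other object graph that is a tree.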
>> These can easily be produced by an assembler-like object
>> loader. Which can be written and changed fairly rapidly. How do you test
>> the components you need for the compiler if you have no compiler? You
>> would have to rely on untested tools to test your other untested tools,
>> and test them all at once. I always found it better to do my testing
>> step by little step
>
>Untested tools relying on untested tools? Like the stack VM relying on
>the object assembler for instance? =)
The stack VM you mention is not a good example since it is so simple that
it is barely visible. But in principle, you are right, of course.
Certainly the minimum starting set must be debugged all together. I'm just
pointing out that this will be easier if this set is smaller. IMO this is
the case for a simple assembler rather than for a compiler.
>>>> You replace the whole parser (by a newly generated one), which you got
>>>> to C-compile and link into the runtime environment. IMO that is not as
>>>> flexible - the changes that are needed are more fundamental, leading
>>>> to another executable each time the language is changed.
>>> OK, that's fair enough.
>> What do you mean by that?
>
>Well the main thing I was thinking through was that even a structural
>editor has to have some sort of parser - even if only for decimal
>numbers. Well, maybe you could split it into two halves, but asking a
>programmer to handle this I think would be a bit much. Anyway, it would
>be useful to have the ability to have several different parsers loaded
>simultaneously.
Split what into two? The parser? I'm afraid I can't follow you here.
Having several parsers "loaded" simultaneously would mean having them
linked into the executable, right?
You'd need to relink even if only your test setup changed, as far as I can
see. I wouldn't like that much hassle. Instead I'd prefer just to change a
simple text file (containing OL code) and be able to reuse the same
executable fairly often. But maybe I missed the point you were trying to
make here.
>This dictates having a parameterisable parser. I don't necessarily mind
>doing a recompile, so I might like a "parser framework" rather than a
>table-driven parser, but it certainly needs to be flexible, which
>dictates taking the parser code away from the syntactical details. A
>parser generator might still be able to do this though.
Well I do mind doing recompiles if I can avoid them. I really like small
and fast test cycles.
>> Sure, but then it's a one-shot. If you later decide that you want to do
>> changes that might be done with less effort by changing the wizard's
>> input, all your modifications you did to the wizard's output are lost.
>> How annoying! IMO a wizard is really useful only if:
>
>Essentially you would use wizards or DSLs because they speed you up.
>Sure, occasionally you might have to rewrite without it because it won't
>support what you want, but does this amount of time outweigh the time
>gained?
That might be different in each case.
>> a) Its output is perfect. Most likely because it is simple. But then, why
>> use it at all.
>
>Because it performs a common task quickly.
Yes, but then maybe the way you have to perform the task without the
wizard is less-than-optimal and needs reengineering.
I'll give an example:
In VC++ there is a class wizard which allows you to add/remove a member
to/from a class. This saves you work because usually C++ requires you to
do it twice: in .h and .cpp. If C++ were designed sensibly, you would just
have to write that declaration once. Using a wizard would not save any
time, hence no wizard would be required.
Some wizards mend insufficiencies which should not have happened in the
first place. The lesson to be learned from this is: If you see that you
might need a wizard, first check your architecture critically and make
sure that you don't try to cure the symptoms rather than the sickness
itself.
>> c) It has plenty of hooks where you can specify your own code. Means a
>> lot of work on the wizard's side, a complicated wizard interface and
>> careful thinking about future needs.
>
>Essentially this isn't really hard. It's just a matter of generating a
>type/impl which gets inherited, hence allowing filling in abstract
>methods on the level below. You can't change the code - but you can
>override it.
I'd prefer another approach for the task you mentioned:
Inside the class (impl, sorry) that inherits from the interface, supply a
view into the interface class. Mark the items that this view displays by a
different colour so that the user can distinguish them from items that are
really present in the impl class. Allow him to add function bodies while
keeping the type signature inherited (i.e. noneditable). Every time the
interface changes, the impl will be recompiled anyway. Both parts (intf
and impl) can be seen simultaneously. No wizard is needed. The user never
needs to edit the inherited part from within the impl. In fact, he can't.
>> What is a DSL?
>
>Domain-specific language. Essentially written to do certain things
>well. They're often specificational in nature rather than imperative or
>even functional or logic.
I see. Input to a parser generator (or syntax table generator) might be an
example, right?
>> Maybe the advantage in handling as well. The more steps we have in the
>> pipeline to get the finished product, the more possibilities for
>> problems. The build process tends to get more and more complicated, so
>> we need a make utility. That saves a lot of work, but also introduces
>> its own complexity. I prefer short pipelines, using small, self-written
>> tools.
>
>Of course fewer steps are better, but the question is, is there a
>better way? If the answer is yes, we want to change it, but we can't
>necessarily do it right away.
True. But we should always give it a second thought; that might save us a
lot of work.
>
>> If I could avoid having to pipe the stuff through the C compiler
>> and the linker, I'd feel better.
>
>So would I, make no mistake about it. But after the inter-module
>optimisation stage we can essentially say "do what you want from it from
>now on - GCC, JVM, interpreted AST, native, whatever". We can move from
>one to the other pretty smoothly since they just take an AST. We
>currently have limited programming time.
Of course. Well, most of the suggestions I'm making now concerning the way
to do it are intended for the near future. What comes later is another
issue.
>>>> A propos debugging: How do you show the correct location inside the
>>>> Ultra source if there is an intermediate C code level? You got C code
>>>> on one side and machine code at the other - how do you match locations
>>>> in C code to machine code?
>
>Probably with difficulty. We could possibly generate some code to
>delimit statements. I don't see a full-on debugger for a while though,
>so hopefully we'll have someone who knows a bit more about one by then.
Yes, but why use such a kludge as instrumented code at all? I say: Debug C
using the C debugger, and debug objects (VM calls, the stack, such things)
using an object debugger as soon as it is available. Try to minimize C
debugging by keeping the units written in C (even in generated C) as
simple as possible.
>>>> If all goes through the C compiler, the debugger must be capable of
>>>> handling machine code, which makes it platform dependent and forces it
>>>> to deal with things like software interrupts and the machine stack. If
>>>> you use the debugger that comes with the C compiler, then there is no
>>>> integrated environment any more because that debugger is not
>>>> Ultra-aware.
>
>We could initially implement an AST-interpreting VM to do debugging.
>Plus, I think copious assertioning could greatly reduce the need for a
>debugger, although it certainly does not eliminate it.
I agree.
>>>> In contrast, using threaded code, it's easy (as long as you don't
>>> ...
>Hmm, should have asked this earlier, by threaded here are you referring
>to multithreading? If so, how does this relate to the stack machine?
No. I can't remember where I read that term, it must have been a long time
ago. Basically, it means code that consists of a stream of references
(pointers) to operands and/or operators. All "instructions" have the same
length since they are all pointers. The "instruction space" is as big as
the address space and very thinly populated, thus carrying little
information and wasting a lot of space. This is different from byte codes
or machine instructions, which have to be interpreted and may have
variable length. It is exactly what that flattened AST code for the VM is,
except for the NULL, which was interpreted by the VM.
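Threaded code in this sense can be sketched in a few lines of C. The sketch below is an illustration under assumptions, not the flattened AST format itself: every cell is one pointer, either to an operator (a C function) or to an operand, and a NULL cell ends the stream. The names `Cell`, `op_lit`, `op_add`, `op_mul` and `run` are invented here. A union is used because standard C does not let function pointers and data pointers share one pointer type.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Vm Vm;
typedef void (*Op)(Vm *);

/* One "instruction": a pointer, either to an operator or to an operand. */
typedef union { Op op; const long *operand; } Cell;

struct Vm { const Cell *ip; long stack[32]; int top; };

void push(Vm *vm, long v) { assert(vm->top < 32); vm->stack[vm->top++] = v; }
long pop(Vm *vm)          { assert(vm->top > 0);  return vm->stack[--vm->top]; }

/* Operator that consumes the following cell as a pointer to its operand. */
void op_lit(Vm *vm) { push(vm, *(vm->ip++)->operand); }
void op_add(Vm *vm) { long b = pop(vm), a = pop(vm); push(vm, a + b); }
void op_mul(Vm *vm) { long b = pop(vm), a = pop(vm); push(vm, a * b); }

/* The whole "VM": follow the pointers until the NULL cell is reached,
   then return the value left on top of the stack. */
long run(const Cell *code) {
    Vm vm = { code, {0}, 0 };
    while (vm.ip->op != NULL) {
        Op op = (vm.ip++)->op;
        op(&vm);
    }
    return pop(&vm);
}
```

A stream of the cells lit, &two, lit, &three, add, lit, &four, mul, NULL (with the three variables holding 2, 3 and 4) computes (2 + 3) * 4 = 20. There is no decoding step at all: the only thing the loop itself interprets is the NULL.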
>> I really haven't thought it through to that extent. Help on the language
>> ...
>Maybe, we've missed each other here. I was referring to help for the
>language as you might bring up in another window.
I agree.
>Definitely library documentation could be stored inline. It should be
>fairly simple to collapse and expand both the code and the
>documentation. Auto-generated documentation is better of course.
Yep.
>I think the juxtaposition of these paragraphs which have diverged has
>confused you as to what I was saying. I was referring to developing in
>the editor.
>
>You seem to be talking about debugging although I'm not exactly sure, so
>I may as well explore the situation. If you had generated code in the
>translational hierarchy, you could debug at that level rather than the
>source level, or you might debug at both at the same time, provided the
>relevant source and generated languages have a view that supports
>debugging.
I'd like decent help inside the editor as well as inside the debugger. In
fact, I see the (Ultra) debugger as an extension to the editor rather than
as a standalone tool. If the editor can display general objects, half of
the work is already done, since then it is able to inspect ASTs as well as
VM's stack and VM's current state (which would be stored in an object).
And change their values... and maybe even trigger a compile-on-the-fly
while the program is still running... (VC++ can do compile-on-the-fly in
some cases. I was really surprised that it is possible even in C++).
>In fact, I originally formulated the translational hierarchy system while
>trying to find a way to view generated code within the editor framework,
>since it's another language, rather than just a view. And then putting
>languages on top of Ultra is a simple step.
>
>Further, you might want good linking between the levels, so you can see
>where code from one level goes in the next, or was in the previous.
>This might not be simple though, since a small amount of code could be
>distributed throughout the program. Also, there would be some code that
>would have no higher-level equivalent, such as utility functions used to
>implement standard features in the higher level language.
I agree. An example of code that has no higher-level equivalent might be a
type conversion call that has been inserted automatically. In this case
the user *might* have written it explicitly. So it has a representation,
but it does not appear in the source. The editor might still display it as
if the user had written it explicitly (but use another color to mark it as
compiler generated), so a breakpoint can be set, the stack can be examined
and single stepping be performed on it. As an additional benefit, this
view might be accessible even when not debugging, to show the user (and
the programmer who is debugging the compiler) what the compiler actually
generated.
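One way this could be represented, sketched in C under assumptions: the AST node carries a `generated` flag that the compiler sets on anything it inserts itself, such as an implicit type conversion. The node layout and the names `Node`, `coerce` and `count_generated` are hypothetical; an editor or debugger view would then only need to test the flag to pick the colour.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { TY_INT, TY_REAL } Type;
typedef enum { N_LITERAL, N_CONVERT, N_ADD } Kind;

/* Hypothetical AST node. `generated` marks nodes the compiler inserted
   itself, so a view can colour them differently but still step through them. */
typedef struct Node {
    Kind kind;
    Type type;
    int generated;
    struct Node *left, *right;
} Node;

/* Wrap `expr` in a conversion node if its type differs from `wanted`.
   `slot` is caller-provided storage for the new node (no allocation here). */
Node *coerce(Node *expr, Type wanted, Node *slot) {
    assert(expr != NULL && slot != NULL);
    if (expr->type == wanted) return expr;   /* nothing to insert */
    slot->kind = N_CONVERT;
    slot->type = wanted;
    slot->generated = 1;                     /* not written by the user */
    slot->left = expr;
    slot->right = NULL;
    return slot;
}

/* Count the compiler-generated nodes in a tree, e.g. to list them in a view. */
int count_generated(const Node *n) {
    if (n == NULL) return 0;
    return n->generated + count_generated(n->left) + count_generated(n->right);
}
```

Because the conversion is an ordinary node, a breakpoint or single step can target it like any user-written call; only the display differs.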
--
Regards,
Hans-Dieter Dreier
(Hans-Dieter.Dreier@materna.de)