types and operators

Francois-Rene Rideau rideau@ens.fr
Wed, 24 Apr 1996 02:18:19 +0200 (MET DST)


> On Tue, 23 Apr 1996, Eric W. Biederman wrote:
>> Now all we have to do is figure out what kinds of data types, and
>> which operations on them, our LLL will support.
> 
> I. Data types we will _need_
>    A. integers
>    B. pointers
>    C. strings
> II. Optional types
>     A. buffers
>     B. floating-point
>     C. double-length integers
I think all of these should be supported at some point.
However, the most important thing, I think, is to have

1) "canonical objects",
 an encoding of objects into a single machine word of standard size (C int),
 either an aligned pointer to a structure in memory,
 or a small integer (losing one or two bits of precision to the GC is ok).

2) structures of canonical objects of arbitrary (given) length.
 Constructor cells, linked lists, vectors, etc, can be implemented this way.

3) structures of raw data of arbitrary (given) length.
 Strings, Buffers, floating-point numbers, multiprecision integers,
 can all be implemented through such structures, type-tagged or not.
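
As a rough illustration, the three kinds of objects listed above could be
laid out in C as follows. This is only a sketch: the names obj, obj_vector,
raw_block, make_vector, make_raw and cons are hypothetical, malloc stands in
for whatever allocator the GC would provide, and how a word is told to be a
pointer or a small integer is deliberately left open.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* 1) A canonical object: one machine word, holding either an aligned
      pointer or a small integer (the tagging scheme is not fixed here). */
typedef intptr_t obj;

/* 2) A structure of canonical objects of given length. */
typedef struct {
    size_t length;   /* number of slots */
    obj    slot[];   /* the canonical objects themselves */
} obj_vector;

/* 3) A structure of raw data of given length. */
typedef struct {
    size_t        length;  /* number of bytes */
    unsigned char byte[];  /* uninterpreted contents */
} raw_block;

/* Allocate a vector of n canonical objects, all slots zeroed. */
obj_vector *make_vector(size_t n) {
    obj_vector *v = malloc(sizeof *v + n * sizeof(obj));
    assert(v != NULL);
    v->length = n;
    memset(v->slot, 0, n * sizeof(obj));
    return v;
}

/* Allocate a raw block holding a copy of n bytes from data. */
raw_block *make_raw(const void *data, size_t n) {
    raw_block *r = malloc(sizeof *r + n);
    assert(r != NULL);
    r->length = n;
    memcpy(r->byte, data, n);
    return r;
}

/* A constructor cell is just a vector of two canonical objects. */
obj_vector *cons(obj head, obj tail) {
    obj_vector *c = make_vector(2);
    c->slot[0] = head;
    c->slot[1] = tail;
    return c;
}
```

Strings, floats and multiprecision integers would then all be raw_block
instances, with or without a type tag in front.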


A simpler system might be that canonical objects be systematically pointers,
so that even small integers should be "boxed" into structures of size one.
As always, we should keep most of the implementation modular enough so
that such a change would be easy to do, so we can always test which is
the best combination of implementation tricks.

Other, more complicated kinds of structures can come later,
but these basic ones are required for any system to function properly.

Other attributes can be given to objects,
such as pointers being weak,
the object being lazily evaluated,
some kind of destructor being needed,
the object being pure, or having some side-effecting subobject included,
etc.
But all these attributes can be added independently.


> There will need to be some thought put into stack operations. I was 
> planning on keeping the standard Forth ones. Any comments on this?
When there already exists a sensible standard, we should follow it.
Let's follow ANS Forth words for stack manipulation
(unless you prefer Postscript, of course).
Of course, as always, this would go in a special vocabulary,
so the user can change it if he's maniacal enough.

> Also, I'd like to have some ideas about how to do pointer/integer
> distinction with a minimal instruction count for operators.
Simple way: have only pointers.
Other way: test low bit(s).
Next way: test high bit.
In any case: use a macro.
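
For instance, the low-bit test hidden behind macros might look like the
sketch below. The convention chosen (low bit set for a small integer, clear
for an aligned pointer) and the names word, TAG_MASK, IS_INT, IS_PTR,
from_ptr and to_ptr are assumptions made for illustration; switching to
another test would only mean changing the macros.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t word;

/* Assumed convention: low bit set = small integer,
   low bit clear = aligned pointer. */
#define TAG_MASK   ((word)1)
#define IS_INT(w)  (((w) & TAG_MASK) != 0)
#define IS_PTR(w)  (((w) & TAG_MASK) == 0)

/* Wrap a pointer as a word. The pointer must be at least 2-byte
   aligned so that its low bit is free to serve as the tag. */
static inline word from_ptr(void *p) {
    word w = (word)p;
    assert(IS_PTR(w));
    return w;
}

/* Under this convention pointers are stored unchanged. */
static inline void *to_ptr(word w) {
    return (void *)w;
}
```

On most machines the test compiles to a single AND (or bit test) followed by
a branch.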

> Since + - * /
> etc. will all have to do an LLL integer --> machine integer
> conversion and back, I'd like it to be kept to something we can do with a
> minimal instruction count.
I'd say make it so that "+" is implemented as the usual addition,
with perhaps a slight adjustment.
The same could also be used to do pointer arithmetic
(only valid in transient code,
as we could require all pointers to be canonical at safe points).
   The Caml light runtime tags values with the parity bit,
pointers being unchanged, and integer n being represented as 2*n+1.
   Another solution would add one to pointers, and have n represented as 2*n.
   Or we could represent n as n, and tag with the high bit
(but then, the programmer must be sure never to overflow),
using paging so that pointers lie between 1 GB and 3 GB.
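
To make the "slight adjustment" concrete, here is a sketch of the arithmetic
for the first two encodings. The macro names (A_* for n stored as 2*n+1,
B_* for n stored as 2*n) are hypothetical, overflow is not checked, and
unboxing assumes that >> on a negative value is an arithmetic shift, which
C leaves implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t word;

/* Scheme A: integer n is stored as 2*n+1, pointers are unchanged. */
#define A_BOX(n)     ((word)(((uintptr_t)(n) << 1) | 1))
#define A_UNBOX(w)   ((w) >> 1)                    /* arithmetic shift assumed */
#define A_ADD(a, b)  ((a) + (b) - 1)               /* (2x+1)+(2y+1)-1 = 2(x+y)+1 */
#define A_SUB(a, b)  ((a) - (b) + 1)               /* (2x+1)-(2y+1)+1 = 2(x-y)+1 */
#define A_MUL(a, b)  (((a) - 1) * ((b) >> 1) + 1)  /* (2x)*y+1 = 2(x*y)+1 */

/* Scheme B: integer n is stored as 2*n, pointers carry the +1 instead. */
#define B_BOX(n)     ((word)((uintptr_t)(n) << 1))
#define B_UNBOX(w)   ((w) >> 1)                    /* arithmetic shift assumed */
#define B_ADD(a, b)  ((a) + (b))                   /* 2x+2y = 2(x+y) */
#define B_SUB(a, b)  ((a) - (b))                   /* 2x-2y = 2(x-y) */
```

In scheme A, addition and subtraction cost one extra increment or decrement;
in scheme B they are the plain machine operations, and the cost moves to
pointer dereferences, where the +1 can usually be folded into the addressing
offset.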


> Finally, a question of operator behaviour: should operators be type-safe,
> or should they just produce undefined results?
I think we should have both (perhaps in separate vocabularies):
words that check nothing, words that check everything.
When performance is needed and code is debugged,
the non-checking word might be preferred
(not sure: at that time, we might prefer optimized asm-compiled code);
when performance is not needed, we will prefer the security of
paranoid checking.
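
As a sketch of what the two vocabularies might contain, here is an addition
word in both flavours, assuming integers are stored as 2*n+1 with the low
bit as tag. The names fast_add and safe_add are hypothetical, and the safe
version simply reports failure through its return value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef intptr_t word;

/* Assumed encoding: integer n is stored as 2*n+1, so the low bit is set. */
#define IS_INT(w) (((w) & 1) != 0)

/* "Fast" vocabulary: checks nothing; garbage in, garbage out. */
static inline word fast_add(word a, word b) {
    return a + b - 1;
}

/* "Safe" vocabulary: checks both operand tags and the result range.
   Returns false and leaves *out untouched on any violation. */
static inline bool safe_add(word a, word b, word *out) {
    if (!IS_INT(a) || !IS_INT(b))
        return false;                     /* type error */
    word sum = (a >> 1) + (b >> 1);       /* cannot overflow a full word */
    if (sum > INTPTR_MAX / 2 || sum < INTPTR_MIN / 2)
        return false;                     /* does not fit once re-tagged */
    *out = (word)(((uintptr_t)sum << 1) | 1);
    return true;
}
```

Both words agree on valid input; only the safe one notices a pointer passed
by mistake or a result too large to tag.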

> For instance: what happens
> when you add a pointer to an integer with the integer addition operator?
Depends on how paranoid the vocabulary you chose is.
Also, pointer arithmetic might be useful sometimes.

> Also, if operators are type-safe, what should they do when given the 
> wrong parameters? (Exception-handling?)
Exception handling, surely. Even ANS Forth has it!
Though perhaps some kind of resumable handlers to which you give
the current continuation might be useful...
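
A minimal sketch of non-resumable handlers in the style of ANS Forth's
CATCH and THROW, built on C's setjmp/longjmp, could look like this. All the
names (handler, catch_call, throw_code, checked_add_word, add_args) are
hypothetical, the 2*n+1 integer encoding is assumed, and resumable handlers
receiving the current continuation are not attempted here.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdint.h>

/* ANS Forth reserves throw code -12 for "argument type mismatch". */
#define ERR_TYPE_MISMATCH (-12)

/* Chain of active handlers, innermost first. */
typedef struct handler {
    jmp_buf         env;
    struct handler *prev;
} handler;

static handler *top_handler = NULL;
static int      thrown_code = 0;   /* code carried across the longjmp */

/* THROW: pop the innermost handler and transfer control to it. */
static void throw_code(int code) {
    assert(code != 0 && top_handler != NULL);
    handler *h = top_handler;
    top_handler = h->prev;
    thrown_code = code;
    longjmp(h->env, 1);
}

/* CATCH: run fn(arg); return 0 on normal exit, else the thrown code. */
static int catch_call(void (*fn)(void *), void *arg) {
    handler h;
    h.prev = top_handler;
    top_handler = &h;
    if (setjmp(h.env) == 0) {
        fn(arg);
        top_handler = h.prev;      /* normal exit: pop our handler */
        return 0;
    }
    return thrown_code;            /* throw_code already popped it */
}

/* Example of a checking word: add two integers stored as 2*n+1. */
typedef struct { intptr_t a, b, result; } add_args;

static void checked_add_word(void *p) {
    add_args *x = p;
    if (!(x->a & 1) || !(x->b & 1))
        throw_code(ERR_TYPE_MISMATCH);
    x->result = x->a + x->b - 1;
}
```

A caller would wrap any checking word in catch_call and dispatch on the
returned code, much as Forth code does after CATCH.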

--    ,                                         ,           _ v    ~  ^  --
-- Fare -- rideau@clipper.ens.fr -- Francois-Rene Rideau -- +)ang-Vu Ban --
--                                      '                   / .          --
Join the TUNES project for a computing system based on computing freedom !
                 TUNES is a Useful, Not Expedient System
WWW page at URL: "http://www.eleves.ens.fr:8080/home/rideau/Tunes/"