ARGON

Alaric B. Williams alaric@abwillms.demon.co.uk
Thu, 19 Dec 1996 22:18:18 +0000


On 19 Dec 96 at 11:12, Francois-Rene Rideau wrote:

> >> Here are some random remarks about the now updated Argon pages;
> >> * "OO". Why have "objects called Entities"?
> >>  One of the advantages of OO is precisely
> >>  that you call anything an object...

> > Purely because the finegrained things inside entities are called
> > objects, and it could get confusing!

> If your OS properly manages finegrained objects,
> then what are these midgrain entities for?

Midgrain... that's a good word. But back to the point: in all my 
experiments with unigrained systems, I find that at either one end or 
the other, things become awkward. Either storage management, call 
binding, size, efficiency, or some other nicety starts to suffer in 
common situations...

> Are you trying to build yet another multi-grain paradigm,
> like UNIX with bloated processes vs low-level C structures,
> or NeXTStep, BeOS, or CORBA with their mid-grain interfaced objects
> vs low-level object implementations...

Yup! And proud of it!

> One of the strong points in Tunes is that at the abstract level,
> there is no such arbitrary barrier between objects of different grain.

The barrier in ARGON is very flexible, depending upon needs... an 
object such as a sound has good reason to exist in either state, and 
indeed that is supported. A class that needs to be able to migrate 
like this simply inherits from "Entifiable" (bad grammar, I know :-) 
and defines a few things like descriptions, icons, and so on.

> Of course, the way objects are implemented is different:
> surely objects will be grouped into logical clusters;
> but this could be dynamic (objects orthogonally rearranging into
> different clusters), and in any case,
> this shouldn't be hardwired in the abstract objects themselves,
> and the casual user shouldn't have to worry about them.

It's not a user-visible thing in ARGON either. The objects are the 
invisible workers that make up entities. Calling them entities gives 
things a nice new feel, anyway; the word "object" is overused, covering 
the things Microsoft software links and embeds that don't act a whole 
lot like objects...

> The OS kernel shouldn't ever have to do that
> (standard meta-libraries could).

It's the programmer's choice, where things go...
 
> > The programmers are used to the
> > word object having a specific meaning; the entities are the furthest
> > from that meaning, so I gave them the different name.

> How that furthest?

That doesn't make sense! What I expect you mean is "what do I mean by 
furthest?", so I'll answer that (lookahead cache):

When an ARGON programmer is working away at his opus, the objects he 
is directly working with are the things I call objects. So it's much 
like something more "normal"; ie, C++ or Java. You deal with integer 
objects and string objects and entity link objects. Imagine entities 
as being like files, and writing an application in C++. You use 
certain classes of objects to access the files.
 
> >>  I'm not sure about the terms "online storage" and "offline storage".
> >>  I'd say fast but transient chip cache memory
> >>  and slow but permanent magnetic storage memory.
> > Too specific!
> Ok. That was just to know whether "online storage" and "offline storage"
> were standard terms for that or not.
> Is the meaning obvious to all potential readers?
> Else, a glossary entry is necessary...

How about "memory" and "disk"? It's a bit specific but also a bit 
more meaningful. Heck, we're arguing about labels again :-)
 
> >> * Efficiency of a portable language: it means a very high-level language,
> > Not incredibly high level...
> Then not efficient on all platforms.
> The same bytecode just can't be efficient at the same time
> on 32bit and as well as 64bit as well as 21bit architectures
> (note that a byte is a weird size for data to a 21bit architecture).
> Whatever "standard word size" you choose for anything encoded in a p-code,
> it will make it an inefficient p-code for architectures
> where this size is too much or too little.

There is no standard word size for ARGOT bytecode; there is no class 
"INT" or anything like that. A little like Ada, you order integers by 
size:

[U|S][1|2|4|8|16|32|64|128|256|...]

U for unsigned, S for signed. I'm debating allowing arbitrary sizes 
instead of just powers of 2. That wouldn't cause a problem, because the 
implementation can just use the next largest size it has (eg U21 -> 
32-bit word); overflow is trapped rather than wrapped.
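
Just to make that concrete (a rough C++ sketch of the trapping idea, 
not ARGOT, and all the names are mine):

#include <cstdint>
#include <stdexcept>

// Hypothetical sketch: an unsigned value of WIDTH bits carried in the
// next larger machine word, with overflow trapped instead of wrapped.
template <unsigned WIDTH>
class UnsignedN {
    static_assert(WIDTH >= 1 && WIDTH <= 32, "sketch covers up to 32 bits");
    std::uint32_t value;   // backing word chosen by the implementation
    static constexpr std::uint64_t limit = std::uint64_t{1} << WIDTH;
public:
    explicit UnsignedN(std::uint64_t v) : value(0) {
        if (v >= limit) throw std::overflow_error("out of range");
        value = static_cast<std::uint32_t>(v);
    }
    UnsignedN operator+(UnsignedN other) const {
        std::uint64_t sum = std::uint64_t{value} + other.value;
        if (sum >= limit) throw std::overflow_error("overflow trapped");
        return UnsignedN(sum);
    }
    std::uint32_t get() const { return value; }
};

// e.g. UnsignedN<21> a(2000000), b(200000);
// auto c = a + b;   // 2200000 exceeds 2^21-1, so this traps rather than wraps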

Also, there are:

R[x]^[y]

e.g. R16^16

means a real number with 16 bits of accuracy, and 16 bits of exponent 
(we're talking huge numbers here!).

Not to mention:

F[x].[y]

for fixed point

F16.16

etc.

And last but not least, the rationals:

[Good abbreviation letter wanted!][1|2|4|...]

where the number is the number of bits in each of the two parts.

And perhaps most important are the variable-length types, which I 
have yet to devise snazzy names for... variable length integers and 
rationals will be available.
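
For the curious, here's roughly how an F16.16 could behave (a 
hypothetical C++ sketch; the real ARGOT representation is still 
undecided):

#include <cstdint>

// Hypothetical sketch of an F16.16 fixed-point value: 16 integer bits,
// 16 fractional bits, stored in a 32-bit word.
struct Fixed16_16 {
    std::int32_t raw;   // the value scaled by 2^16

    static Fixed16_16 fromDouble(double d) {
        return Fixed16_16{ static_cast<std::int32_t>(d * 65536.0) };
    }
    double toDouble() const { return raw / 65536.0; }

    Fixed16_16 operator+(Fixed16_16 o) const { return Fixed16_16{ raw + o.raw }; }
    Fixed16_16 operator*(Fixed16_16 o) const {
        // widen to 64 bits, multiply, then shift the scale factor back out
        std::int64_t p = std::int64_t{raw} * o.raw;
        return Fixed16_16{ static_cast<std::int32_t>(p >> 16) };
    }
};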

> >> * How large is your "flat space"? 16/32/64/96/128/256/512 bit?
> > That's implementation dependent!

>    Then either you don't provide a distributed OS,

Perhaps not distributed in the sense you imagine?

> or you require a CPU-homogeneous network,

Close!

> or your address space is not flat!

Nope...

The fact is, the common bytecode makes for a VM-homogeneous network 
rather than a CPU-homogeneous one.
 
> > It'd work with a 16 bit address space. A style cramper, true, but not
> > impossible. In practice, I think it'd be overkill for embedding,
> > though; so strip off the persistence functionality and all that, and
> > the resulting system (hardcoded precompiled entities in ROM with a
> > communications-only kernel) would be perfect for embedding. If the
> > embedded software wants to create large volumes of entities to "drag
> > and drop" to other places, give all the embedded devices a storage
> > manager stub that netmounts a shared hard disk drive.

> Ok. Again, you'll not be using a system-wide flat space:
> addressing changes from computer to computer.

Not a problem. "Addressing" in the bytecode is a case of symbolic 
dereferencing... pretty portable!

> You could still have some kind of system-wide flat space through
> pointer-swizzling, though (see Grasshoppers, Texas Persistent Store, etc).
> I'd rather make the OS purely independent of addressing sizes and formats,
> by abstracting them away from the OS specification
> (every OS implementation could do as it please;
> standard ways would be defined in standard libraries).

Indeed (sort of)...

> > The weak points are bugs in the compiler and kernel. If the compiler 
> > is working OK (and it's pretty simple so it should - not like a C++ 
> > compiler!), then the produced code shouldn't crash unless there's a 
> > memory error or something else physical. And heck, it wouldn't be a
> > problem if an implementation used the MMU. Just the kernel'd be a bit
> > bigger and some things would slow down a bit.

> Remember that we are talking *distributed* OS here;
> physical errors are not rare in any network.
> What if some node or cable somewhere crashes or is pirated?

Things are checked locally; nothing is executed without being 
checked, no matter where it comes from!

> Will that stop *all* the computations until the crash is resolved?

This is the tradeoff with not using memory protection. In a system 
with no parity checking/ECC hardware, a memory error could really mangle 
everything... if this seems to be a problem, or as an option for 
people who need extreme reliability, a kernel using proper protection 
can be created, without any compatibility hassles.
 
> >> * "Cooperative multitasking" you mean the user must be trusted
> >>  to explicitly yield execution?
> > Nope. The compiler puts conditional yields in as it produces code.
> > Real low level code - such as drivers in assembler - will have
> > explicit yields. But LL code is few and far between; [...]
> Ok. So you're doing the same as Tunes. Fine!

Great minds think alike :-)
Wish I didn't end up spending so much time reinventing the wheel :-(
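
For illustration, here's roughly what the compiler-inserted yields 
amount to (a hypothetical C++ sketch of generated code; the names and 
the scheduler hook are made up):

#include <atomic>
#include <thread>

std::atomic<bool> yield_requested{false};

// Stand-in for the real scheduler hand-off (assumed; a plain thread
// yield here).
void scheduler_yield() { std::this_thread::yield(); }

inline void maybe_yield() {
    // Cheap test the compiler drops at loop back-edges and call sites;
    // control is only handed back when the scheduler has asked for it.
    if (yield_requested.load(std::memory_order_relaxed)) {
        yield_requested.store(false, std::memory_order_relaxed);
        scheduler_yield();
    }
}

long sum_of_squares(const long *data, long n) {
    long total = 0;
    for (long i = 0; i < n; ++i) {
        total += data[i] * data[i];
        maybe_yield();   // inserted by the compiler, not by the programmer
    }
    return total;
}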

> [BTW, did you read the LLL.html page carefully?]

Not in a long while... will it have changed in the past couple of 
years?

> >> * "Persistent entities"
> >>  what you describe looks rather like swap.
> >>  Are your swapped objects really persistent (survive a system shutdown)?
> > Yes. They're written out in a portable object storage format,
> > not a memory dump.
> Well, they could be written as a (mostly) memory dump,
> as long as the dump allows decompilation into a portable format...

An implementation speedup thing. The exact format used on HDD is 
implementation dependent anyway, and will probably include some level 
of hardware dependence, I guess. It's when they're extracted that 
they have to be really portable...
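
Something like this is what I have in mind (a hypothetical C++ sketch 
with made-up names; the private record is whatever the implementation 
likes, while the exported form has explicit tags and a fixed byte 
order):

#include <cstdint>
#include <string>
#include <vector>

// Implementation-private record, laid out however this machine likes.
struct EntityRecord {
    std::uint64_t id;
    std::string   name;
};

// Portable form for export: tagged fields, little-endian, no memory dump.
// (Sketch assumes names shorter than 256 bytes.)
std::vector<std::uint8_t> exportPortable(const EntityRecord &e) {
    std::vector<std::uint8_t> out;
    out.push_back(0x01);                       // tag: 64-bit id
    for (int i = 0; i < 8; ++i)
        out.push_back(static_cast<std::uint8_t>(e.id >> (8 * i)));
    out.push_back(0x02);                       // tag: length-prefixed name
    out.push_back(static_cast<std::uint8_t>(e.name.size()));
    out.insert(out.end(), e.name.begin(), e.name.end());
    return out;
}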

> Again, see the TPS...

TPS?
 
> >>  What about active objects: will they persist if I shutdown the computer?
> > If you tell the OS first! Pulling the plug would cause some loss...
> > Any entities without threads currently executing in them can go straight to
> > persistent storage when you shut the system down. Entities with
> > active threads get something like a signal saying "be gone!". If they
> > don't cooperate, they can be safely forced...
> That's not quite failsafe.
> There exist log-based techniques
> that achieve truely failsafe persistence...
> See the Grasshopper papers for details...

Ok then. Is it some kind of transactional filesystem? I was thinking 
about those, but they made data access seem a little heavyweight for 
my liking. 
 
> >> * Why *byte* streams as a means of communication?
> >>  this breaks any type- and semantics- based security!!!
> > That's what they are /fundamentally/. Only low level code uses them as
> > byte streams. To everyone else, they're object streams...

> Then, you should base your OS interface on object streams,
> or just objects (if your object system is rich enough to express streams),
> not byte streams.

They're defined as such because they need to exist between computers 
- even non-ARGON computers; see my remarks elsewhere about coexisting 
with non-ARGON systems. I have no problem with having 
interfaces and protocols that are explicitly designed; and a byte 
stream is, according to current information theory, capable of 
expressing anything finitely expressible :-)

I.e., it's pretty future-proof. The only niggle is the use of /bytes/, 
a concession to the speed of manipulating bytes on the vast majority 
of hardware systems. Well, the byte has become a standard. And not a 
bad one. I feel safe relying upon it for the next century or so, at 
least!
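
To illustrate the layering (a hypothetical C++ sketch, not the actual 
ARGON protocol): higher levels deal in typed messages, and the wire 
only ever sees length-prefixed bytes:

#include <cstdint>
#include <stdexcept>
#include <vector>

// Each "object" travels as a type tag plus a length-prefixed payload;
// low-level code only ever sees the bytes.
struct Message {
    std::uint8_t type;
    std::vector<std::uint8_t> payload;
};

void writeMessage(std::vector<std::uint8_t> &wire, const Message &m) {
    wire.push_back(m.type);
    std::uint32_t len = static_cast<std::uint32_t>(m.payload.size());
    for (int i = 0; i < 4; ++i)                 // little-endian length
        wire.push_back(static_cast<std::uint8_t>(len >> (8 * i)));
    wire.insert(wire.end(), m.payload.begin(), m.payload.end());
}

Message readMessage(const std::vector<std::uint8_t> &wire, std::size_t &pos) {
    if (pos + 5 > wire.size()) throw std::runtime_error("truncated stream");
    Message m;
    m.type = wire[pos++];
    std::uint32_t len = 0;
    for (int i = 0; i < 4; ++i) len |= std::uint32_t{wire[pos++]} << (8 * i);
    if (pos + len > wire.size()) throw std::runtime_error("truncated payload");
    m.payload.assign(wire.begin() + pos, wire.begin() + pos + len);
    pos += len;
    return m;
}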

> >> * Loadable modules: do you have any way to check consistency and security
> >>  of modules??
> > Not at runtime. They must be checked statically.
> MUST? What if it can't be done? (e.g. a user might later log in?).
> And against WHAT are you going to check?
> Do you believe that a decidable ML-like (or lesser) type system can
> really express all the required consistency and security constraints?

Not for real low level stuff like this... ideally, authors would PGP 
sign them with enclosed suicide pact guarantees, but that'd kinda 
discourage programmers! It's still undecided, and so far I expect 
that some level of blind, unthinking trust will be needed. Hell, 
people trust several megs of Windows NT as a mission-critical 
server...

> > Hmmm. But when you see a spreadsheet icon floating in 3D in your
> > hallucinogenic datalink and you think, "Let me see that!" surely it's
> > logical to assume that you'd want to view it as a spreadsheet rather
> > than see the full range of options (unless you specifically ask for
> > them):
>
> Sure! Because the spreadsheet icon points to
> *a spreadsheet view of the object*,
> not to an explicit view multiplexer of it,
> less even to "it" directly.
> The spreadsheet view
> could still give indirect access to such multiplexer, of course,
> and would of course give indirect access to "it"!

Having separate views of the same thing lying around tended to 
confuse some users in trials of OO interface paradigms; I try to 
avoid that unless it's specifically asked for. I mean, you /can/ have 
a link to an entity in more than one container entity if you want, 
but it's not as commonly done as it is in Unix. And when you refer to an 
entity, you refer /to that entity/. Full stop. Tada.
You can be interacting with more than one view of it if you /ask/, 
though.

> >> * Your 24bit text representation is ridiculous!
> >>  you're having a gratuitously non-standard solution for representing text.
> > It can be lossily converted to lesser types of text when a gateway is
> > interacting with lesser software :-)

> To lesser types, perhaps, but if I want a lesser type,

Why should you want a lesser type other than in the gateway example?

> why force me to use your inefficient kludge?

:-(

> And the problem is that most of time, I don't want a *lesser*
> type, but a completely different type:
> your 24-bit format is requiring lots of extra information
> I just don't care about while not supporting lots of information
> I *do* care about.

But writing style is just as much a property of a character as which 
character it is. When we're using good old-fashioned pens, we can 
push hard, underline ANYTHING we have written, change colours, etc. 
If we're naming card indexes, we can choose to use red pen for 
important entries. Software card indexes these days think they're 
clever if they supply an option to colour-code entries...

And I can do cool things like have parts of entity names in bold etc.

"Alaric's *important* work"
"Alaric's /secret/ letters"
"Alaric's _wierd_ jokes"

etc...

> Plus I know just NO COMPUTER
> where 24-bit is remotely efficient to manipulate.

It can live happily in 32 bits! The reason for not just using the 
"spare" eight bits is that it saves a byte on each character... 
important when you've got a copy of the Bible, even though text is a 
natural candidate for storage compression.
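
To put numbers on that (a hypothetical sketch in C++; I'm assuming 16 
bits of character plus 8 bits of style, which is one plausible reading 
of the 24-bit CHAR, not a final layout):

#include <cstdint>
#include <vector>

// 24-bit CHAR: character code plus style, stored as 3 bytes on disk
// but widened to a 32-bit word for fast in-memory manipulation.
struct Char24 {
    std::uint16_t code;    // which character
    std::uint8_t  style;   // which style (body text, emphasis, ...)
};

void append(std::vector<std::uint8_t> &text, Char24 c) {
    text.push_back(static_cast<std::uint8_t>(c.code & 0xFF));
    text.push_back(static_cast<std::uint8_t>(c.code >> 8));
    text.push_back(c.style);            // 3 bytes per character, not 4
}

std::uint32_t asWord(Char24 c) {
    // in memory the top byte is simply spare
    return std::uint32_t{c.code} | (std::uint32_t{c.style} << 16);
}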
 
> >>  You seriously limit the kinds of possible "character attributes"
> >>  combinations to fit 256 possibilities,
> > That sounds like a lot. Heck, once 640k sounded a lot. Have 32 bit
> > characters, then! 65536 styles.
> Still buggy.
> My character attributes would include font info,
> stretching info, color effect info (say for a rainbow-like colored letter),
> etc, etc. And my needs will evolve. Your hardwired attributes just can't.

They can quite happily map to complex speech-synthesis intonation 
maps if they need to!

Style codes aren't things like font, size, colour, bold, etc.
That's a typeface.

Style codes are what the name says - run Microsoft Word (if you dare) 
and look at the lists of styles in each template. They're things 
like:

Body text
Quotation
Reference
Emphasis

Extract the ones that fall outside CHAR's responsibility - ie, the 
paragraph styles; paragraph formatting is the next layer of 
organisation up from strings of CHARs - and you'll find that there 
aren't all that many.

And the user can choose what each one looks like in ARGON.
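
That separation, roughly, in hypothetical C++ (the style names come 
from the list above; the typefaces are invented, and in ARGON they'd 
be per-user preferences rather than constants):

#include <cstdint>
#include <map>
#include <string>

// Semantic style codes carried by CHARs...
enum class StyleCode : std::uint8_t {
    BodyText, Quotation, Reference, Emphasis
};

// ...mapped to a user-chosen appearance only at display time.
struct Typeface {
    std::string font;
    int         pointSize;
    bool        bold;
};

std::map<StyleCode, Typeface> userPreferences = {
    { StyleCode::BodyText,  { "Times",    12, false } },
    { StyleCode::Emphasis,  { "Times",    12, true  } },
    { StyleCode::Quotation, { "Palatino", 11, false } },
    { StyleCode::Reference, { "Courier",  10, false } },
};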

> Or will you dynamically allocate them?
> And have horrible attribute-maps much like color-maps in pictures? OUCH!

Ooh, no!

> DON T HARDWIRE ANYTHING IN THE OS STOP

Calm down! When the 24-bit CHAR becomes obsolete, a 128-bit
(backwardly compatible) replacement with full support for alien
characters from all of the 9 billion inhabited worlds and
thought-wave emotion styles can be produced! It's not as "hardwired"
as it may seem.


> >> * Your don't detail your type system. I suppose a ML-like type system?
> > For objects, it's multiply inherited static classes. For entities,
> > it's prototyped dynamic typing.

> Please detail your type system.

It's still a little woolly at the moment. However, a post on 
c.l.functional about a Java replacement called Pizza caught my eye as 
being very similar to what I'm working towards (reinventing the wheel 
- again :-( ), so take a look at: http://wwwipd.ira.uka.de/~pizza

I was quite pleased to hear on c.l.f such rave reviews as:

------8<---------
As someone else mentioned, I think Phil Wadler and Martin Odersky's
idea of adding useful FP features to Java is the right way to do
things.  Would that Pizza also had a Lispy syntax.
------8<---------

> Do you have polymorphism?

Yes!

> modules?

Indeed; I've answered this one elsewhere.

> Functors?

And your definition of functor is....?

> Do you have multimethods?

I'm not sure yet. I haven't found a document describing their pros, 
cons, implementations, uses, etc. 

> What in your language is statically or dynamically bound?

Functions, of course? (I don't think I completely understood that 
question :-( )

> How do you manage covariance and contravariance
> of polymorphic function types
> with respect to inheritance in result and/or parameters?

This is still under investigation. Personally, I only feel happy with 
invariance, but I guess variance is more flexible (it's just that I get 
a shiver up my spine that it'll let some nasty bug slip through...)
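
For reference, the covariant half of that is the sort of thing C++ 
already permits (a small illustrative example, nothing to do with 
ARGOT syntax):

struct Animal { virtual ~Animal() {} };
struct Cat : Animal {};

struct AnimalShelter {
    virtual Animal *adopt() { return new Animal; }  // base result type
    virtual ~AnimalShelter() {}
};

struct CatShelter : AnimalShelter {
    // Covariant return: the override may promise a *more* derived result.
    Cat *adopt() override { return new Cat; }
};

// Parameters are the mirror image (contravariance): an override could
// safely accept a *less* derived argument, though C++ itself doesn't
// allow that.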

> Do you differentiate inheritance and subtyping?

Not currently, but I mean to do more research into separating the 
inheritance of interface and implementation.

> >>  What kind of modules will you have?
> > Some entities exist purely to have functionality inherited from them.
> > For example, the "GUI client" entity will contain classes to
> > implement a GUI interface. This is easy for the programmer, and easy
> > to explain to the user ("That little icon in the Resources folder
> > tells all the other little icons how to appear on screen!")

> Do you really believe that inheritance is expressive enough
> for everyone's needs [see the Tunes glossary entry about it]?

I've no problem with that form of it. I'll read your glossary, 
though, and see if I can be convinced otherwise. It'll be a bummer if 
I can, coz I'll have to rethink a lot of stuff :-(

> That still doesn't answer my question.
> What kind of module system will you have, if any?

That /is/ the module system! Entities are pretty flexible in that 
respect.

> >>  Will ARGOT be reflective?
> >>  What impact on the type system will that have?
> >
> > Exactly which definition of reflective are we talking about here?
> > I've heard two: that of systems like FORTH that described themselves
> > in terms of themselves, and those like CLOS that use metaobject
> > protocols. Well, ARGON proper is more like the latter, and ARGOT is
> > more like the former...

> Both are the same: a metaobject protocol is the way through
> which a language can describe itself.
> And that *does* have strong impacts on the type system.
> Lisp and Scheme solved that by having
> a trivial strong type system with a one huge union type.
> Napier did it by having some kind of source-string based reflectivity,
> instead of a fine-grained syntax-tree based one.
> BETA seems to do things ok, by supporting full syntax trees,
> but I'm not sure exactly how they do things...

Well, the compiler is well separated from the stuff being compiled, 
for evident security reasons. However, the language is a bit like 
SELF in structure. Everything is calls to methods in objects, or 
evaluations of functions from objects... this makes for a theoretical 
infinite loop, since that alone cannot describe anything; the 
compiler secretly breaks the loop and inserts inline code to do 
meaningful things. OTOH, anyone can write a class that implements the 
behaviour of an integer. They can define a parametric type 
constructor:

U[x]

and so on.
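
Roughly, in C++ terms (a loose analogy only; U[x] is ARGOT notation, 
not a template):

// In the language, even integer arithmetic reads as a method call;
// the compiler recognises the primitive case and inlines it.
template <unsigned WIDTH>
struct U {
    unsigned long raw;   // backing word, implementation's choice

    // Looks like any other method call; the compiler secretly emits a
    // machine add here instead of recursing into more method calls.
    U add(U other) const { return U{ raw + other.raw }; }
};

// A user-written class can present exactly the same face:
struct SaturatingU8 {
    unsigned raw;
    SaturatingU8 add(SaturatingU8 o) const {
        unsigned s = raw + o.raw;
        return SaturatingU8{ s > 255u ? 255u : s };
    }
};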

> >> * There is no semantical between your "DO" and your "SEQ"!
>                  ^^^^^^^^^^

> Nope. You're confusing semantics and implementation details.
> That's definitely WRONG.

> If your code is such that order doesn't matter,
> then SEQ and PAR are equivalent to it,
> and you shouldn't care about that,
> because the compiler should be able to choose
> whatever combination fits most the available runtime resources.

Nononono! SEQ and PAR would both /work/ instead of DO, but they give 
extra information that DO doesn't. SEQ says "Do these in /this/ 
order", PAR says "Do these things as many at a time as possible". DO 
just says "Do these things as you see fit". PAR is not so much about 
saying "arbitrary order" as about saying "in parallel". Some things 
/need/ to be parallelised. EG:

PAR
   Start server process
   Start test client process
END

The two had better run simultaneously... that's the difference 
between DO and PAR; DO of the above two might well not parallelise 
them, especially on a platform where parallelism is expensive. I've 
already pointed out the difference between SEQ and DO in the last 
example; the difference between SEQ and PAR is mentioned, although 
not demonstrated; and I've just shown a difference between PAR and 
DO. You can't say they're the same!
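
In today's terms, the server/client example looks something like this 
hypothetical C++ sketch (PAR as real concurrency; the stubs are toys I 
made up):

#include <atomic>
#include <thread>

std::atomic<bool> serverReady{false};

// Toy stand-ins for the server and test client of the PAR example.
void startServer() {
    serverReady = true;              // pretend we now accept requests
}

void runTestClient() {
    while (!serverReady)             // waits for the server; if the two were
        std::this_thread::yield();   // strictly sequenced with the client
                                     // first, this would spin forever
}

int main() {
    // PAR: both branches genuinely run at the same time.
    std::thread server(startServer);
    std::thread client(runTestClient);
    client.join();
    server.join();
    // DO would let the implementation choose; SEQ would pin the order outright.
}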

> >> Finally, what in your project requires that this be implemented as the OS?
> > Nothing!
> Well, then first do try to prove your concepts on top of an existing OS!

It's certainly easier.
 
> >> Couldn't you just build it over Linux?
> > Yup... but for the reasons for and against that, see your own OTOP
> > subproject :-)
> Whose conclusion is FOR, in the first time...

Exactly.
 
> [difficulties downloading VSTa]
> Use a mirror, then (sunsite.doc.ic.ac.uk near you does mirror it)!
> Or have a background download done at your ISP's
> then just download from ISP to home!

Yeah, I guess I'd best try again (yawn) :-)

> >> Are [existing OS *kernels] inadapted to what you want in any sense???
> > Just a little overkillish in places.
> > I'd be inclined to start with
> > something like VSTa rather than start from scratch, though.
> You could also start from L3, L4, the Flux OS Kit, and/or whatever.
> See the Tunes OS Review page and pointed pages for that...

Actually, I've looked at 'em all anyway :-)


ABW
--
Governments are merely protection rackets with good images.

Alaric B. Williams Internet : alaric@abwillms.demon.co.uk
http://www.abwillms.demon.co.uk/