On object layout

Matthew Tuck matty@box.net.au
Tue, 08 Dec 1998 23:19:08 +1030


Hans-Dieter Dreier wrote:

>> If you meant it to be an optimisation to be added later on: That's OK
>> with me.
> Won't be easy, though.

Sure won't - but I think this will need to be done to enable a lot of
standard optimisations which operate on the premise of value semantics,
if everything has pointer semantics.

> Sure it's more a matter of memory management, but the two are working
> together.If you just access them by method calls, it could be done
> right from the start. Because you couldn't have references to them at
> the language level, those methods would have to be built-in ones.
> Actually, I think there should be a built-in VM call which has two
> parameters: An object reference and an offset, and which would get/set
> the reference (or integer, or other inline object specific to that VM
> call) found at this offset. But that would be for internal use of the
> compiler only! Otherwise it would seriously compromise safety.

> >  I haven't but I
> > gather though they might be simple, they're not necessarily fast.
> > Nothing like the speed of a stack.  And GC is not quite so simple if
> > you're creating an incremental collector.
> 
> There's a pretty simple scheme involving blank, grey and white tags
> mentioned in your GC reference, which is derived from the standard
> mark/sweep algorithm; it can be further enhanced if it keeps a list of
> the "grey" objects so they can be found faster than by searching the
> memory sequentially. I imagine that should be pretty easy and pretty
> fast.

I think we should have two garbage collectors - a non-incremental
(possibly mark-sweep) for non-interactive applications, which should be
faster.  Our compiler would be one use.  The other is an incremental
collector for interactive applications.

> Additionally, we could use that 1-bit use count also mentioned in the
> GC reference. This might perhaps speed up reclaiming stack objects.

You could do a 1-bit reference count providing you have a backup full GC
to catch whatever falls through, but I think reference counting is not
incremental GC as you can get long pauses for big data structures.

> It can be a base class of an user-accessible stack class.

What would this stack class do?
 
> I'm talking of value semantics, then. I don't like pointer semantics as
> far as simple objects (numbers, strings) are involved, because it's
> exactly what a normal programmer wouldn't expect.

Well I wouldn't call "because that's the way we've always done it" to be
a good argument.  What I think is that, although providing value
semantics (or the essentially equivalent pointer to constant data) is
ok, you should be able to provide all semantics with all types in an
orthogonal way.  Often it is just value for basic types and pointer for
other objects.

> This is just a proposal, but I think it combines the benefits of both
> appraoches while imposing a comparatively low overhead.

I don't think it would have low overhead - as far as I can see what
you're proposing means you have to copy the object everytime you make a
change - this is what causes functional languages to be inefficient.

> Sure, we should try. But don't expect too much: If your program uses a
> GUI, that will use up half of the time anyhow, and that limits what you
> can get. Even if your program runs in zero time, the overall
> improvement will be just 50%: Barely noticeable. And he who puts in
> more memory and a slightly faster CPU into his machine will have the
> same improvement.

Are you talking here about being I/O or using someone else's GUI code?

>> OK, well strings are arrays, and maybe they would be common.
> I'd rather say strings are simple objects (at least at the language
> level). But never mind...

Well, ok, I fell into a trap here.  My terminology is as follows. 
Strings are sequences of characters.  Arrays are implementations of the
sequence type.

This allows you to declare normal string operations such as
concatenation, substringing, pattern matching, etc. for any sequence,
such as sequences of numbers, since the code is parameterised on
sequence element.  You can't do that with simple strings.  I don't know
of any advantages of simple strings.

> I got something special in mind concerning the last item in an object:
> It is the natural place to put an arbitrary number of items of equal
> length(usually, but not limited to, references) into an object. The
> drawback is, of course, that can only have one of those in your class,
> and you cannot easily derive subclasses with instance variables from
> that class.

Yes, that's what I meant by dynamic arrays.  It's true that you can't
derive from it (unless you go on the other side =) ), so it might only
be OK for leaf classes.

-- 
     Matthew Tuck - Software Developer & All-Round Nice Guy
                              ***
       Check out the Ultra programming language project!
              http://www.box.net.au/~matty/ultra/