On object layout and inheritance

Ursula Dreier Ursula.Dreier@ruhr-uni-bochum.de
Thu, 03 Dec 1998 02:03:11 +0100


When I first thought about memory layout, I saw a need for structuring
memory into objects of uniform structure. Uniform in a sense that their
contents had to be interpretable by GC in order to find references to
other objects while not interpreting binary stuff  (numbers, strings,
...) as references.

It is likely that there will be a huge number of objects in memory at
any time. Therefore, object layout had to be as simple and the fixed
object header as small as possible. I admit that I ruled out the option
of objects physically containing other objects right from the start
without really considering it seriously. Maybe this was a little bit
quick, but I still see a lot of difficulties with such a nested
approach:

For this discussion, let's call the non-recursive memory layout "flat
layout", and the recursive one "nested".

GC and maybe other aspects of memory management have to access all
objects regardless whether they are dead or alive, without navigating
along the references. So objects need to be linked together in some
form. In flat layout, this can be a linked list, which is easy to
traverse and maintain and carries little memory overhead. In nested
layout, it has to be a tree and is comparably complicated to traverse
and maintain and carries double the overhead of a list.
GC is more complicated for nested layout than for flat because you can't
simply free an object without examining its contents first. Growing and
shrinking of objects is even more complicated. So nested layout really
would have to have a BIG advantage somewhere else to to be competitive,
which I simply cannot see right now.

So I ended up with the flat layout and its consequence: No embedded
objects, no references to parts of objects.

Next thing I want to discuss is single vs. multiple inheritance (MI).

When I first thought about how to implement a sensible class system, I
felt a dilemma:
On one hand, MI seems to be a really nice thing: Implement different
independent features of our classes each in its own class and combine
them in an orthogonal way. Whow, what a great opportunity to avoid
redundant code!

But there are problems: Each reference to an instance item (a "member
variable" in C++ speak) would usually be compiled to acceed at some
fixed offset in the list containing that instance's instance item
references.
To derive a subclass would simply mean to have the instance items of the
subclass appended to the end of the list representing the instance of
the baseclass. Quite simple.

But that wouldn't be possible for MI, as you can easily see. To make the
issue even more complicated, there is the "common anchestor" problem
built into the concept of MI:

class Person {};
class Employed : public Person {};
class Customer : public Person {};
class EmployedCustomer : public Employed, Customer {};

What about the instance of Person that is contained twice in
EmployedCustomer ?
Usually you wouldn't want to have two independent instances of Person in
EmployedCustomer.

Moreover, if you got a lot of these combinations, writing all of them
down isn't much fun and quite unflexible to extend.

A "rank" regarding subclasses could only be derived from the ordering
of  the baseclasses in the anchestor list of the subclass, which is sort
of a "weak" ordering (not very obvious to the user).
If you treat them as having "equal rights", you got problems with
identical identifiers occuring in those base classes.

I think there are alternatives to multiple inheritance which haven't got
these problems and are clearer to use and easier to implement as well:

1.
class Person {};
class Employed :{ Person* p};
class Customer :{ Person* p};
class EmployedCustomer : public Person {
Employed* e;
Customer* c;
Person () {
 e->p = (Person*) this;
 c->p = (Person*) this;
 }
};

This works, but is a little bit complicated if not to say clumsy -
actually I didn't feel quite at ease with this solution.

A more elegant solution is to have parameterized classes:

class Person {};
class Employed (class Person pE) {};
class Customer (class Person pC) {};

This is no valid C++ syntax, but maybe you see what I mean: It's very
similar to templates, but is implemented statically rather than
dynamically, avoiding C++ problems during template instantiation. It
also has static type checking (the parameter of Employed and Customer
must be a Person or derived from that), which C++ lacks (as far as I
know).

Just declare a Customer Employed Person or an Employed Customer Person,
and even an uncertainty regarding precedence of Employed vs. Customer
methods cannot occur: Whatever comes first has precedence, because it's
actually a Customer that you declare, which happend to have an Employee
as an instance item. You got all the possible combinations for free
without having to declare a class for each of them.

One point that I didn't like in this approach was that you would have to
qualify items in the underlying Person in a way depends on the actual
parametrisation. If you declared a Customer Person,
would access the underlying Person's age by pC.Age, if you declared an
Employed Customer Person, you would have to do it by pE.pC.Age.
I think this problem (and some others, too) can be overcome if we
introduce a "modifier" for declarations, which I would like to name
"transparent". It would be allowed for every declaration of something
that can be regarded as a name space (especially: classes). The
semantics would be that if a name cannot be found in the name space
currently being looked at, the name space referenced by the transparent
name would be included in the search, before search goes on in the
parent name space. In effect, that transparent name would be just that:
transparent. You could omit the name and the qualification dot; the
contents of the name space referenced would effectively seem to be
appended to the name space containing the transparent name. It would not
affect class structure or object layout, however, being just a tag
attached to a name, which modifies the behaviour of the compiler, like
"public" or "private" in C++.

Combine this with the possibility of having type expessions (discussed
in "On classes in declarations"), and in my opinion you got all you
could possibly get from MI, but with less pain.

Hans-Dieter