More On Multiple Inheritance
Hans-Dieter Dreier
Ursula.Dreier@ruhr-uni-bochum.de
Sun, 14 Feb 1999 03:26:49 +0100
More On Multiple Inheritance
============================
I mention this issue again because it affects memory layout,
which IMO is most important for overall design.
I still cannot see a "true MI" implementation
that I feel comfortable with.
As you know, the problem is "shared base classes".
First let me define how I understand the term "MI" here:
It means that you declare multiple base classes which are
somehow "equal" in the sense that none of the base classes
is special compared to the others.
It means also that you do not have to take special steps
in the base classes to make them MI ready.
This distinguishes "true MI" (shorthand: "tMI") from
MI emulation using "contains" relationships ("cMI").
To further clarify things, let me sketch an example of
a shared base class:
(I apologize if I'm repeating the obvious):
// tMI:
class A {int a;};
class B: A {int b;};
class C: A {int c;};
class D: A {int d;}; // needed for discussion below
class E: B, C, D {int e;}; // Problem referencing a
// cMI:
class A {
    int a;
};
class B {
    int b;
    A& ab; // contains an A
    // default constructor for standalone use of B
    B(): ab (*new A) {};
    // constructor for MI use of B
    B(A& a): ab (a) {};
};
class C {
    int c;
    A& ac; // contains an A
    // ... (same constructors as in class B, except for the name ac)
};
class D {
    int d;
    A& ad; // contains an A
    // ... (same constructors as in class B, except for the name ad)
};
class E {
    int e;
    A& ea; // also contains an A (declared first so it is initialized first)
    B& eb; // contains a B
    C& ec; // contains a C
    D& ed; // contains a D
    // default constructor for standalone use of E,
    // sharing a common instance of A in B, C, D
    E(): ea (*new A), eb (*new B (ea)), ec (*new C (ea)), ed (*new D (ea)) {};
};
This is quite kludgey, but it should be possible to achieve
the same results as with MI, though use of the common base class
instance is more complicated since you have to name the reference
explicitly.
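To check that the cMI sketch really shares one instance of A, here is a minimal compilable C++ variant. It is only a sketch under assumptions: it uses std::shared_ptr instead of raw references so that the shared A has a clear owner, it makes all members public, and the helper shares_base_instance is a hypothetical name introduced for the demonstration.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct A { int a = 0; };

struct B {
    int b = 0;
    std::shared_ptr<A> ab;                                   // contains an A
    B() : ab(std::make_shared<A>()) {}                       // standalone use of B
    explicit B(std::shared_ptr<A> a) : ab(std::move(a)) {}   // MI use of B
};

struct C {
    int c = 0;
    std::shared_ptr<A> ac;                                   // contains an A
    C() : ac(std::make_shared<A>()) {}
    explicit C(std::shared_ptr<A> a) : ac(std::move(a)) {}
};

struct D {
    int d = 0;
    std::shared_ptr<A> ad;                                   // contains an A
    D() : ad(std::make_shared<A>()) {}
    explicit D(std::shared_ptr<A> a) : ad(std::move(a)) {}
};

struct E {
    int e = 0;
    std::shared_ptr<A> ea;   // declared first: members are initialized in declaration order
    std::shared_ptr<B> eb;   // contains a B
    std::shared_ptr<C> ec;   // contains a C
    std::shared_ptr<D> ed;   // contains a D
    // standalone use of E, sharing one A among B, C and D
    E() : ea(std::make_shared<A>()),
          eb(std::make_shared<B>(ea)),
          ec(std::make_shared<C>(ea)),
          ed(std::make_shared<D>(ea)) {}
};

// Hypothetical helper: a write through the B part must be visible
// through the C part, the D part and E's own reference.
bool shares_base_instance() {
    E obj;
    obj.eb->ab->a = 42;
    return obj.ec->ac->a == 42 && obj.ed->ad->a == 42 && obj.ea->a == 42;
}
```

Note how every access to a has to go through a named reference (eb->ab->a), which is exactly the inconvenience mentioned above.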
For me, the emulated version has the following advantages:
1. It is perfectly clear what happens, especially the fact
that the base class instance is really shared.
IMO this is better than an implicit assumption.
Maybe sometimes you don't want sharing; how do you specify
which case you want, using tMI?
2. Implementations of the base classes are completely
independent from derived classes. They all can use the
same memory layout for their instances. For tMI,
this might not be the case (see discussion below).
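For comparison with the question in point 1: C++ answers it with the "virtual" keyword on the base class. The sketch below shows both cases; the class names B1, C1, E1, B2, C2, E2 and the two helper functions are hypothetical. Note that the choice is made in B and C, not in E, so the intermediate classes do have to take a special step to become MI ready.

```cpp
#include <cassert>

struct A { int a = 0; };

// Non-virtual inheritance: each path to A gets its own A subobject.
struct B1 : A {};
struct C1 : A {};
struct E1 : B1, C1 {};

// Virtual inheritance: B2 and C2 opt in, and E2 then holds one shared A.
struct B2 : virtual A {};
struct C2 : virtual A {};
struct E2 : B2, C2 {};

// Without sharing, the two copies of a are independent.
bool separate_copies() {
    E1 e;
    e.B1::a = 1;   // qualification is required, plain e.a would be ambiguous
    e.C1::a = 2;
    return e.B1::a == 1 && e.C1::a == 2;
}

// With sharing, both paths reach the same a.
bool shared_copy() {
    E2 e;
    e.B2::a = 1;
    e.C2::a = 2;   // overwrites the same A subobject
    return e.a == 2 && e.B2::a == 2;
}
```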
If we do agree here (at least in principle),
let's discuss how MI could be implemented.
Storing "on both sides of the pointer":
---------------------------------------
As I understand this, it could result in a memory
layout like this:
        c
"this"->a
        b
        d
        e
The memory layout of D has a gap (b)
because it was actually "triple inheritance"
(at least till the advent of n-dimensional memory ;).
If you just needed "double inheritance",
it could have been done without a gap.
That in itself could be regarded as a minor nuisance;
maybe more-than-double inheritance is not used very
often. What IMO is more important is the fact
that the implementations of C and D must know
about the existence of sibling classes
(D must also know the memory layout of B) and
their children (you would like to optimize the
whole issue away in case there is no class E).
I don't like this because it is sure to make
compiler construction more complicated and
incremental compilation less efficient.
And what shall we do if a recompilation cannot
be done (say, only the binary is present)?
There are two other points in this approach
that I don't like:
P1. "Downcasting" may change the address
of the "this"-pointer. You get this situation
if you add some other (unrelated) base class
to the example above.
This means that once you have downcast
(even implicitly), you can no longer access
the original object, using the modified pointer.
I can't quite explain why, but I don't feel
comfortable with this implication.
Maybe this breaks covariance. It certainly
breaks this "classOf" operator I proposed
in some former posting, which was supposed
to fetch the original (i.e. non-casted)
class of some object.
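The pointer adjustment described in P1 can be observed in C++ today. A minimal sketch, with First, Second and Both as hypothetical names; the exact offsets depend on the compiler, so the check only relies on the fact that two non-empty base subobjects cannot live at the same address:

```cpp
#include <cassert>

struct First  { int first = 0; };
struct Second { int second = 0; };            // the "other (unrelated) base class"
struct Both : First, Second { int own = 0; };

// Both base subobjects are non-empty, so they cannot share an address:
// at least one of the two conversions has to adjust the pointer.
bool base_views_differ() {
    Both obj;
    void* as_first  = static_cast<First*>(&obj);
    void* as_second = static_cast<Second*>(&obj);
    return as_first != as_second;
}
```

Whichever view got the adjusted address no longer points at the start of the original object, which is the situation P1 complains about.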
P2. Having a pointer not point to the start
of an object makes memory management more
complicated. In fact, it breaks a central
assumption that I deem necessary for easy
and efficient memory management:
That the start of the memory block can be
determined if only a reference to the object
is known.
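As an illustration of what P2 costs in an existing language: C++ can recover the start of the complete object from a pointer into its middle, but only for polymorphic classes and only by consulting runtime type information. The sketch below assumes RTTI is enabled; First, Second, Both and the helper name are hypothetical.

```cpp
#include <cassert>

struct First  { virtual ~First() = default;  int first = 0; };
struct Second { virtual ~Second() = default; int second = 0; };
struct Both : First, Second { int own = 0; };

// dynamic_cast<void*> yields a pointer to the most derived object,
// whatever base subobject the argument points at.
bool start_recoverable_via_rtti() {
    Both obj;
    Second* interior = &obj;                       // may point into the middle of obj
    void* start = dynamic_cast<void*>(interior);   // needs a polymorphic type
    return start == static_cast<void*>(&obj);
}
```

So the start of the block can be found, but not from the reference alone: every object needs extra per-class information to make it work.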
Storing on one side of the pointer:
-----------------------------------
The memory layout would look like this:
"this"->a
        b
        c
        d
        e
This approach does not have problems P1 and P2
mentioned above, at the expense of having bigger
gaps inside the memory layout.
The sibling dependency problem is still present.
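The gaps in the two layouts can be counted mechanically. The following is only a sketch: the slot numbers are read off the two diagrams in this posting (one slot per item, offsets relative to "this"), and gap_count, kBothSides, kOneSide and kItems are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>

using SlotMap = std::map<char, int>;

// Slot offsets relative to "this" for the two layouts.
const SlotMap kBothSides = {{'c', -1}, {'a', 0}, {'b', 1}, {'d', 2}, {'e', 3}};
const SlotMap kOneSide   = {{'a', 0}, {'b', 1}, {'c', 2}, {'d', 3}, {'e', 4}};

// Items each class carries (its own plus the inherited ones).
const std::map<char, std::string> kItems = {
    {'A', "a"}, {'B', "ab"}, {'C', "ac"}, {'D', "ad"}, {'E', "abcde"}};

// Unused slots between the lowest and highest slot an instance occupies.
int gap_count(const SlotMap& slots, char cls) {
    const std::string& items = kItems.at(cls);
    int lo = 0, hi = 0;   // slot 0 ("a") is present in every class
    for (char item : items) {
        lo = std::min(lo, slots.at(item));
        hi = std::max(hi, slots.at(item));
    }
    return (hi - lo + 1) - static_cast<int>(items.size());
}
```

With these tables, gap_count(kBothSides, 'D') gives 1 (the unused b slot), while in the one-sided layout C has one unused slot and D has two. In both cases the slot of d depends on the existence of the siblings B and C.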
Internally using the "cMI" approach:
------------------------------------
This comes in two versions:
a) This approach is used generally
(even for SI).
b) This approach is used only when MI is present.
If a) applies, the sibling dependency problem
mentioned above is not present. There are no gaps,
and memory layout depends only on the class
itself (not even on base classes, which makes
compilation easier).
There is a memory penalty as well as an execution
time penalty because pointers must be used
to access base classes' instance items. Because
there are more objects, there also are more object
headers, adding more memory overhead and slowing
down GC (if I have my way of allocating storage
object-wise).
There is an additional overhead to initialize
these pointers when an instance is constructed.
Because "downcasting" implies changing the "this"
pointer, problem P1 mentioned above is present.
Other approaches for tMI:
-------------------------
I'd be glad to learn about them.
Conclusion:
-----------
IMO, MI is not really *needed*.
The same effect can be achieved using
cMI as sketched above.
The question should be how to design the
language to implement cMI with minimum effort
on the class designer's side, and how to
minimize impacts on the source code when adding
or removing MI from a class.
I already sketched a possible way in a former posting:
Introduction of "transparent" names.
There would be a keyword that can be added to
declarations of name spaces, classes and other items:
class AClass {
...
transparent BClass;
...
}
Note that this would actually declare a reference
to a BClass instance. IMO there should be no way
to declare an instance item to be *included* into
another class.
It would affect only the way parsing is
performed: When a name has not been found in AClass,
the namespace of BClass would be searched before
progressing to the name space enclosing AClass.
No qualification would be needed unless a name clash
occurs. This way, the use of cMI'ed base classes
would feel like tMI.
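The lookup rule can be modelled in a few lines. This is a sketch of one possible reading, not a specification: the Scope structure, collect, resolve and demo_lookup are hypothetical, and it assumes that transparent scopes are searched recursively and that a name found in two different transparent scopes counts as a clash.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of a name space for the "transparent" rule.
struct Scope {
    std::string name;
    std::set<std::string> items;             // names declared directly here
    std::vector<const Scope*> transparent;   // scopes declared "transparent" here
    const Scope* enclosing = nullptr;        // lexically enclosing name space
};

// Collect every scope that declares id: the scope itself wins,
// otherwise its transparent scopes are searched (recursively).
void collect(const Scope& s, const std::string& id, std::set<const Scope*>& hits) {
    if (s.items.count(id)) { hits.insert(&s); return; }
    for (const Scope* t : s.transparent) collect(*t, id, hits);
}

// Search the scope and its transparent scopes before moving outward.
std::string resolve(const Scope& start, const std::string& id) {
    for (const Scope* cur = &start; cur != nullptr; cur = cur->enclosing) {
        std::set<const Scope*> hits;
        collect(*cur, id, hits);
        if (hits.size() == 1) return (*hits.begin())->name;
        if (hits.size() > 1) return "<clash>";   // qualification needed
    }
    return "<not found>";
}

// AClass with transparent BClass and CClass, all inside a global name space.
std::string demo_lookup(const std::string& id) {
    Scope global;  global.name = "global";  global.items = {"g", "x"};
    Scope bclass;  bclass.name = "BClass";  bclass.items = {"x", "n"};
    Scope cclass;  cclass.name = "CClass";  cclass.items = {"n"};
    Scope aclass;  aclass.name = "AClass";  aclass.items = {"y"};
    bclass.enclosing = &global;
    cclass.enclosing = &global;
    aclass.enclosing = &global;
    aclass.transparent = {&bclass, &cclass};
    return resolve(aclass, id);
}
```

Here x resolves to BClass rather than to the enclosing name space, g is only found after AClass and its transparent classes have been searched, and n is reported as a clash because both BClass and CClass declare it.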
The other issue is the initialisation of the
contained items' references. IMO this should not be
fully automatic; at least the value to be used should
be mentioned somewhere. It could look like this:
...
transparent BClass = new BClass;
transparent CClass = BClass;
...
I would not like putting it into AClass's constructor
as it is done in C++ because this spreads information
to different places rather than concentrating it where
it belongs.
What does everyone think of this?
Hans-Dieter Dreier