More On Multiple Inheritance

Hans-Dieter Dreier Ursula.Dreier@ruhr-uni-bochum.de
Thu, 18 Mar 1999 22:11:59 +0100



Matthew Tuck schrieb:

> Hans-Dieter.Dreier@materna.de wrote:
>
> > Types correspond to the interface and classes correspond to the
> > implementation?  That would mean that all objects of a given type must
> > conform to the same interface, right?
>
> Well there are no "classes", just initial objects that are cloned, but
> yes, all objects can still be made to conform to a typing hierarchy.

The idea of having no classes appeals to me, like everything that reduces the
number of concepts used. There is one question, however:

How would we distinguish between instance items and class items, i.e. how can
the template object determine whether some item should be inherited by its
clones or replaced by each clone's own copy? Usually one would want all the
clones to behave the same way (either replacing or inheriting a particular
item).

Additionally, I'm wondering whether there might be situations where one would
want to distinguish between "X the type" and "X the object". Like situations
where a parameter is of type "class", as in most parameterized types. If there
are no more classes, there can hardly be a parameter of that type. I'm not
sure whether that would pose a real problem, however. What do you think?
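One way to picture the inherit-versus-replace distinction is this C++ sketch
(all names are invented, and the shared_ptr is just my stand-in for "items the
template keeps owning"): items the clones inherit stay shared with the
template, items the clones replace are copied per clone.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical sketch: a prototype object whose "inherited" items stay
// shared with its clones, while "replaced" items are copied per clone.
struct SharedItems {
    std::string greeting = "hello";   // every clone sees changes here
};

struct Proto {
    std::shared_ptr<SharedItems> shared = std::make_shared<SharedItems>();
    int counter = 0;                  // each clone gets its own copy

    Proto clone() const { return *this; }  // shares 'shared', copies 'counter'
};
```

Under this sketch the template object answers the question per item, simply by
how the item is declared, so all clones behave the same way.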

> >> Agreed.  It would be interesting to be able to play around with things
> >> at run-time using the environment, but things like changing types have
> >> to be handled properly.
> > If the editor and the source are both objects, with one of them acting
> > on the other, anytime is runtime for some part of the system. For some
> > objects a compiled version exists (and thus they are executable).
> > Others consist only of source code and a notice that says "I need to be
> > compiled before use". Still other objects cannot be compiled (a stack,
> > for example).
>
> What would you say the ramifications of this point of view are?

Generally speaking, a more interactive environment. See the GUI builder thread
for an example involving the editor. The compiler would benefit too. It can
easily evaluate expressions that rely on other (already compiled) parts of the
system. There is no link time in the usual sense. Instead, work spaces
(containing source as well as compiled code) are merged and parts of them get
recompiled as needed. Besides enhanced flexibility, the developer will notice
increased speed. An example: If I compile an application written in Centura on
my system at home (100 MHz Pentium, 32 MB core) it takes 10 seconds to produce
a 400 KB executable from the source file. If I do a similar thing on my PC at
work (VC++ on 350 MHz PII with 64 MB core) it takes 2 minutes to produce a 200
KB DLL. If you do that dozens of times a day, it adds up.

> <snip - separate message>
>
> >> This is one reason why I try not use the word class anymore except when
> >> using the traditional connotation of a unified type and implementation.
> >> It can confuse the situation.  Some people and languages have tried to
> >> use class to refer to impl only, but I don't think that's a good idea,
> >> since it's confusing.
>
> I admit this criticism can be levelled against 'type' too, since it was
> used to represent combined type/impls in non-OOPLs.  I think that the
> word better represents the idea of an interface than class, plus it is
> less entrenched now OOPs are common.  Concepts like "strong typing",
> "type systems", etc. really only refer to the interface.  But it is a
> bit arbitrary.

Well sure - but "class" is so much more concise than "interface" or
"implementation". It's a pity. Let's use "type" and "impl" if we feel the urge
to stress the difference, agreed?

> > Good idea. Maybe you want to give your _definition_ of type, impl and
> > signature, especially where one begins and the other ends.
>
> Well a type defines the interface.  Signature is a word I usually use
> for methods but I'd say it's the same as interface.  Perhaps the only
> difference between type and interface/signature is that type gives an
> interface and name, and if you have name typing rather than structural
> typing, two interfaces can be the same yet have a different type.
>
> An implementation is the code that obeys a type's interface.  They are
> semantically and syntactically different entities.

Where do you see a syntactical difference? Why *can't* both be mixed? Maybe
they *shouldn't* be mixed, but that is a matter of style. And if they can't be
mixed, how do we avoid the overhead of having two entities describing the same
thing? Some programmers wouldn't like having to first define the interface,
then a separate impl. Remember, there are lots of cases where this simply is
overkill: anything that is not intended to be derived from does not need an
interface.
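For concreteness, the split we are debating can be sketched in C++ (names are
mine): the "type" is a pure interface, an "impl" is one of possibly many
classes obeying it. The overhead I mean is exactly the second declaration.

```cpp
#include <cassert>

// Sketch of the type/impl split under discussion (invented names).
struct Counter {                     // the type: interface only, no code
    virtual int next() = 0;
    virtual ~Counter() = default;
};

struct StepCounter : Counter {       // one impl among possibly many
    int value = 0;
    int step  = 1;
    int next() override { return value += step; }
};
```

A client coded against Counter never learns which impl it is driving; that is
the whole point of the split, and also why it feels like overkill for classes
nobody will ever derive from.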

> > For
> > example, if the compiler detects some property of a function (e.g.
> > "has no side effects"), but the user did not _demand_ that, does this
> > belong to type?
>
> Essentially an impl will always have the same interface as its type,
> there's no point it having more since it can never be used.  The type
> embodies all the interface definition.

I don't agree completely here. I'd rather say "since it can never be used as
far as the outside world is concerned". But it might be useful to know about
that property for impl-internal use.

I also think we should introduce a notion for the information a compiler needs
to actually produce code - namely: is it a reference type; is it a binary
type, and if so, how big is it; is it a function, and if so, what are its
parameters and return value? The code generator simply needs that. I wouldn't
delegate code generation to the linker (especially since I believe we can do
without a linker), and otherwise the linker would need that information too.
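A hypothetical sketch of that per-item record (every name here is invented,
this is just to pin down what "the code generator simply needs" would contain):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Invented sketch: the minimal record a code generator needs about each
// item - reference vs. inline binary type, its size, and, for functions,
// the parameter and return types.
struct CodegenInfo {
    bool is_reference = false;            // reference type vs. binary type
    std::size_t byte_size = 0;            // only meaningful for binary types
    bool is_function = false;
    std::string return_type;              // empty unless is_function
    std::vector<std::string> param_types; // empty unless is_function
};
```

Note that nothing in this record mentions the interface at all; it is pure
impl information, which is why I would keep it separate from the type.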

> Deriving things like no side effects would require looking at a piece of
> code in an impl.  For a start there might be more than one impl to look
> at.  Not an insurmountable problem.

No problem, actually, since each impl must be compiled before use. It is
checked then. Because it inherits from the type, it gets flagged as "need to
be compiled" if the type changes significantly (i.e. visibly to the outside
world).

> But then you might want to subtype the type.  If it was typed as
> side-effectless, a subtype would be forced to obey this, which might not
> be desired.

You'd have to change inheritance relationships then. Subtypes cannot have
weaker postconditions. But I see your point. How about having the compiler
flag such a condition for information purposes (as a comment), and letting the
user choose whether he wants to export the condition (i.e. activate it for
inheritance purposes)?

> ...

> Which brings up the question about possible access control in the
> editor.  This would probably be on a per module basis - every module
> would have certain people allowed to change it.  Hence if you weren't
> allowed to change the type, the solution above would appear on the menu
> greyed out.

I'd leave that to a versioning tool which can handle it on a per module basis.
A reasonable versioning tool needs a lot of work to set up properly, most
likely too much for the ordinary programmer.

> > If one implementation uses a get call where the other
> > allows direct (read) access, do they have different type? Or different
> > signature?
>
> Rather than working from impl to type I work the other way around (since
> impls can be added to the system), but the basic answer is they are the
> same.
>
> If the type dictates read-write the impl must provide a get/set or a
> var.  Or just a get or a var for a read-only, and similar for
> write-only.
>
> Since whether the impl uses get/sets or a variable is transparent to the
> client, it is not a part of the type, since the type constrains subtypes
> and implementations to its interface.

Yet it needs different code. It's still not clear to me how you would handle
this efficiently without knowing about the impl.
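To make my difficulty concrete, here is a hedged sketch (invented names): both
impls satisfy the same read-only type, one storing the value directly and one
computing it on demand, and a client coded against the type cannot tell which
code it is exercising - yet the compiler must emit different code for each.

```cpp
#include <cassert>

// Sketch: two impls of one read-only type; the get-vs-variable choice
// is invisible to clients of the type, yet compiles to different code.
struct ReadableX {
    virtual int x() const = 0;
    virtual ~ReadableX() = default;
};

struct VarImpl : ReadableX {         // "direct access": just a stored field
    int stored = 7;
    int x() const override { return stored; }
};

struct GetImpl : ReadableX {         // "get call": value computed on demand
    int base = 3;
    int x() const override { return base + 4; }
};
```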

> >> Well if you want to define non-overridable methods then yeah, you can,
> >> but this defeats the spirit of OOP.  You should always have the
> >> opportunity to override.  I have never come across a use of non virtual
> >> methods other than as a kludge to avoid dynamic dispatch.  Flexibility,
> >> runtime-efficiency, compile-time efficiency, choose two.
> > To allow non-overrideables comes really cheap and has no adverse
> > performance impact. And it enhances expressiveness by adding another
> > orthogonal facility (whether that is needed often is another question).
> > So why not implement it?
>
> Well I'll explain a bit more about that in a second, but I will point
> out now that there are actually some simple static dispatch situations.
> If you can find a type A with only one impl z then you can always static
> dispatch to it.  This includes allowing an unlimited number of subtypes
> of A as long as for A and its subtypes there's only one impl that
> exists.  This might be an impl of A or any of its subtypes.
>
> Admittedly, again this is usually an inter-class optimisation thing
> since you can't guarantee there won't be a subtype/impl unless you know
> all the code that can be loaded.  And runtime loading/generation makes
> it even harder.

Yeah. This is exactly why I don't like it.
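For the record, the single-impl case you describe does have a spelling in
today's languages; in C++ one can sketch it with "final", which is how the
programmer proves to the optimizer that no further impl can appear (a sketch
only, not a claim about any particular compiler's behaviour):

```cpp
#include <cassert>

// Sketch: if a type provably has exactly one impl, every call through
// the type can be bound statically. 'final' is how C++ spells the proof.
struct A {
    virtual int f() const = 0;
    virtual ~A() = default;
};

struct OnlyImpl final : A {          // no subtype/impl can ever be added
    int f() const override { return 42; }
};
```

But note that this just restates the problem: the proof only holds because
loading further code is ruled out, which is the assumption I dislike.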

> > If the class author wants to make sure that some service is called
> > from a particular method, how can he enforce that otherwise? Even if
> > the overriding function sticks to the contract, the internal
> > _semantics_ cannot be checked by the compiler in such detail.
>
> C-style non-virtual does not enforce this - if you convert the pointer
> to a subtype and dispatch you still get different code.

Sure, but then you explicitly demanded that. And the subtype must be visible
to the compiler (and therefore to the programmer as well), so he knows what he
is doing. I don't want to prevent the class author from using subtypes'
methods; rather, I want to be able to prevent a subtype's author from changing
the base class's impl.
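In C++ terms (a minimal sketch), this is exactly what a non-virtual call gives
the base author: a subtype can introduce its own method of the same name, but
it cannot redirect the calls inside the base class's own code.

```cpp
#include <cassert>

// Sketch: Base::run calls Base::helper non-virtually, so a subtype's
// same-named method hides it for new code but cannot hijack Base::run.
struct Base {
    int run() { return helper() + 1; }   // always uses Base::helper
    int helper() { return 10; }
};

struct Derived : Base {
    int helper() { return 999; }         // hides, does not override
};
```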

>  What's the deal
> with that?  The only thing that can do this is a Java-style final which
> does prevent overrides.  But this has its own problems.

Can you name some of these problems?

> >> Yes but normally the B will physically contain an a, hence no matter
> >> whether you have an A or B, you can look inside the object to find the
> >> a.  With composition, you don't know this.  If it's impl A, you can,
> >> if impl B, you have to follow a pointer.  Then problem is that you're
> >> no longer extending the layout.
> > IMO B contains an A rather than an a (which is an int, BTW, we must
> > not confuse types and names here). That's what the user specifies
> > using cMI, and that's what he gets. If the member ab (which we're
> > talking about) were specified as transparent, he could omit the
> > qualification "ab.". At the source level it would look like tMI and
> > the compiler would take care of the derefencing (from the
> > qualification) automagically. The code would (maybe) be different,
> > yes, but who cares?
>
> I understand the names could be transparent, when I was referring to the
> _generated code_ rather than the source.  The generated source would no
> longer involve a method dispatch, but would have to involve an if
> statement that looked at the type tag to determine the layout and where
> to go.

I must admit I don't understand this. Maybe because I'm assuming that the
compiler knows about the implementation of all the base classes? Please
explain.

> If classes were added to the system at runtime, this wouldn't work
> anymore  (although, come to think of it, neither might MI layout
> algorithms ...)

Though I don't understand the details you're talking about (see above), this
sounds as if there could be a serious problem...

> >> You could generate if/then code for each access, but that's going to
> >> cause very bloated code.
> >> If you had get/sets for all fields, you could direct it to the correct
> >> place, but this introduces a dynamic dispatch where there previously
> >> would not have been one.
> > I don't see this.
>
> Again, at the generated code level.
>
> > Well there's the old (maybe far-fetched) example of the superclass
> > that can't be recompiled. Maybe we choose to put stuff (superclasses)
> > into DLLs. Those DLLs would be shared by many applications. Do you
> > really want to have different DLLs with different instance layouts
> > because some users decided to change their code?
>
> Yes that's true.  Under this system you would essentially need the app
> to recompile to work with a new version of the DLL with new layout.
> However I think this wouldn't be much different to how it now works with
> (say) VBRUN.  They probably wouldn't allow changing type layouts either!

The problem is that instance layouts (I assume your "type layout" is the same
thing) can get changed by simply adding some overridden method. IMO the
problem is not the interface; it's the impl.

> So this being said, knowing we need some sort of fixed layout, if we
> wanted to improve the situation without a recompile of the DLL client we
> would probably need to put in a level of indirection like the offset
> table I proposed.

That could work but look at the performance hit... In effect you are doing
away with static dispatch. No optimisation in the world could offset this.

> > Maybe this seems far-fetched, but if I look at the stuff I'm working
> > with at the moment (Microsoft's ATL mixed with STL and own classes),
> > I can tell you: Inheritance relationships there are a nightmare. Many
> > classes have lots of base classes, all of them templatized of course
> > (and mostly pure virtual).
>
> How would you improve the inheritance hierarchy?  What sort of
> superfluous parent classes exist?  I've always found using a few mixins
> makes things very flexible if a little harder to visualise.

A few mixins may be OK, but a dozen? If a class structure is too fine-grained,
that does not necessarily improve its understandability.

> >> I don't think the pointer would change, just the typing aspects.  If
> >> you have a fixed layout the pointer wouldn't need to change.
> > I wanted to know for sure, so I tried. VC++ 6.0 definitely changes the
> > pointer if you cast to a base class:
>
> Interesting.  C++ has no GC usually, so that problem does not apply to
> it, although what if you tried to deallocate the B* ?.

I don't know (and I'm not sure whether I really want to know...). Most likely
it falls to pieces. And where does GC affect this? Do you mean because the
pointer isn't pointing to the start of the object any more?
Well, you can simply write &(object.member) and you always get a pointer
pointing somewhere inside an object.

> And it looks like it doesn't care about object identity comparisons,
> which, coming from C, does not surprise me.

In C, identity comparisons are not guaranteed to work if you compare pointers
to different classes. There is also the alignment issue: if you cast a (char*)
to an (int*) you might end up with a different pointer value, because ints are
aligned to 4-byte boundaries while chars are "aligned" to 1-byte boundaries.
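The VC++ behaviour I observed is easy to reproduce. On typical layouts (the
language standard makes no promise about addresses, so treat this as a sketch)
the second base subobject sits at a non-zero offset, and the cast adjusts the
pointer:

```cpp
#include <cassert>

// Sketch: casting to the second base of a multiply-inheriting class
// typically yields a different address, because that base's subobject
// is laid out at an offset inside the complete object.
struct A { int a = 1; };
struct B { int b = 2; };
struct C : A, B {};
```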

> > How could it be otherwise? If you can show an example, I'd be very
> > interested to learn about it.
>
> Well I was referring to algorithms where the layout is fixed at compile
> time.  Basically each mixin gets a fixed position it always sits at, so
> you can find the mixin from an object pointer no matter its type.  You
> possibly get holes, which could be rather big.
>
> A layout table for each mixin for each impl might work to avoid this,
> similar to a dispatch table.  In the same way, obvious efficiency
> disadvantages apply, but can be optimised.

How would you deal with non-repeated inheritance? There you get *one* mixin
type at *two* different offsets. This case isn't as rare as one might think:
just take ordinary MI where both base classes are derived from "object" and
you have it.

> > Maybe it also depends on personal style. If I write a class without
> > "virtual", I can be pretty sure that noone unintentionally breaks my
> > code by overridding with an inappropriate replacement. It's sort of
> > encapsulation.
>
> This is purely due to the absence of multiple implementations.  Hence
> you can't write another implementation which uses delegation objects
> instead and gets around those problems.  So it's kind of encapsulation
> through language strangulation.

Different impls may behave differently in this respect. Where one may be
immune to overriding, another might encounter a problem. So from a practical
point of view I wouldn't generally prevent overriding for the *type* but
rather for a particular *impl*. Now see how this can be reconciled ...

> But anyway, one option would be to put a "final" on types (and you can
> only have one impl), which would avoid getting around it via
> delegation.  This is essentially moving the Java solution into the
> multi-impl world.  However, I don't like this at all, I've encountered
> final classes I really want to override before.

Putting "final" on a type may really be strangulation. But I don't demand
this. I just want to make sure that *some* of my methods don't get overridden
if called from code within *my* class. What the subtype designer does in his
class is not my business, I won't interfere. He may override as often as he
wishes, but he should please not affect my code.

> Often what the author really wants is to restrict overriding of their
> implementation.  Hence they won't care if you write a whole new
> implementation.

That really would break the spirit of OOP. No one would want to restrict
overriding as long as it's *safe*. That's the point.

> Other sorts of override modifiers are possible as well - prefix only,
> suffix only, surround only, etc.  CLOS does something like this with its
> overriding.

This would be useful for message handlers. Often they simply add their own
code to the standard impl.

> Can you think of any examples where you've wanted to use final?

Not really, but maybe that's just because I haven't written many classes that
are intended to be used by others. To a great extent I'm feeling uncomfortable
with your opinion because it reduces (not to say destroys) locality of code.
If you can no longer tell how something might work by looking at it and the
parts it expressly uses, because you can't prevent some unknown client from
changing its operation (for himself and his clients) in some arbitrary manner,
then you'd have to consider a lot more code to find bugs. I consider that a
huge drawback.

> The situation I've discovered is you have certain methods, but you only
> want to call them in a certain way, order, etc, so you make the driver
> method final.
>
> One language enhancement that would therefore be quite useful would be
> the ability to override a method, but not to be able to call it.
> Abstract methods could be filled in quite fine by the subimpl, but the
> only way you could get to run those methods would be through actually
> calling the driver method.

Can you give an example? I think I understand what you mean, but I can't think
of a situation where I would want to prevent the programmer from calling his
own methods.
BTW, a user could easily bypass this restriction if he wants to: he simply
writes a method with an innocent name (one that is not restricted) and calls
that from the restricted method (which he is entitled to override). And since
he may call his own method, which does the same as the restricted one, voilà...

> Although you probably want to leave final on the impl just to show it
> should be run, this would stop delegation in its tracks in this
> situation.
>
> In fact, the "its either private or totally public" stance of most
> languages these days is one of my dear issues, but that's another story.

That definitely is a problem, but I see no reason why we shouldn't change
this. The Eiffel approach to this looks pretty flexible to me, for example.

> > I really don't see the problem. If you want it virtual, you hit "Go to
> > definition"  and insert the word "virtual". Nothing else needs to
> > change. If the editor is user-friendly, it's a matter of about 5 mouse
> > clicks and 7 or 8 keystrokes.
>
> The problem is the parent impl and subimpl might not be written by the
> same person.  The parent impl might be obfuscated, or beyond your
> access.  And there is often no reason for the keyword to be there in the
> first place.  If you shouldn't be overriding the method, ok, but it can
> often be done just for efficiency.  Non-overridable methods are the
> exception rather than the rule in my experience.

If you don't have access to the parent impl, how can you be sure that your
overriding method does everything correctly? If the parent impl author marked
it explicitly as "virtual", however, you can be pretty sure that he took your
possible overriding into account. IMO it is better design if the parent
author provides virtual "hooks" that you may or may not override (he may
provide a default implementation). This way he can make sure that his vital
code always gets executed.
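A minimal sketch of the hook style I mean (invented names): the parent keeps
control in a non-virtual driver and exposes a virtual hook, so overriding the
hook can never skip the vital code around it.

```cpp
#include <cassert>

// Sketch: the parent's driver is the only entry point; the hook is the
// only thing the subtype touches, so the surrounding code always runs.
struct Task {
    int run() { return step() * 2; }   // vital code wraps the hook
    virtual ~Task() = default;
protected:
    virtual int step() { return 1; }   // default hook, may be overridden
};

struct MyTask : Task {
protected:
    int step() override { return 3; }
};
```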

Another possibility for the parent impl author is to restrict access to all
critical parts of the system. Maybe that is even better, because (if designed
right) you can easily see where an override might possibly occur. If designed
right, it's more flexible as well, but optimisation will be more difficult.

> > IMO the same thing holds true for a cMI approach. Because the task
> > is clear and the language has a feature ("transparent") which makes it
> > transparent to the source code, it can easily be done by the compiler.
> > Maybe you got some more options to choose among (Repeated inheritance
> > resolution, for example). If the compiler _asks_, at least the
> > potential problem of rep.inher. is communicated to the programmer.
>
> Well essentially stopping with errors is the compiler's way of asking.
> The programmer should know the best way to fix the program - the
> compiler might well not.

In a lot of cases the compiler should know the alternatives, e.g. "insert the
declaration or leave it be". It's far less work for the user to hit a button
than to look up the error location, find the function header where the
declaration should be inserted, remember (or guess) and write down the
necessary type and name, and in the next compile cycle correct the typo he
made in that declaration if it was not his day. Well, for the time being this
is just a vision like so many other things, but one that I like...

> > IMO current compilers are not interactive enough. That's perfectly
> > understandable; noone wants to be asked the same questions over and
> > over again for each compilation because the compiler is not allowed to
> > change the source (to record that this warning has already been issued,
> > or example).
>
> Do you think there's anything that could be done better than my way of
> handling this (general problem)?  That is, issue warnings or errors to
> the user and let them right-click on the error to select a solution.  It
> should be easier to solve since the errors are centralised in the errors
> view rather than throughout the program.  I've tried to make it as easy
> as possible without actually making the compiler change the program.

The error *log* would always be a separate thing. And that surely is the place
where the user selects his solution. For his solution to take effect however
the source needs to be changed. If you leave that task to the programmer,
where's the innovation compared to the current way of handling it?

> > Oh I see. Maybe you should call that "coding time" to avoid confusion
> > with types (the interface thing). Well, that's exactly what
> > transparent names are for: To relieve you from typing the extra
> > qualification. If you declare Object2 as transparent, you may write
> > either version. Of course there must not be an ambiguity. Object may
> > not contain a name "Method", for example. Eiffel uses renaming in such
> > cases.
>
> Yes, I'm pretty sure I said that it was similar to transparent names in
> a previous post.  The only difference is that you can update methods as
> if they were fields.

By "update" do you mean "assign a value to them"? If a method returns a
reference to the corresponding item, that return value can surely be used as
the left-hand side of an assignment, but then why use a method at all instead
of directly accessing the item, which would be much faster?
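In C++ the "update a method like a field" reading would look like this sketch;
it works, but as I said it is just an accessor with extra steps:

```cpp
#include <cassert>

// Sketch: a method returning a reference can stand on the left-hand
// side of an assignment, making it look like direct field access.
struct Box {
    int stored = 0;
    int& value() { return stored; }
};
```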

> > Sure, but if it occurs at compile time and is resolved then (constant
> > folding), who cares?
>
> All I'm really disputing is the use of "constant folding".  My understanding
> of the term was it was taking a function with constant parameters and
> working out a constant result.  Field access could be seen as this.  But
> simply knowing the location at compile-time does not come under that
> definition, although you could still bring it under the heading of
> "constant determination" or whatever term you want to dream up.  But
> maybe I'm wrong ...

For me, "constant folding" means that the compiler can determine (somehow) the
value of that constant. Often this may involve a function call, like 1+3 or
ClassOf (x), where the parser knows the type of x but not the value. But it
could just as well be a simple lookup of a constant.

Maybe constant determination is the better term. If the meaning is clear, I
don't mind the name.
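Whatever we call it, modern C++ happens to make this kind of compile-time
evaluation explicit, which may serve as a neutral illustration (a sketch):

```cpp
#include <cassert>

// Sketch: folding "a function call like 1+3" at compile time; the
// static_assert only compiles if the result is a compile-time constant.
constexpr int add(int a, int b) { return a + b; }
constexpr int folded = add(1, 3);
static_assert(folded == 4, "evaluated entirely at compile time");
```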

Regards,

Hans-Dieter Dreier