How is C++'s multiple inheritance implemented?


Single inheritance is easy to implement. For example, in C, inheritance can be simulated as:
struct Base { int a; };
struct Descendant { struct Base parent; int b; };
But with multiple inheritance, the compiler has to arrange several parents inside the newly constructed class. How is that done?
The problem I see is: should the parents be laid out as AB or BA, or in some other way? And then, if I do a cast:
SecondBase * base = (SecondBase *) &object_with_base1_and_base2_parents;
the compiler must decide whether or not to alter the original pointer. Similar tricky things are required with virtuals.

The following paper from the creator of C++ describes a possible implementation of multiple inheritance:
Multiple Inheritance for C++ - Bjarne Stroustrup

There was this pretty old MSDN article on how it was implemented in VC++.

And then, if I do a cast:
SecondBase * base = (SecondBase *) &object_with_base1_and_base2_parents;
The compiler must consider whether to alter or not the original pointer. Similar tricky things with virtuals.
With non-virtual inheritance this is less tricky than you might think: at the point where the cast is compiled, the compiler knows the exact layout of the derived class (after all, the compiler did the layout). Usually all that happens is that a fixed offset (which may be zero for one of the base classes) is added to or subtracted from the derived class pointer.
With virtual inheritance it can be a bit more complex: it may involve fetching an offset from a vtbl (or similar).
Stan Lippman's book, "Inside the C++ Object Model" has very good descriptions of how this stuff might (and often actually does) work.

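Parents are typically arranged in the order in which they're specified (the standard leaves this unspecified, but mainstream compilers do it this way):
class Derived : A, B {}; // A comes first, then B
class Derived : B, A {}; // B comes first, then A
Your second case, the cast, is handled in a compiler-specific manner, normally by adding a fixed offset to the pointer. Pointers that are larger than the platform's pointer size, storing extra adjustment data, are commonly used only for pointers to member functions.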

This is an interesting issue that really isn't C++ specific. Things get more complex also when you have a language with multiple dispatch as well as multiple inheritance (e.g. CLOS).
People have already noted that there are different ways to approach the problem. You might find reading a bit about Meta-Object Protocols (MOPs) interesting in this context...

It's entirely down to the compiler how it is done, but I believe it's generally done through a hierarchical structure of vtables.

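I performed a simple experiment:
#include <stdio.h>
class BaseA { int a; };
class BaseB { int b; };
class Descendant : public BaseA, BaseB {};  // BaseB is a private base (the class default)
int main() {
    Descendant d;
    BaseB * b = (BaseB*) &d;              // upcast: the address is adjusted to the BaseB part
    Descendant *d2 = (Descendant *) b;    // downcast: the adjustment is undone
    printf("Descendant: %p, casted BaseB: %p, casted back Descendant: %p\n", (void*)&d, (void*)b, (void*)d2);
}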
Output is:
Descendant: 0xbfc0e3e0, casted BaseB: 0xbfc0e3e4, casted back Descendant: 0xbfc0e3e0
It's good to realise that a static cast does not always mean "change the type without touching the value". (When the data types do not fit each other, the value is of course also converted, but that is a different situation IMO.)

Related

Trying to understand what is happening here with static_cast?

I'm working on an old C++ project, in the source there are two lines:
memcpy( static_cast<PLONADDRESS>(this), pa, sizeof(LONADDRESS) );
memcpy( static_cast<PLONIOFILTER>(this), pf, sizeof(LONIOFILTER) );
this points to an object of type CLonFilterUnit, which is derived from two public base classes:
class CLonFilterUnit : public LONADDRESS, public LONIOFILTER
PLONADDRESS is:
typedef LONADDRESS* PLONADDRESS;
PLONIOFILTER is:
typedef LONIOFILTER* PLONIOFILTER;
pa is of type PLONADDRESS and pf is of type PLONIOFILTER.
What I don't understand is how the same base address is used as the destination in both memcpy instructions? Is this permitted due to the way static_cast works?

When a class is derived from multiple base classes, those classes can be thought of as subobjects of the derived class: the derived object has a base1, base2, ..., baseN part. When you static_cast a pointer to the derived class to a pointer to one of its base classes, the cast adjusts the pointer so that it points to the correct base subobject. You can see that with this little example:
#include <iostream>
struct foo
{
    int a;
};
struct bar
{
    int b;
};
struct foobar : foo, bar {};
int main() {
    foobar f;
    std::cout << static_cast<foo*>(&f) << "\t" << static_cast<bar*>(&f);
}
output:
0x7ffe250056c8 0x7ffe250056cc
I would also like to point out that if your class is not trivially copyable, the code has undefined behavior, because memcpy requires trivially copyable types.

static_cast does the necessary address adjustment.
The code (with memcpy, uppercase names, typedefs of pointers) is a great example of how to absolutely not do things. Maybe it's used as an example in a lecture series about how to lose your job quickly.

This is an example of some extremely dubious C++ code. I can hardly imagine why this code would be necessary - most likely, it is not.
However, to answer the question as asked, the way multiple inheritance works in C++ is by having different 'subobjects' of different base classes within the derived class. Those subobjects do not have the same address. By using static_cast on this, you select one subobject or another, and results of static_cast yield different addresses.

C++ casting oddness

Let's use this simple class hierarchy:
class A
{
public:
    virtual void Af() {}
};
class IB
{
public:
    virtual void Bf() = 0;
};
class C : public A, public IB
{
public:
    virtual void Bf() {}
    void Cf() { printf("Cf"); }
};
And now some tests I have done, trying to understand static_cast and dynamic_cast:
1) C* c = new C();
2) A* a = static_cast<A*>(c);
3) IB* ib = static_cast<IB*>(c); // ib gets a different pointer than c because the IB vtable is assigned
4) A* correctA = static_cast<A*>(static_cast<C*>(ib)); // Correct, but I must cast first to C and then to A from the interface
5) A* incorrectA = static_cast<A*>(ib); // Compiler error
6) A* correctA2 = dynamic_cast<A*>(ib); // Correct result
Now, some questions:
1) I have started to code in C++ again after moving to C# about 5 years ago. I'm surprised by the value of the "ib" variable in number 3. I expected it to be the same pointer as "c", but instead the cast seems to assign the address of the IB vtable within "c".
2) Why must I cast first to C* and then to A* in number 4 to get a correct value? This makes polymorphism useless in this case, because I want to cast from the interface to the base type without knowing the real type. Number 5 shows that this is not possible with static_cast (I guess it checks the inheritance tree and concludes that IB is not related to A, although they really are related at run time).
3) Number 6 gets a correct value into correctA2. I guess it does this correctly, as I explain in question 2, because this can be resolved only at run time.
Could you explain this kind of behaviour a bit and confirm my guesses? It is hard to come back from C# to C++ :D.
Cheers.

It looks like you may be trying to write C# in C++ in which case I suggest just sticking with C#. However I'll try to answer your questions:
1) (Note that these are implementation details, though they are probably right on most systems.) In a multiply inherited derived class, an implementation will typically keep one virtual table pointer per polymorphic base subobject, each at the start of its subobject. In this case a C would start with the A part and its vtable pointer, followed by the IB part and its vtable pointer. If you tried to use the derived pointer as an IB without changing its address, the IB would be using the A class's vtable, resulting in havoc. Thus, the compiler fixes up the address for you.
2) This is just the way the language says static_cast works: it converts between parent and child types, plus a few other relationships such as different integral types. dynamic_cast is needed to traverse sibling relationships directly.
3) Correct, since dynamic_cast offers more flexibility for polymorphic conversions you can use it to convert between a sibling relationship.
I should make a closing remark that using multiple inheritance in C++ to provide an implementation to an interface is not a common pattern. There may be alternate approaches if you ask your real question.

A static_cast requires there to be a single compile time relationship between the types that is more direct than any other relationship.
Imagine you had also defined
class D : public IB, public A
The relationship between A and IB through D would be no more and no less direct than through C. A static_cast can use the fact that the most direct relationship between IB and C is IB as a base class of C, and the fact that the most direct relationship between C and A is A as a base class of C. But the relationship between IB and A through C cannot be known to be the most direct compile-time relationship, so static_cast can't use it (but dynamic_cast can use it as the only available run-time relationship).

What is the order of fields in an inherited class?

struct A {int a;};
struct B : public A {int b;};
B b;
Does the Standard guarantee how are the fields ordered in memory i.e. does a come before b and is there any padding?
Use case. I have class Specification and class Command. The command objects include a specification in them, as well as some additional information. I want to be able to use an object of either type when a Specification is required.

You asked:
Does the Standard guarantee how are the fields ordered in memory i.e. does a come before b and is there any padding?
The standard does not make any such guarantee. It leaves those details to the implementation. From the C++ Draft Standard (N3337):
10 Derived Classes
5 The order in which the base class subobjects are allocated in the most derived object (1.8) is unspecified. [ Note: a derived class and its base class subobjects can be represented by a directed acyclic graph (DAG) where an arrow means “directly derived from.” A DAG of subobjects is often referred to as a “subobject lattice.”
Base
^
|
Derived1
^
|
Derived2
6 The arrows need not have a physical representation in memory. — end note ]
While the standard does not guarantee it, I haven't seen an implementation that does not do what you are expecting. There may be some in the wild but I haven't seen them.

This sounds like what you want is polymorphism: either make Specification the base class or introduce an interface (an abstract base class), whichever fits your use case better. That is pretty much what you're doing already:
class Specification {
public:
    int a;
};
class Command : public Specification {
public:
    // int a; is essentially here, inherited from Specification
    int b;
};
// later on:
Specification *mySpec = new Specification();
mySpec->a = 5; // valid
delete mySpec;
mySpec = new Command();
mySpec->a = 60; // valid as well
As long as you're working with pointers to the base class, you're basically going "from top to bottom" when looking up members. It's a bit more complicated than that, but since you're using C++, you shouldn't cast between unrelated types and hope that the members are aligned the same way. If you need some way to read or write the classes to or from files, create a member function that does the serialization in a clearly defined way.

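Does 'a' come before 'b'? In practice yes, on every mainstream implementation, although the standard does not promise it.
Is there any padding? That depends on the compiler, pragmas, and alignment settings, and how packing and alignment are configured depends on the compiler or IDE (although C++11 added the alignas and alignof keywords).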

What is the advantage of using dynamic_cast instead of conventional polymorphism?

We can use Polymorphism (inheritance + virtual functions) in order to generalize different types under a common base-type, and then refer to different objects as if they were of the same type.
Using dynamic_cast appears to be the exact opposite approach, as in essence we are checking the specific type of an object before deciding what action we want to take.
Is there any known example for something that cannot be implemented with conventional polymorphism as easily as it is implemented with dynamic_cast?

Whenever you find yourself wanting a member function like "IsConcreteX" in a base class (edit: or, more precisely, a function like "ConcreteX *GetConcreteX"), you are basically implementing your own dynamic_cast. For example:
class Movie
{
public:
    // ...
    virtual bool IsActionMovie() const = 0;
};
class ActionMovie : public Movie
{
public:
    // ...
    virtual bool IsActionMovie() const { return true; }
};
class ComedyMovie : public Movie
{
public:
    // ...
    virtual bool IsActionMovie() const { return false; }
};
void f(Movie const &movie)
{
    if (movie.IsActionMovie())
    {
        // ...
    }
}
This may look cleaner than a dynamic_cast, but on closer inspection, you'll soon realise that you've not gained anything except for the fact that the "evil" dynamic_cast no longer appears in your code (provided you're not using an ancient compiler which doesn't implement dynamic_cast! :)). It's even worse: the "self-written dynamic cast" approach is verbose, error-prone and repetitive, while dynamic_cast will work just fine with no additional code whatsoever in the class definitions.
So the real question should be whether there are situations where it makes sense that a base class knows about a concrete derived class. The answer is: usually it doesn't, but you will doubtlessly encounter such situations.
Think, in very abstract terms, about a component of your software which transmits objects from one part (A) to another (B). Those objects are of type Class1 or Class2, with Class2 is-a Class1.
Class1
^
|
|
Class2
A - - - - - - - -> B
(objects)
B, however, has some special handling only for Class2. B may be a completely different part of the system, written by different people, or legacy code. In this case, you want to reuse the A-to-B communication without any modification, and you may not be in a position to modify B, either. It may therefore make sense to explicitly ask whether you are dealing with Class1 or Class2 objects at the other end of the line.
void receiveDataInB(Class1 &object)
{
    normalHandlingForClass1AndAnySubclass(object);
    if (typeid(object) == typeid(Class2))
    {
        additionalSpecialHandlingForClass2(dynamic_cast<Class2 &>(object));
    }
}
Here is an alternative version which does not use typeid:
void receiveDataInB(Class1 &object)
{
    normalHandlingForClass1AndAnySubclass(object);
    Class2 *ptr = dynamic_cast<Class2 *>(&object);
    if (ptr != 0)
    {
        additionalSpecialHandlingForClass2(*ptr);
    }
}
This might be preferable if Class2 is not a leaf class (i.e. if there may be classes further deriving from it).
In the end, it often comes down to whether you are designing a whole system with all its parts from the beginning or have to modify or adapt parts of it at a later stage. But if you ever find yourself confronted with a problem like the one above, you may come to appreciate dynamic_cast as the right tool for the right job in the right situation.

It allows you to do things which you can only do to the derived type. But this is usually a hint that a redesign is in order.
struct Foo
{
    virtual ~Foo() {}
};
struct Bar : Foo
{
    void bar() const {}
};
int main()
{
    Foo * f = new Bar();
    Bar* b = dynamic_cast<Bar*>(f);
    if (b) b->bar();
    delete f;
}

I can't think of any case where it's not possible to use virtual functions (other than such things as boost::any and similar "lost the original type" work).
However, I have found myself using dynamic_cast a few times in the Pascal compiler I'm currently writing in C++. Mostly because it's a "better" solution than adding a dozen virtual functions to the baseclass, that are ONLY used in one or two places when you already (should) know what type the object is. Currently, out of roughly 4300 lines of code, there are 6 instances of dynamic_cast - one of which can probably be "fixed" by actually storing the type as the derived type rather than the base-type.
In a couple of places, I use things like ArrayDecl* a = dynamic_cast<ArrayDecl*>(type); to determine that type is indeed an array declaration, and not someone using a non-array type as a base, when accessing an index (and I also need a to access the array type information later). Again, adding all the virtual functions to the base TypeDecl class would give lots of functions that mostly return nothing useful (e.g. NULL), and aren't called except when you already know that the class is (or at least should be) one of the derived types. For example, getting to know the range/size of an array is useless for types that aren't arrays.

No advantages really. Sometimes dynamic_cast is useful for a quick hack, but generally it is better to design classes properly and use polymorphism. There may be cases when due to some reasons it is not possible to modify the base class in order to add necessary virtual functions (e.g. it is from a third-party which we do not want to modify), but still dynamic_cast usage should be an exception, not a rule.
An often used argument that it is not convenient to add everything to the base class does not work really, since the Visitor pattern (see e.g. http://sourcemaking.com/design_patterns/visitor/cpp/2) solves this problem in a more organised way purely with polymorphism - using Visitor you can keep the base class small and still use virtual functions without casting.

dynamic_cast is needed for a downcast from a base class pointer when the member function you want exists only in the derived class and not in the base class. Beyond that it offers no advantage: it is simply a safe way to downcast when there is no virtual function in the base class to override. Check the return value for a null pointer. You are correct that it is used where no virtual function is available.

Can 'this' pointer be different than the object's pointer?

I recently came across this strange function in some class:
void* getThis() {return this;}
And later in the code it is sometimes used like so: bla->getThis() (Where bla is a pointer to an object of the class where this function is defined.)
And I can't seem to realize what this can be good for. Is there any situation where a pointer to an object would be different than the object's this (where bla != bla->getThis())?
It seems like a stupid question but I wonder if I'm missing something here..

Of course the pointer values can be different! Below is an example which demonstrates the issue (you may need to use derived1 instead of derived2 on your system to get a difference). The point is that the this pointer typically gets adjusted when virtual or multiple inheritance is involved. This may be a rare case, but it happens.
One potential use of this idiom is to restore objects of a known type after storing them as void const* (or void*; const correctness doesn't matter here): with a complex inheritance hierarchy, you can't just cast any odd pointer to void* and hope to restore it to its original type! To obtain a pointer to base (from the example below) as a void*, you call p->getThis(), which is a lot easier than writing static_cast<base*>(p) first, and you get a void* that can safely be cast back to base* using static_cast<base*>(v). The implicit conversion to void* can be reversed, but only by casting back to the exact type the original pointer had. That is, static_cast<base*>(static_cast<void*>(d)), where d is a pointer to an object of a type derived from base, is not valid (it compiles, but the result need not point at the base subobject), whereas static_cast<base*>(d->getThis()) is fine.
Now, why is the address changing in the first place? In the example, base is a virtual base class of two derived classes, but there could be more. All subobjects whose class virtually inherits from base share one common base subobject in an object of a further derived class (concrete in the example below). The location of this base subobject relative to the respective derived subobject may differ depending on how the different classes are ordered. As a result, the pointer to the base object is generally different from the pointers to the subobjects of classes virtually inheriting from base. The relevant offset is computed at compile time when possible, or comes from something like a vtable at run time. The offsets are applied when converting pointers along the inheritance hierarchy.
#include <iostream>
struct base
{
    void const* getThis() const { return this; }
};
struct derived1
    : virtual base
{
    int a;
};
struct derived2
    : virtual base
{
    int b;
};
struct concrete
    : derived1
    , derived2
{
};
int main()
{
    concrete c;
    derived2* d2 = &c;
    void const* dptr = d2;
    void const* gptr = d2->getThis();
    std::cout << "dptr=" << dptr << " gptr=" << gptr << '\n';
}

No. Yes, in limited circumstances.
This looks like it is something inspired by Smalltalk, in which all objects have a yourself method. There are probably some situations in which this makes code cleaner. As the comments note, this looks like an odd way to even implement this idiom in c++.
In your specific case, I'd grep for actual usages of the method to see how it is used.

Your class can have custom operator& (so &a may not return this of a). That's why std::addressof exists.

I ran across something like this many (many many) years ago. If I recall correctly, it was needed when a class is manipulating other instances of the same class. One example might be a container class that can contain its own type/(class?).

That might be a way to override the this keyword.
Let's say that you have a memory pool, fully initialized at the start of your program; for instance, you know that at any time you deal with at most 50 messages of type CMessage.
You create a pool of size 50 * sizeof(CMessage) (whatever this class might be), and CMessage implements the getThis function.
That way, instead of overriding the new keyword, you just override "this", accessing the pool.
It can also mean that the object might be defined in different memory spaces, let's say in SRAM in boot mode, and then in SDRAM.
In such a situation the same instance might return different values for getThis over the course of the program, on purpose of course, when overridden.
