How does C++ multiple inheritance virtual function access derived class field? - c++

Referencing the multiple inheritance memory layout, suppose Derived class has a field called int derived_only. If I have a Base1 * b1 and Base2 * b2, both pointing to the same Derived class object, then according to wiki, b1 and b2 have slightly different values due to pointer fixup. My question is, if I call a virtual function, say the virtual clone(), by using either b1 or b2, how does clone() calculate derived_only's address, from either b1 or b2?
Basically, when calling b1->clone() vs b2->clone(), the this pointer passed in is different, so how does clone() know how much offset to add to this to get to derived_only?

Related

C++ V-shape casting: vector<Base1*> to vector<Base2*>

I'm having real trouble figuring out this casting problem. Starting with 3 classes:
#include <iostream>
#include <vector>
// Abstract class (has a pure virtual function)
class Base1{
public:
virtual ~Base1();
virtual void do_sth()=0;
};
class Base2{
public:
int prop=3;
~Base2();
};
class Derived: public Base1, public Base2{
public:
~Derived();
void do_sth(){std::cout << "Hi";}
};
How can I perform the following conversion?
std::vector<Base1*> vec1;
vec1.reserve(10);
for( int i = 0; i < 10; ++i )
vec1.push_back(new Derived());
// To this type...?
std::vector<Base2*> vec2 = ?????;
Some remarks:
I'd use dynamic_cast to perform safe casting from Base1 to Derived.
Ideally, no object copies are made in the process!
My best bet so far is to call vec1.data() to obtain a pointer to the Base1* elements, then dynamic_cast to Derived, then static_cast to Base2, but I don't know how to transfer memory ownership or how to pass the vector size.
The comments to the question have gotten rather muddled, so I'll post this partial answer here, rather than trying to straighten out the comments.
Base1 has a virtual function. Good start.
Derived is derived from Base1.
Derived is also derived from Base2.
If you have an object of type Derived you can create a pointer to Base1 that points at the derived object:
Derived d;
Base1* b1 = &d;
Now that you have a pointer to a polymorphic base class, you can use dynamic_cast to perform a cross-cast:
Base2* b2 = dynamic_cast<Base2*>(b1);
The compiler knows how to do that, and the result should be the same pointer value as you'd have gotten if you did it directly:
Base2* b2x = &d;
assert(b2x == b2);
Note, too, that since the code traffics in vectors of pointers, it seems that the Derived objects are being created with new Derived. If that's the case, and eventually the code deletes the Derived object through a pointer to one of the base types, then the base type must have a virtual destructor.
There are no Derived objects in vec1 nor vec2, so you can't get pointers to other base subobjects of a Derived from an element of either.
If you pass a Derived to vec1.push_back, it will copy-construct a Base1 from the Base1 subobject of the Derived. This is called object slicing.
If you have a vector of pointers (raw or smart) to Base1 in vec1, then those could point to the Base1 base subobject of a Derived, at which point you can dynamic_cast<Base2*> them, which will in general give you a different pointer value.

Why aren't the two pointer values the same?

Quoting Effective C++ by Scott Meyers, 3rd Edition, Item 27:
class Base { ... };
class Derived: public Base { ... };
Derived d;
Base *pb = &d; // implicitly convert Derived* ⇒ Base*
Here we’re just creating a base class pointer to a derived class object, but sometimes, the two pointer values will not be the same. When that’s the case, an offset is applied at runtime to the Derived* pointer to get the correct Base* pointer value.
Why are the two pointer values not the same? If it is because of how the child and parent objects are laid out in memory, then how does a downcast work later?
This always happens when using multiple inheritance.
class Base1 { int a; };
class Base2 { double b; };
class Derived : public Base1, public Base2 { ... };
Derived d;
Base1* pb1 = &d;
Base2* pb2 = &d;
Now &d cannot possibly be equal to both pb1 and pb2, because otherwise pb1 would equal pb2, which is not possible, because two different non-empty objects of unrelated types must occupy different regions of memory. So in at least one case a non-zero offset must be applied.
In most implementations with single inheritance the offset is zero, but the standard does not mandate that.
Indeed, a typical implementation would simply lay out the base object at the beginning of the derived object:
++-----++
||Base ||
|+-----+|
|Derived|
+-------+
But when there is more than one base, only one can go at the beginning:
++-----++
||Base1||
|+-----+|
||Base2||
|+-----+|
|Derived|
+-------+
The downcast works because the offset is fixed and known at compile time, so there's no problem to apply it when either upcasting or downcasting.
An exception to this is virtual inheritance. For a virtual base the offset is not known at compile time. Typically, the derived object contains an internal hidden pointer to its virtual base, so the upcast can work. But the base doesn't know where its derived object is, so a static_cast downcast cannot work and is not allowed by the language (a dynamic_cast from a polymorphic virtual base is still possible, because it consults runtime type information).
When you have a single-inheritance tree that is polymorphic, compilers typically make each object begin with a pointer to the VMT (virtual method table). If the base class is polymorphic, the pointer values will match. But if your base class is non-polymorphic and your derived class is polymorphic, the base class does not introduce the VMT pointer; it is introduced by the first polymorphic class down the tree. The VMT pointer is then inserted before the base, and the pointer values will not match.

Is __vptr shared by all the objects of a class in C++? [duplicate]

Every class which contains one or more virtual functions has a vtable associated with it. A void pointer called vptr points to that vtable. Every object of that class contains that vptr, which points to the same vtable. Then why isn't vptr static? Instead of associating the vptr with the object, why not associate it with the class?
The runtime class of the object is a property of the object itself. In effect, vptr represents the runtime class, and therefore can't be static. What it points to, however, can be shared by all instances of the same runtime class.
Your diagram is wrong. There is not a single vtable, there is one vtable for each polymorphic type. The vptr for A points to the vtable for A, the vptr for A1 points to the vtable for A1 etc.
Given:
class A {
public:
virtual void foo();
virtual void bar();
};
class A1 : public A {
virtual void foo();
};
class A2 : public A {
virtual void foo();
};
class A3 : public A {
virtual void bar();
virtual void baz();
};
The vtable for A contains { &A::foo, &A::bar }
The vtable for A1 contains { &A1::foo, &A::bar }
The vtable for A2 contains { &A2::foo, &A::bar }
The vtable for A3 contains { &A::foo, &A3::bar, &A3::baz }
So when you call a.foo() the compiler follows the object's vptr to find the vtable then calls the first function in the vtable.
Suppose a compiler uses your idea, and we write:
A1 a1;
A2 a2;
A& a = (std::rand() % 2) ? static_cast<A&>(a1) : static_cast<A&>(a2);
a.foo();
The compiler looks in the base class A and finds the vptr for the class A which (according to your idea) is a static property of the type A not a member of the object that the reference a is bound to. Does that vptr point to the vtable for A, or A1 or A2 or something else? If it pointed to the vtable for A1 it would be wrong 50% of the time when a refers to a2, and vice versa.
Now suppose that we write:
A1 a1;
A2 a2;
A& a = a1;
A& aa = a2;
a.foo();
aa.foo();
a and aa are both references to A, but they need two different vptrs, one pointing to the vtable for A1 and one pointing to the vtable for A2. If the vptr is a static member of A how can it have two values at once? The only logical, consistent choice is that the static vptr of A points to the vtable for A.
But that means the call a.foo() calls A::foo() when it should call A1::foo(), and the call aa.foo() also calls A::foo() when it should call A2::foo().
Clearly your idea fails to implement the required semantics, proving that a compiler using your idea cannot be a C++ compiler. There is no way for the compiler to get the vtable for A1 from a without either knowing what the derived type is (which is impossible in general, the reference-to-base could have been returned from a function defined in a different library and could refer to a derived type that hasn't even been written yet!) or by having the vptr stored directly in the object.
The vptr must be different for a1 and a2, and must be accessible without knowing the dynamic type when accessing them through a pointer or reference to base, so that when you obtain the vptr through the reference to the base class, a, it still points to the right vtable, not the base class vtable. The most obvious way to do this is to store the vptr directly in the object. An alternative, more complicated solution would be to keep a map of object addresses to vptrs, e.g. something like std::map<void*, vtable*>, and find the vtable for a by looking up &a, but this still stores one vptr per object not one per type, and would require a lot more work (and dynamic allocation) to update the map every time polymorphic objects are created and destroyed, and would increase overall memory usage because the map structure would take up space. It's simpler just to embed the vptr in the objects themselves.
The virtual table (which is, by the way, an implementation mechanism not mentioned in the C++ standard) is used to identify the dynamic type of an object at runtime. Therefore, the object itself must hold a pointer to it. If it was static, then only the static type could be identified by it and it would be useless.
If you are thinking of somehow using typeid() internally to identify the dynamic type and then call the static pointer with it, be aware that typeid() only returns the dynamic type for objects belonging to types with virtual functions; otherwise it just returns the static type (§ 5.2.8 [expr.typeid] in C++11). Yes, this means that it works the other way around: typeid() typically uses the virtual pointer to identify the dynamic type.
As everyone attests, the vptr is a property of an object.
Let's see why.
Assume we have three classes
class Base{
public:
virtual ~Base();
//Class Definition
};
class Derived: public Base{
//Class Definition
};
class Client: public Derived{
//Class Definition
};
holding the relation Base<---Derived<----Client.
The Client class is derived from the Derived class, which is in turn derived from Base.
Base * Ob = new Base;
Derived * Od = new Derived;
Client* Oc = new Client;
Whenever Oc is destroyed, the Client part, then the Derived part, and finally the Base part of the object must be destroyed. To make this work even when the object is deleted through a pointer to Base, Base's destructor should be virtual, so that the destructor entry in the vtable of Oc's object points to Client's destructor. The compiler adds code to Client's destructor to call Derived's destructor, and to Derived's destructor to call Base's destructor. This chaining ensures that all the Client, Derived and Base data is destroyed when a Client object is destroyed.
If the vptr were static, it would belong to Base, so the destructor entry found through it would still point to Base's destructor and only the Base part of Oc's object would be destroyed. The vptr of Oc's object should always lead to the destructor of the most derived class, which is not possible if the vptr is static.
The whole point of the vptr is that you don't know at compile time exactly which class an object will have at runtime. If you knew that, then the virtual function call would be unnecessary. That is, in fact, what happens when you're not using virtual functions. But with virtual functions, if I have
class Sub : public Parent {};
and a value of type Parent*, I don't know whether at runtime this is really an object of type Parent or one of type Sub. The vptr lets me figure that out.
The virtual method table is per class. Each object contains a pointer (the vptr) to the table of its run-time type.
I don't think this is a requirement in the standard, but all compilers that I've worked with do it this way.
This is true even in your example.
@Harsh Maurya: The reason might be that static member variables must be defined before the main function in the program. But if we wanted _vptr to be static, whose responsibility (compiler or programmer) would it be to define _vptr before main? And how would the programmer know the address of the vtable to assign to _vptr? That's why the compiler takes responsibility for assigning the value to _vptr. This happens in the constructor of the class (hidden functionality). And once the constructor comes into the picture, there has to be one _vptr for each object.

vtable and polymorphism - offset of a function

If I understand things correctly, a class definition imposes a certain order of the virtual functions in the vtable, and so a given function is known to be at a certain offset from the beginning of the table. However, I don't understand how that works with polymorphism.
class B1 {
public:
virtual void funcB1() {}
};
class B2 {
public:
virtual void funcB2() {}
};
class D : public B1, public B2 {
public:
virtual void funcB1() {}
virtual void funcB2() {}
};
int main() {
B1 *b1 = new D();
B2 *b2 = new D();
B1 *realB1 = new B1();
B2 *realB2 = new B2();
b1->funcB1();
b2->funcB2();
realB1->funcB1();
realB2->funcB2();
}
How does the generated code know how to access funcB2 at different offsets?
When you compose a class from two base classes, each part is represented in the resultant class by a fully functioning block, complete with its own pointer to vtable. That is how the generated code knows what function to call: casting the pointer of D to B1 and B2 produces different pointers, so the generated code can use the same offset into the virtual table.
D *d = new D();
B1 *b1 = dynamic_cast<B1*>(d);
B2 *b2 = dynamic_cast<B2*>(d);
printf("%p %p %p", (void*)d, (void*)b1, (void*)b2);
This produces the following output on ideone:
0x91c7008 0x91c7008 0x91c700c
Note how D* and B1* print the same value, while B2* prints a different value. When you call b2->funcB2(), the pointer b2 already points to a different part of the D object, which points to a different vtable (the one that has the layout of B2), so the generated code does not need to do anything differently for b2 vs realB2 in your example.
Typically the D object will have two vtable pointers, one for each base class. It really can't be avoided, since it must contain an identical binary layout for each of the base classes. The compiler will insert pointer fixups whenever you cast from one type to another - if you print the pointer addresses after casting to each of the base classes, you'll see that they are different.

Getting the pointer in C++ for a class that is part of another class with multiple inheritance

I have some classes that inherit from each other but they do so using templates. What I want is to effectively get a pointer and/or reference to one of the base classes as if it is one of the other possible derived classes, depending on the template argument.
class a1
{
public:
int a;
virtual void func()
{
}
// other non virtual functions ...
};
class b1
{
public:
//no members or virtual functions
//other non virtual functions ...
};
class a2
{
public:
int a;
// ...
};
template < class T1 >
class derived : public T1,
public a2
{
int a;
// ...
};
Class derived can inherit from either class a1 or class b1. This is mostly to save space in derived: b1 is an empty class, so when derived is instantiated with template parameter b1 it does not carry the extra load of the data members and virtual functions of a1.
However I now want to get a pointer or reference from derived(a1) that is really a pointer or reference for a type derived(b1).
What I'm really asking for is help on a "good" way of doing offsetof() but using inheritance, where I can get the offsetof() a2; I am assuming this is a good pointer for derived(b1) because b1 is an empty class.
I have tried to get the pointer of derived(a1) object then add on the sizeof(a1) with the hopes that this will be the correct position but wanted to know if anyone else had suggestions of a better way.
As far as I understand you, you have e.g. a pointer to derived<a1>, and want a pointer to a1. Since a1 is a direct base class of derived<a1>, you can obtain this pointer by direct implicit casting:
derived<a1>* instance = whatever();
a1* pointer = instance;
It is however recommended that you make the cast explicit. Since this cast is always safe and can be resolved at compile time, use static_cast.
a1* pointer = static_cast<a1*>(instance);
Executive summary: Pointer arithmetic is something you should not do for traversing class hierarchies. There are static_cast and dynamic_cast available for exactly this purpose: They will warn you or error out when you try to do something dangerous, and generally have much more knowledge about the exact memory layout than you can ever have.
EDIT: You edited the question to say that you want to cast from derived<a1> to derived<b1>. This is not possible. static_cast and dynamic_cast do not support operations that change the memory layout of instances. Any pointer arithmetic is strongly advised against because you cannot know how the compiler arranges the data fields of instances in memory.
Have class b1 as the base class of class a1
If all you want to do is to save memory space for some of your objects then templates are probably not the best tool for that.
As b1 is empty, derived<b1> adds nothing useful to a2, so why not use simple inheritance, class a1 : public a2? You can instantiate objects from either a1 or a2 depending on whether you need the additional data, and they can all be cast to a2 (for example, if you want to store them in a list).
If you weren't using templates and just multiple inheritance, assuming that d is an instance of type Derived but is referenced as A1, you could write:
A1* a = new Derived();
Derived* d = static_cast<Derived*>(a);
B2* b = d;
The template complicates things though.