C++ FAQs item 20.05:
"Virtual base classes are special, their destructors are called at the
end of the most derived class' destructor (only)."
I dont really understand how this fits in with the typical:
"data member destructors first, then base class destructors" rule
How are virtual base classes special? I cannot tell what the above means :s
The key property of virtual base classes is that they always produce a single unique base subobject in any object of derived class. That's exactly what's special about virtual base classes and that makes them different from regular base classes, which can produce multiple subobjects.
For example, in this hierarchy
struct B {};
struct M1 : B {};
struct M2 : B {};
struct D : M1, M2 {}
there's no virtual inheritance. All bases are inherited using regular inheritance. In this case class D will contain two independent subobjects of type B: one brought in by M1, another - by M2.
+-> D <-+
| |
M1 M2
^ ^
| |
B B <- there are two different `B`s in `D`
The task of properly destructing all subobjects when destructing D is trivial: each class in the hierarchy is responsible of destructing its direct bases and only its direct bases. That simply means that destructor of M1 calls the destructor of its own B subobject, destructor of M2 calls the destructor of its own B subobject, while destructor of D calls the destructors of its M1 and M2 subobjects.
Everything works out nicely in the above destruction schedule. All subobjects get destructed, including both subobjects of type B.
However, once we switch to virtual inheritance things become more complicated
struct B {};
struct M1 : virtual B {};
struct M2 : virtual B {};
struct D : M1, M2 {}
Now there only one subobject of type B in D. Both M1 and M2 see and share the same subobject of type B as their base.
+-> D <-+
| |
M1 M2
^ ^
| |
+-- B --+ <- there is only one `B` in `D`
If we make a naive attempt to apply the previous destruction schedule to this hierarchy, we'll end up with the B subobject getting destructed twice: M1 calls the destructor of B subobject, and M2 calls the destructor of the very same B subobject.
That is, of course, completely unacceptable. Each subobject has to be destructed once and only once.
In order to solve this problem, when M1 and M2 are used as base subobjects of D, these are explicitly prohibited to call the destructor of their B subobject. The responsibility of calling the destructor of B is assigned to D's destructor. Class D, when used as a complete independent object (i.e. serves as a most derived class), knows that there's only one B in it and knows that the destructor of B has to be called only once. So, the destructor of class D will call the destructor of B for that unique base subobject of type B. Meanwhile, destructors of M1 and M2 will not even attempt to call the destructor of B.
That's how it works with virtual inheritance. And that what the rule you quoted says basically. The parts that says that virtual bases' destructors are called last simply means that each class'e destructor calls destructors for its direct regular base classes, and only after that, if necessary, it calls destructors of its virtual base classes (possibly indirect). In the above example, the destructor of D calls destructors of M1 and M2 and only after that it calls the destructor of B.
The entire paragraph of the book you are quoting was describing the order of destructors. Normally, in the class declaration, the order of the classes listed for inheritance determines the order of their construction, and then they are destructed in reverse order.
A virtual base class means virtual inheritance was used on it:
struct Base {};
struct D : virtual Base {};
struct D1 : D, virtual Base {};
struct D2 : virtual Base, D {};
ASCII art alert:
Base Base
| \ | \
/_\ \ | /_\
| \ | \
D /_\ | D
| / | /
/_\ / /_\ /_\
| / | /
D1 D2
The multiple inheritance collapses the diamond into a single line, in this case. But, the point is still illustrated. The order of inheritance for D1 and D2 doesn't matter. The order in which the destructors for D and Base are called will be the same for both of them.
Related
If I have a class inheritance relation like the following
a
/ \
b c
\ |
| d
\/ \
e f
\ /
g
Is the following the correct definition?
class A {};
class B: public virtual A {};
class C: public virtual A {};
class D: public C {};
class E: public B, public virtual D {};
class F: public virtual D {};
class G: public E, public F {};
I made both A and D virtually inherited because I assume each joint class need to be virtual.
Also I am not sure how C++ defines the constructor order of the above case. The link
https://isocpp.org/wiki/faq/multiple-inheritance#mi-vi-ctor-order says
The very first constructors to be executed are the virtual base
classes anywhere in the hierarchy. They are executed in the order they
appear in a depth-first left-to-right traversal of the graph of base
classes, where left to right refer to the order of appearance of base
class names.
After all virtual base class constructors are finished, the
construction order is generally from base class to derived class. The
details are easiest to understand if you imagine that the very first
thing the compiler does in the derived class’s ctor is to make a
hidden call to the ctors of its non-virtual base classes (hint: that’s
the way many compilers actually do it). So if class D inherits
multiply from B1 and B2, the constructor for B1 executes first, then
the constructor for B2, then the constructor for D. This rule is
applied recursively; for example, if B1 inherits from B1a and B1b, and
B2 inherits from B2a and B2b, then the final order is B1a, B1b, B1,
B2a, B2b, B2, D.
Note that the order B1 and then B2 (or B1a then B1b) is determined by
the order that the base classes appear in the declaration of the
class, not in the order that the initializer appears in the derived
class’s initialization list.
If so, is the order like?
A, D, B, C, D, F, G
I do not expect D is constructed before C. What is the correct constructor order?
When constructing an object of class G, the virtual base classes are initialized first. This results in the constructors for A and D being called. The constructor for D will construct it's base C object. Then the constructors for the non-base classes of G will be called, which would be E and F. The constructor for E will call the constructor for B. This gives the final order:
A C D B E F G
I added constructors that call printf to watch what happens (code available on Godbolt for the curious). The order I got, with both clang (version 6.0.0) and gcc (versions 6.4.0 and 7.3.0), is:
A
C
D
B
E
F
G
This matches what I'd expect. A and D have to be constructed first, but you can't construct a D without a C (imagine a more complex example where D's constructor called into C. Once all virtual inheritance has been satisfied, the constructors are called in the same order we expect.
Here http://www.parashift.com/c++-faq-lite/multiple-inheritance.html section [25.14] says
The very first constructors to be executed are the virtual base classes anywhere in the hierarchy.
I tried to verify it using following program:
A (pure virtual)
|
B
|
C
(virtual)/ \ (virtual)
E D
\ /
F
|
G (pure virtual)
|
H
each class has a c'tor and virtual d'tor. the output is as follows:
A
B
C
E
D
F
G
H
~H
~G
~F
~D
~E
~C
~B
~A
Press any key to continue . . .
but as per quote virtual base classes constructors should be executed first.
what did I understand wrong?
EDIT: To clear my question, As per my understanding this behaviour has nothing to do with whether a base class is virtual or not. but quote insists on Virtual Base class. am I clear or something fishy there?
Virtual base classes cannot be constructed if the classes they inherit from are not constructed first. So in your case, non-virtual base classes are constructed because the virtual ones depend on them: C can't be constructed until A and Bare. Therefore, A and B are indeed constructed before C, even though C is virtually inherited.
struct B { int i; };
struct D1 : virtual B {};
struct D2 : B {}; // <-- not virtual
struct DD : D1, D2 {};
Having coded above, still the compiler demands D2 also to be virtual:
DD d;
d.i = 0; // error: request for member `i' is ambiguous
What I don't understand is, once you have prompted compiler that B is virtual with respect to DD (via D1) then why it still i is ambiguous ?
(If my memory serves correct, the older VC++ (in 2006), was capable enough to make out this just with single virtual inheritance)
B is not virtual with respect to DD - it is virtual with respect to D1. At the time D2 is created, it contains a full copy of B. So now DD has two implementations of B: one as part of D2, and one at the end (pointed by D1). And having two copies of i, using it is indeed ambiguous.
Had D2 also used virtual inheritance, instead of containing a copy of B, it would have contained a pointer to the instance of B that D1 is also pointing at, and DD would have contained only one instance of B.
I'll try to illustrate the memory layouts, hope this comes out right...:
Your case, with one virtual inheritance and one non-virtual -
| D1 | D2 + B | B |
+--+-------+----------+---------+
| vptr to B ^
+-----------------------|
Having both D1 and D2 inherit virtually -
| D1 | D2 | B |
+--+-----+---+----+-------+
| | ^
+---------+---------|
You must read Diamond problem . Under the Approaches heading, for CPP, your case is clearly mentioned, your observation matches the one explained there.
The standard requires that d.i must be ambiguous in this case. ISO/IEC 14882:2003 section 10.1.6 covers classes with virtual and non-virtual base classes of a given type.
I am trying to make sense of the statement in book effective c++. Following is the inheritance diagram for multiple inheritance.
Now the book says separate memory in each class is required for vptr. Also it makes following statement
An oddity in the above diagram is that there are only three vptrs even though four classes are involved. Implementations are free to generate four vptrs if they like, but three suffice (it turns out that B and D can share a vptr), and most implementations take advantage of this opportunity to reduce the compiler-generated overhead.
I could not see any reason why there is requirement of separate memory in each class for vptr. I had an understanding that vptr is inherited from base class whatever may be the inheritance type. If we assume that it shown resultant memory structure with inherited vptr how can they make the statement that
B and D can share a vptr
Can somebody please clarify a bit about vptr in multiple inheritance?
Do we need separate vptr in each class ?
Also if above is true why B and D can share vptr ?
Your question is interesting, however I fear that you are aiming too big as a first question, so I will answer in several steps, if you don't mind :)
Disclaimer: I am no compiler writer, and though I have certainly studied the subject, my word should be taken with caution. There will me inaccuracies. And I am not that well versed in RTTI. Also, since this is not standard, what I describe are possibilities.
1. How to implement inheritance ?
Note: I will leave out alignment issues, they just mean that some padding could be included between the blocks
Let's leave it out virtual methods, for now, and concentrate on how inheritance is implemented, down below.
The truth is that inheritance and composition share a lot:
struct B { int t; int u; };
struct C { B b; int v; int w; };
struct D: B { int v; int w; };
Are going to look like:
B:
+-----+-----+
| t | u |
+-----+-----+
C:
+-----+-----+-----+-----+
| B | v | w |
+-----+-----+-----+-----+
D:
+-----+-----+-----+-----+
| B | v | w |
+-----+-----+-----+-----+
Shocking isn't it :) ?
This means, however, than multiple inheritance is quite simple to figure out:
struct A { int r; int s; };
struct M: A, B { int v; int w; };
M:
+-----+-----+-----+-----+-----+-----+
| A | B | v | w |
+-----+-----+-----+-----+-----+-----+
Using these diagrams, let's see what happens when casting a derived pointer to a base pointer:
M* pm = new M();
A* pa = pm; // points to the A subpart of M
B* pb = pm; // points to the B subpart of M
Using our previous diagram:
M:
+-----+-----+-----+-----+-----+-----+
| A | B | v | w |
+-----+-----+-----+-----+-----+-----+
^ ^
pm pb
pa
The fact that the address of pb is slightly different from that of pm is handled through pointer arithmetic automatically for you by the compiler.
2. How to implement virtual inheritance ?
Virtual inheritance is tricky: you need to ensure that a single V (for virtual) object will be shared by all the other subobjects. Let's define a simple diamond inheritance.
struct V { int t; };
struct B: virtual V { int u; };
struct C: virtual V { int v; };
struct D: B, C { int w; };
I'll leave out the representation, and concentrate on ensuring that in a D object, both the B and C subparts share the same subobject. How can it be done ?
Remember that a class size should be constant
Remember that when designed, neither B nor C can foresee whether they will be used together or not
The solution that has been found is therefore simple: B and C only reserve space for a pointer to V, and:
if you build a stand-alone B, the constructor will allocate a V on the heap, which will be handled automatically
if you build B as part of a D, the B subpart will expect the D constructor to pass the pointer to the location of V
And idem for C, obviously.
In D, an optimization allow the constructor to reserve space for V right in the object, because D does not inherit virtually from either B or C, giving the diagram you have shown (though we don't have yet virtual methods).
B: (and C is similar)
+-----+-----+
| V* | u |
+-----+-----+
D:
+-----+-----+-----+-----+-----+-----+
| B | C | w | A |
+-----+-----+-----+-----+-----+-----+
Remark now that casting from B to A is slightly trickier than simple pointer arithmetic: you need follow the pointer in B rather than simple pointer arithmetic.
There is a worse case though, up-casting. If I give you a pointer to A how do you know how to get back to B ?
In this case, the magic is performed by dynamic_cast, but this require some support (ie, information) stored somewhere. This is the so called RTTI (Run-Time Type Information). dynamic_cast will first determine that A is part of a D through some magic, then query D's runtime information to know where within D the B subobject is stored.
If we were in case where there is no B subobject, it would either return 0 (pointer form) or throw a bad_cast exception (reference form).
3. How to implement virtual methods ?
In general virtual methods are implemented through a v-table (ie, a table of pointer to functions) per class, and v-ptr to this table per-object. This is not the sole possible implementation, and it has been demonstrated that others could be faster, however it is both simple and with a predictable overhead (both in term of memory and dispatch speed).
If we take a simple base class object, with a virtual method:
struct B { virtual foo(); };
For the computer, there is no such things as member methods, so in fact you have:
struct B { VTable* vptr; };
void Bfoo(B* b);
struct BVTable { RTTI* rtti; void (*foo)(B*); };
When you derive from B:
struct D: B { virtual foo(); virtual bar(); };
You now have two virtual methods, one overrides B::foo, the other is brand new. The computer representation is akin to:
struct D { VTable* vptr; }; // single table, even for two methods
void Dfoo(D* d); void Dbar(D* d);
struct DVTable { RTTI* rtti; void (*foo)(D*); void (*foo)(B*); };
Note how BVTable and DVTable are so similar (since we put foo before bar) ? It's important!
D* d = /**/;
B* b = d; // noop, no needfor arithmetic
b->foo();
Let's translate the call to foo in machine language (somewhat):
// 1. get the vptr
void* vptr = b; // noop, it's stored at the first byte of B
// 2. get the pointer to foo function
void (*foo)(B*) = vptr[1]; // 0 is for RTTI
// 3. apply foo
(*foo)(b);
Those vptrs are initialized by the constructors of the objects, when executing the constructor of D, here is what happened:
D::D() calls B::B() first and foremost, to initiliaze its subparts
B::B() initialize vptr to point to its vtable, then returns
D::D() initialize vptr to point to its vtable, overriding B's
Therefore, vptr here pointed to D's vtable, and thus the foo applied was D's. For B it was completely transparent.
Here B and D share the same vptr!
4. Virtual tables in multi-inheritance
Unfortunately this sharing is not always possible.
First, as we have seen, in the case of virtual inheritance, the "shared" item is positionned oddly in the final complete object. It therefore has its own vptr. That's 1.
Second, in case of multi-inheritance, the first base is aligned with the complete object, but the second base cannot be (they both need space for their data), therefore it cannot share its vptr. That's 2.
Third, the first base is aligned with the complete object, thus offering us the same layout that in the case of simple inheritance (the same optimization opportunity). That's 3.
Quite simple, no ?
If a class has virtual members, one need to way to find their address. Those are collected in a constant table (the vtbl) whose address is stored in an hidden field for each object (vptr). A call to a virtual member is essentially:
obj->_vptr[member_idx](obj, params...);
A derived class which add virtual members to his base class also need a place for them. Thus a new vtbl and a new vptr for them. A call to an inherited virtual member is still
obj->_vptr[member_idx](obj, params...);
and a call to new virtual member is:
obj->_vptr2[member_idx](obj, params...);
If the base is not virtual, one can arrange for the second vtbl to be put immediately after the first one, effectively increasing the size of the vtbl. And the _vptr2 is no more needed. A call to a new virtual member is thus:
obj->_vptr[member_idx+num_inherited_members](obj, params...);
In the case of (non virtual) multiple inheritance, one inherit two vtbl and two vptr. They can't be merged, and calls must pay attention to add an offset to the object (in order for the inherited data members to be found at the correct place). Calls to the first base class members will be
obj->_vptr_base1[member_idx](obj, params...);
and for the second
obj->_vptr_base2[member_idx](obj+offset, params...);
New virtual members can again either be put in a new vtbl, or appended to the vtbl of the first base (so that no offsets are added in future calls).
If a base is virtual, one can not append the new vtbl to the inherited one as it could leads to conflicts (in the example you gave, if both B and C append their virtual functions, how D be able to build its version?).
Thus, A needs a vtbl. B and C need a vtbl and it can't be appended to A's one because A is a virtual base of both. D needs a vtbl but it can be appended to B one as B is not a virtual base class of D.
It all has to do with how compiler figures out the actual addresses of method functions. The compiler assumes that virtual table pointer is located at a known offset from the base of the object (typically at offset 0). The compiler also needs to know the structure of the virtual table for each class - in other words, how to lookup pointers to functions in the virtual table.
Class B and class C will have completely different structures of Virtual Tables since they have different methods. Virtual table for class D can look like a virtual table for class B followed by additional data for methods of class C.
When you generate an object of class D, you can cast it as a pointer to B or as a pointer to C or even as a pointer to class A. You may pass these pointers to modules that are not even aware of existence of class D, but can call methods of class B or C or A. These modules need to know how to locate the pointer to the virtual table of the class and they need to know how to locate pointers to methods of class B/C/A in the virtual table. That's why you need to have separate VPTRs for each class.
Class D is well aware of existence of class B and the structure of its virtual table and therefore can extend its structure and reuse the VPTR from object B.
When you cast a pointer to object D to a pointer to object B or C or A, it will actually update the pointer by some offset, so that it starts from vptr corresponding to that specific base class.
I could not see any reason why there
is requirement of separate memory in
each class for vptr
At runtime, when you invoke a (virtual) method via a pointer, the CPU has no knowledge about the actual object on which the method is dispatched. If you have B* b = ...; b->some_method(); then the variable b can potentially point at an object created via new B() or via new D() or
even new E() where E is some other class that inherits from (either) B or D. Each of these classes can supply its own implementation (override) for some_method(). Thus, the call b->some_method() should dispatch the implementation from either B, D or E depending on the object on which b is pointing.
The vptr of an object allows the CPU to find the address of the implementation of some_method that is in effect for that object. Each class defines it own vtbl (containing addresses of all virtual methods) and each object of the class starts with a vptr that points at that vtbl.
I think D needs 2 or 3 vptrs.
Here A may or may not require a vptr.
B needs one that should not be shared with A (because A is virtually inherited).
C needs one that should not be shared with A (ditto).
D can use B or C's vftable for its new virtual functions (if any), so it can share B's or C's.
My old paper "C++: Under the Hood" explains the Microsoft C++ implementation of virtual base classes. http://www.openrce.org/articles/files/jangrayhood.pdf
And (MS C++) you can compile with cl /d1reportAllClassLayout to get a text report of class memory layouts.
Happy hacking!
why this is happen ?
When u create abstract class in c++ Ex: Class A (which has a pure virtual function)
after that class B is inherited from class A
And if class A has constructor called A()
suppose i created an Object of class B then the compiler initializes the base class first i.e.class A and then initialize the class B Then.......?
First thing is we can not access a constructor of any class without an Object then how it is initialize the constructor of abstract class if we can not create an object of abstract class .
Quick answer: constructors are special.
When the constructor of A is still running, then the object being constructed is not yet truly of type A. It's still being constructed. When the constructor finishes, it's now an A.
It's the same for the derived B. The constructor for A runs first. Now it's an A. Then the constructor for B starts running. During this, the object is still really an A. Only when B's constructor finishes does it become a B.
You can verify this by trying to call the pure virtual function from the constructors. If the function is defined in A, and B's constructor calls it, there will be a runtime error instead of running B's override, because the object is not of type B yet.
The compiler will not allow you to generate code that will construct an A, due to the pure virtual function. But it will generate code to construct an A as part of the process of constructing a B. There's no magic involved in this. The rule that you cannot construct an A is imposed by the language rules, not by physics. The language lifts that rule under the special circumstance of constructing objects of B.
class A is abstract but class B is not. In order to construct class B, it must implement all the pure virtual member functions of class A.
class A
{
public:
A() {}
virtual ~A() {}
virtual void foo() = 0; // pure virtual
int i;
};
class B : public A
{
public:
B() {}
virtual ~B() {}
virtual void foo() {}
int j;
};
The A class layout could be something like this:
+---------+ +---------+
| vftable | --> | ~A() | --> address of A::~A()
+---------+ +---------+
| i | | foo() | --> NULL, pure virtual
+---------+ +---------+
The B class layout could be something like this:
+---------+ +---------+
| vftable | --> | ~B() | --> address of B::~B()
+---------+ +---------+
| i | | foo() | --> address of B::foo()
+---------+ +---------+
| j |
+---------+
struct A {
A(int x) {..}
virtual void do() = 0;
};
struct B : public A {
B() : A(13) {} // <--- there you see how we give params to A c'tor
virtual void do() {..}
};
And if class A has constructor called A() suppose i created an
Object of class B then the compiler initializes the base class
first i.e.class A and then initialize the class B
Then.......?
Actually you have it the wrong way around:
When you create an object of class B the constructor of B is called.
If you do not specify how the B constructor calls A constructor then the compiler will automatically insert as the first action in the initializer list a call to the default constructor of A.
If you do not want to use the default constructor you must explicitly put the call to the appropriate A constructor as the first element in the initializer list.
When the construction of A is complete the construction of B will continue.
First thing is we can not access a constructor of any class without an Object
then how it is initialize the constructor of abstract class if we can not create
an object of abstract class .
You word the above as if you consider A and B different things. An object of class B is also an object of class A. It is the object as a whole that is valid. The whole object is of class B but this contains (as part of the same object) all the information that was from class A.
Just because you can't instantiate class A directly doesn't mean it's impossible to instantiate class A. You're not allowed to instantiate A because the compiler knows that A is abstract and rejects any code you write that attempts to directly instantiate A. It forbids code like this:
A a;
new A();
What makes a class abstract is that it has pure virtual methods. There is nothing inherently preventing such a class from being instantiated, though. The C++ standard simply says it's not allowed. The compiler is perfectly capable of generating instructions to instantiate an abstract class. All it has to do is reserve the right amount of memory and then invoke the constructor, the same as it would for a non-abstract class.
When you instantiate B, all the memory for the class gets allocated at once. Since all the bytes are there, there is essentially an A instance there, ready to be initialized by the constructor. (But note that the memory isn't formally considered an object of type A until after the A constructor has finished running.) The A constructor runs, and then the B constructor runs.