While reading "Effective C++", I was surprised to learn that when a class uses multiple inheritance, casting a pointer to it may apply an offset to the pointer. It was not an easy concept to grasp, but I think I managed to get it.
However, the author claims that this offset might occur even when casting a pointer to a singly inherited class. I wonder in what case that would happen, and I would like to know the rationale behind it.
This can happen when polymorphism is introduced into a class hierarchy by a derived class. Consider the following class:
struct Foo
{
int a;
int b;
};
This class is not polymorphic, and thus the implementation does not need to include a pointer to a virtual dispatch table (a commonly-used method of implementing virtual dispatch). It will be laid out in memory like this:
Foo
+---+
a | |
+---+
b | |
+---+
Now consider a class that inherits from Foo:
struct Bar : Foo
{
virtual ~Bar() = default;
};
This class is polymorphic, and so objects of this class need to include a pointer to a vtable so further derived classes can override Bar's virtual member functions. That means that Bar objects will be laid out in memory like this:
Bar
+---------+
vtable pointer | |
+---------+
Foo subobject | +---+ |
| a | | |
| +---+ |
| b | | |
| +---+ |
+---------+
Since the object's Foo subobject is not at the beginning of the object, any Foo* initialized from a pointer to a Bar object will need to be adjusted by the size of a pointer so that it actually points at the Bar object's Foo subobject.
class B {
int a = 0;
};
class D : public B {
virtual ~D() = default;
};
D has a virtual member. B does not. A common implementation of dynamic dispatch in C++ involves keeping a hidden pointer to a table of function addresses at the beginning of the object.
This means the first byte of the B sub-object won't be at the start of the complete D object. A pointer cast would need to adjust the address by the vptr size.
Related
I read this question: C++ Virtual class inheritance object size issue, and was wondering why virtual inheritance results in an additional vtable pointer in the class.
I found an article here: https://en.wikipedia.org/wiki/Virtual_inheritance
which tells us:
However this offset can in the general case only be known at runtime,...
I don't get what is runtime-related here. The complete class inheritance hierarchy is already known at compile time. I understand virtual functions and the use of a base pointer, but there is no such thing with virtual inheritance.
Can someone explain why some compilers (Clang/GCC) implement virtual inheritance with a vtable and how this is used during runtime?
BTW, I also saw this question: vtable in case of virtual inheritance, but it only points to answers related to virtual functions, which is not my question.
The complete class inheritance hierarchy is already known in compile time.
True enough; so if the compiler knows the type of a most derived object, then it knows the offset of every subobject within that object. For such a purpose, a vtable is not needed.
For example, if B and C both virtually derive from A, and D derives from both B and C, then in the following code:
D d;
A* a = &d;
the conversion from D* to A* is, at most, adding a static offset to the address.
However, now consider this situation:
A* f(B* b) { return b; }
A* g(C* c) { return c; }
Here, f must be able to accept a pointer to any B object, including a B object that may be a subobject of a D object or of some other most derived class object. When compiling f, the compiler doesn't know the full set of derived classes of B.
If the B object is a most derived object, then the A subobject will be located at a certain offset. But what if the B object is part of a D object? The D object only contains one A object and it can't be located at its usual offsets from both the B and C subobjects. So the compiler has to pick a location for the A subobject of D, and then it has to provide a mechanism so that some code with a B* or C* can find out where the A subobject is. This depends solely on the inheritance hierarchy of the most derived type---so a vptr/vtable is an appropriate mechanism.
However this offset can in the general case only be known at runtime,...
I can't get the point, what is runtime related here. The complete class inheritance hierarchy is already known in compile time.
The linked article at Wikipedia provides a good explanation with examples, I think.
The example code from that article:
struct Animal {
virtual ~Animal() = default;
virtual void Eat() {}
};
// Two classes virtually inheriting Animal:
struct Mammal : virtual Animal {
virtual void Breathe() {}
};
struct WingedAnimal : virtual Animal {
virtual void Flap() {}
};
// A bat is still a winged mammal
struct Bat : Mammal, WingedAnimal {
};
When you create an object of type Bat, there are various ways a compiler may choose the object layout.
Option 1
+--------------+
| Animal |
+--------------+
| vpointer |
| Mammal |
+--------------+
| vpointer |
| WingedAnimal |
+--------------+
| vpointer |
| Bat |
+--------------+
Option 2
+--------------+
| vpointer |
| Mammal |
+--------------+
| vpointer |
| WingedAnimal |
+--------------+
| vpointer |
| Bat |
+--------------+
| Animal |
+--------------+
The values contained in vpointer in Mammal and WingedAnimal define the offsets to the Animal sub-object. Those values cannot be known until run time because the constructor of Mammal cannot know whether the complete object is a Bat or some other type. If the complete object is a Monkey, it won't derive from WingedAnimal. It will be just
struct Monkey : Mammal {
};
in which case, the object layout could be:
+--------------+
| vpointer |
| Mammal |
+--------------+
| vpointer |
| Monkey |
+--------------+
| Animal |
+--------------+
As can be seen, the offset from the Mammal sub-object to the Animal sub-object is defined by the classes derived from Mammal. Hence, it can be defined only at runtime.
The complete class inheritance hierarchy is already known at compile time. But all the vptr-related operations, such as getting the offset to a virtual base class and issuing a virtual function call, are delayed until runtime, because only at runtime can we know the actual type of the object.
For example,
class A { public: virtual bool a() { return false; } };
class B : public virtual A { public: bool a() override { return true; } };
B* ptr = new B();
// assuming function a()'s index is 2 at virtual function table
// the call
ptr->a();
// will be transformed by the compiler to (*ptr->vptr[2])(ptr)
// so a right call to a() will be issued according to the type of the object ptr points to
Consider the following code:
class A1
{
virtual void a() = 0;
};
class A2
{
virtual int a(int x) = 0;
};
class B : public A1, public A2
{
void a() {}
int a(int x) { return x; }
};
int main()
{
A1* pa1;
pa1 = new B;
delete pa1;
A2* pa2;
pa2 = new B;
delete pa2;
return 0;
}
Classes A1 and A2 are just pure abstract, so multiple inheritance should do no harm. Now, the above code will cause a crash during destructor call, but what is peculiar, only for one object: pa2. The fix to this problem seems quite obvious - use virtual destructors ~A1() and ~A2(). However, there are still two questions:
Why are the virtual destructors necessary, since we do not have any data in any of these classes?
Why is the behavior different for pa1 and pa2? I have found that this is related to the order in which classes are placed on the parent list. If you changed it to:
class B : public A2, public A1
then
delete pa1;
would cause crash.
A possible and typical memory layout:
+-A1---+
| vptr |
+------+
+-A2---+
| vptr |
+------+
+-B------------------+
| +-A1---+ +-A2---+ |
| | vptr | | vptr | |
| +------+ +------+ |
+--------------------+
vptr is a pointer that points to some information about the most-derived type, e.g. the virtual function table, RTTI etc. (see e.g. the Itanium C++ ABI vtable layout)
So, when you write A2* p = new B, you'll end up with:
+-B------------------+
| +-A1---+ +-A2---+ |
| | vptr | | vptr | |
| +------+ +------+ |
+-----------^--------+
^ | p
| new B
When you now delete p;, this can cause trouble in the free store deallocator, since the address stored in p is not the same as the address you've received from the allocator (new B). This won't happen if you cast to A1, i.e. A1* p = new B, since there's no offset in this case.
You can try to avoid this particular problem by restoring the original pointer via a dynamic_cast:
delete dynamic_cast<void*>(p);
But do not rely on this. It is still Undefined Behaviour (see Barry's answer).
From [expr.delete]:
In the first alternative (delete object), if the static type of the object to be deleted is different from its
dynamic type, the static type shall be a base class of the dynamic type of the object to be deleted and the
static type shall have a virtual destructor or the behavior is undefined.
Undefined behavior is undefined. The virtual destructor is necessary because the standard says so (see also dyp's answer)
Compiling with warnings also helps:
main.cpp: In function 'int main()':
main.cpp:22:12: warning: deleting object of abstract class type 'A1' which has non-virtual destructor will cause undefined behaviour [-Wdelete-non-virtual-dtor]
delete pa1;
^
main.cpp:26:12: warning: deleting object of abstract class type 'A2' which has non-virtual destructor will cause undefined behaviour [-Wdelete-non-virtual-dtor]
delete pa2;
^
The order is sort of relevant because the order of destructors is the opposite of the declaration order. However, it is actually "lucky" that it even works for pa1, since deleting objects of abstract class type with a non-virtual destructor causes undefined behaviour. One always needs to add a virtual destructor for abstract classes.
I have the following class hierarchy:
class IControl
{
virtual void SomeMethod() = 0; // Just to make IControl polymorphic.
};
class ControlBase
{
public:
virtual int GetType() = 0;
};
class ControlImpl : public ControlBase, public IControl
{
public:
virtual void SomeMethod() { }
virtual int GetType()
{
return 1;
}
};
I have an IControl abstract class, and a ControlBase class. The ControlBase class does not inherit from IControl, but I know that every IControl-implementation will derive from ControlBase.
I have the following test code in which I cast an IControl-reference to ControlBase (because I know it derives from it) with dynamic_cast, and also with C-style cast:
int main()
{
ControlImpl stb;
IControl& control = stb;
ControlBase& testCB1 = dynamic_cast<ControlBase&>(control);
ControlBase& testCB2 = (ControlBase&)control;
ControlBase* testCB3 = (ControlBase*)&control;
std::cout << &testCB1 << std::endl;
std::cout << &testCB2 << std::endl;
std::cout << testCB3 << std::endl;
std::cout << std::endl;
std::cout << testCB1.GetType() << std::endl; // This properly prints "1".
std::cout << testCB2.GetType() << std::endl; // This prints some random number.
std::cout << testCB3->GetType() << std::endl; // This prints some random number.
}
Only the dynamic_cast works properly, the other two casts give back slightly different memory addresses, and the GetType() function gives back incorrect values.
What is the exact reason for this? Does the C-style cast end up using a reinterpret_cast? Is it related to how polymorphic objects are aligned in memory?
I think the class names in your example are a bit confusing. Let's call them Interface, Base and Impl. Note that Interface and Base are unrelated.
The C++ Standard defines the C-style cast, called "explicit type conversion (cast notation)" in [expr.cast]. You can (and maybe should) read that whole paragraph to know exactly how the C-style cast is defined. For the example in the OP, the following is sufficient:
A C-style cast performs one of the following conversions ([expr.cast]/4):
const_cast
static_cast
static_cast followed by const_cast
reinterpret_cast
reinterpret_cast followed by const_cast
The order of this list is important, because:
If a conversion can be interpreted in more than one of the ways listed above, the interpretation that appears first in the list is used, even if a cast resulting from that interpretation is ill-formed.
Let's examine your example
Impl impl;
Interface* pIntfc = &impl;
Base* pBase = (Base*)pIntfc;
A const_cast cannot be used, the next element in the list is a static_cast. But the classes Interface and Base are unrelated, therefore there is no static_cast that can convert from Interface* to Base*. Therefore, a reinterpret_cast is used.
Additional note: the actual answer to your question is: as there is no dynamic_cast in the list above, a C-style cast never behaves like a dynamic_cast.
How the actual address changes is not part of the definition of the C++ language, but we can make an example of how it could be implemented:
Each object of a class with at least one virtual function (inherited or own) contains (read: could contain, in this example) a pointer to a vtable. If it inherits virtual functions from multiple classes, it contains multiple pointers to vtables. Because of empty base class optimization (no data members), an instance of Impl could look like this:
+=Impl=======================================+
| |
| +-Base---------+ +-Interface---------+ |
| | vtable_Base* | | vtable_Interface* | |
| +--------------+ +-------------------+ |
| |
+============================================+
Now, the example:
Impl impl;
Impl* pImpl = &impl;
Interface* pIntfc = pImpl;
Base* pBase = pImpl;
+=Impl=======================================+
| |
| +-Base---------+ +-Interface---------+ |
| | vtable_Base* | | vtable_Interface* | |
| +--------------+ +-------------------+ |
| ^ ^ |
+==|==================|======================+
^ | |
| +-- pBase +-- pIntfc
|
+-- pimpl
If you instead do a reinterpret_cast, the result is implementation-defined, but it could result in something like this:
Impl impl;
Impl* pImpl = &impl;
Interface* pIntfc = pImpl;
Base* pBase = reinterpret_cast<Base*>(pIntfc);
+=Impl=======================================+
| |
| +-Base---------+ +-Interface---------+ |
| | vtable_Base* | | vtable_Interface* | |
| +--------------+ +-------------------+ |
| ^ |
+=====================|======================+
^ |
| +-- pIntfc
| |
+-- pimpl +-- pBase
I.e. the address is unchanged, pBase points to the Interface subobject of the Impl object.
Note that dereferencing the pointer pBase takes us to UB-land already, the Standard doesn't specify what should happen. In this exemplary implementation, if you call pBase->GetType(), the vtable_Interface* is used, which contains the SomeMethod entry, and that function is called. This function doesn't return anything, so in this example, nasal demons are summoned and take over the world. Or some value is taken from the stack as a return value.
What is the exact reason for this?
The exact reason is that dynamic_cast is guaranteed to work in this situation by the standard, while the other kinds invoke undefined behaviour.
Does the C-style cast end up using a reinterpret_cast?
Yes, in this case it does. (A side note: never ever use a C-style cast).
Is it related to how polymorphic objects are aligned in memory?
I would say it is related to the way polymorphic objects that use multiple inheritance are laid out in memory. In a language with single inheritance, dynamic_cast would not be necessary, as the base subobject address would coincide with the derived object address. In the multiple-inheritance case this is not so, as there is more than one base subobject, and different base subobjects must have different addresses.
Sometimes the compiler can calculate the offset between each subobject's address and the derived object's address. If the offset is non-zero, the cast operation then becomes a pointer addition or subtraction instead of a no-op. (In the case of a virtual inheritance upcast, it's somewhat more complicated, but the compiler can still do that.)
There are at least two cases where the compiler cannot do that:
Cross-cast (that is, between two classes neither of which is a base class of the other).
Downcast from a virtual base.
In these cases dynamic_cast is the only way to cast.
I need to derive a child class CDerived from two different base classes CBaseA and CBaseB.
In addition, I need to call virtual functions of both parents on the derived class. Since I want to manage differently typed objects in one single vector later (this is not part of this minimal code example), I need to call the virtual functions from a base class pointer to the derived class object:
#include <iostream>
#include <stdlib.h>
class CBaseA
{
public:
virtual void FuncA(){ std::cout << "CBaseA::FuncA()" << std::endl; };
};
class CBaseB
{
public:
virtual void FuncB(){ std::cout << "CBaseB::FuncB()" << std::endl; };
};
class CDerived : public CBaseB, public CBaseA
{};
int main( int argc, char* argv[] )
{
// An object of the derived type:
CDerived oDerived;
// A base class pointer to the object, as it could later
// be stored in a general vector:
CBaseA* pAHandle = reinterpret_cast<CBaseA*>( &oDerived );
// Calling method A:
pAHandle->FuncA();
return 0;
}
Problem: when running this on my computer, FuncB() is called instead of FuncA(). I get the right result if I "flip" the parent class declarations around, i.e.
class CDerived : public CBaseA, public CBaseB
but this doesn't solve my problem, since I cannot be sure which function will be called.
So my question is: What am I doing wrong and what is the correct way of handling such a problem?
(I am using g++ 4.6.2, by the way)
CBaseA* pAHandle = reinterpret_cast<CBaseA*>( &oDerived );
Do not use reinterpret_cast for performing a conversion to a base class. No cast is required; the conversion is implicit:
CBaseA* pAHandle = &oDerived;
For converting to a derived class, use static_cast if the object is known to be of the target type or dynamic_cast if it is not.
Your use of reinterpret_cast yields undefined behavior, hence the "odd" behavior that you see. There are few correct uses of reinterpret_cast and none of them involve conversions within a class hierarchy.
Here is a common implementation, which may help you to understand what happens.
CBaseA in memory looks like this
+---------+
| __vptrA |
+---------+
CBaseB in memory looks like this
+---------+
| __vptrB |
+---------+
CDerived looks like this:
+---------+
&oDerived-> | __vptrB |
| __vptrA |
+---------+
If you simply assign &oDerived to a CBaseA*, the compiler puts code to add the offset so that you have
+---------+
&oDerived--->| __vptrB |
pAHandle---->| __vptrA |
+---------+
and during execution the program finds the pointers to A's virtual functions through __vptrA. If you static_cast or dynamic_cast pAHandle back to a CDerived (or even dynamic_cast pAHandle to a CBaseA), the compiler will insert code to subtract the offset so that the result points to the start of the object (dynamic_cast will find the information about how much to subtract in the vtable, along with the pointers to virtual functions).
When you reinterpret_cast &oDerived to a CBaseA*, the compiler doesn't insert such code to adjust the pointer, and you get
+---------+
pAHandle, &oDerived--->| __vptrB |
| __vptrA |
+---------+
and during execution, the program looks in __vptrB for A's virtual functions and finds B's virtual functions instead.
I have some questions about the object size with virtual.
1) virtual function
class A {
public:
int a;
virtual void v();
};
The size of class A is 8 bytes: one integer (4 bytes) plus one virtual pointer (4 bytes).
It's clear!
class B: public A{
public:
int b;
virtual void w();
};
What's the size of class B? I tested using sizeof B, it prints
12
Does it mean that there is only one vptr even though both class B and class A have virtual functions? Why is there only one vptr?
class A {
public:
int a;
virtual void v();
};
class B {
public:
int b;
virtual void w();
};
class C : public A, public B {
public:
int c;
virtual void x();
};
The sizeof C is 20.
It seems that in this case, two vptrs are in the layout. How does this happen? I think one of the two vptrs is for class A and the other is for class B, so is there no vptr for the virtual function of class C?
My question is, what's the rule about the number of vptrs in inheritance?
2) virtual inheritance
class A {
public:
int a;
virtual void v();
};
class B: virtual public A{ //virtual inheritance
public:
int b;
virtual void w();
};
class C : public A { //non-virtual inheritance
public:
int c;
virtual void x();
};
class D: public B, public C {
public:
int d;
virtual void y();
};
The sizeof A is 8 bytes -------------- 4(int a) + 4 (vptr) = 8
The sizeof B is 16 bytes -------------- Without virtual it should be 4 + 4 + 4 = 12. why there is another 4 bytes here? What's the layout of class B ?
The sizeof C is 12 bytes. -------------- 4 + 4 + 4 = 12. It's clear!
The sizeof D is 32 bytes -------------- it should be 16(class B) + 12(class C) + 4(int d) = 32. Is that right?
class A {
public:
int a;
virtual void v();
};
class B: virtual public A{ //virtual inheritance here
public:
int b;
virtual void w();
};
class C : virtual public A { //virtual inheritance here
public:
int c;
virtual void x();
};
class D: public B, public C {
public:
int d;
virtual void y();
};
sizeof A is 8
sizeof B is 16
sizeof C is 16
sizeof D is 28 Does it mean 28 = 16(class B) + 16(class C) - 8(class A) + 4 ( what's this? )
My question is, why is there extra space when virtual inheritance is applied?
What's the underlying rule for the object size in this case?
What's the difference when virtual is applied on all the base classes and on part of the base classes?
This is all implementation defined. I'm using VC10 Beta2. The key to understanding this stuff (the implementation of virtual functions) is a secret switch in the Visual Studio compiler, /d1reportSingleClassLayoutXXX. I'll get to that in a second.
The basic rule is the vtable needs to be located at offset 0 for any pointer to an object. This implies multiple vtables for multiple inheritance.
Couple questions here, I'll start at the top:
Does it mean that there is only one vptr even though both class B and class A have virtual functions? Why is there only one vptr?
This is how virtual functions work: you want the base class and derived class to share the same vtable pointer (pointing to the implementation in the derived class).
It seems that in this case, two vptrs are in the layout. How does this happen? I think one of the two vptrs is for class A and the other is for class B, so is there no vptr for the virtual function of class C?
This is the layout of class C, as reported by /d1reportSingleClassLayoutC:
class C size(20):
+---
| +--- (base class A)
0 | | {vfptr}
4 | | a
| +---
| +--- (base class B)
8 | | {vfptr}
12 | | b
| +---
16 | c
+---
You are correct, there are two vtables, one for each base class. This is how it works in multiple inheritance; if the C* is cast to a B*, the pointer value gets adjusted by 8 bytes. A vtable still needs to be at offset 0 for virtual function calls to work.
The vtable in the above layout for class A is treated as class C's vtable (when called through a C*).
The sizeof B is 16 bytes -------------- Without virtual it should be 4 + 4 + 4 = 12. why there is another 4 bytes here? What's the layout of class B ?
This is the layout of class B in this example:
class B size(20):
+---
0 | {vfptr}
4 | {vbptr}
8 | b
+---
+--- (virtual base A)
12 | {vfptr}
16 | a
+---
As you can see, there is an extra pointer to handle virtual inheritance. Virtual inheritance is complicated.
The sizeof D is 32 bytes -------------- it should be 16(class B) + 12(class C) + 4(int d) = 32. Is that right?
No, 36 bytes. Same deal with the virtual inheritance. Layout of D in this example:
class D size(36):
+---
| +--- (base class B)
0 | | {vfptr}
4 | | {vbptr}
8 | | b
| +---
| +--- (base class C)
| | +--- (base class A)
12 | | | {vfptr}
16 | | | a
| | +---
20 | | c
| +---
24 | d
+---
+--- (virtual base A)
28 | {vfptr}
32 | a
+---
My question is, why is there extra space when virtual inheritance is applied?
Virtual base class pointer, it's complicated. Base classes are "combined" in virtual inheritance. Instead of having a base class embedded into a class, the class will have a pointer to the base class object in the layout. If you have two base classes using virtual inheritance (the "diamond" class hierarchy), they will both point to the same virtual base class in the object, instead of having a separate copy of that base class.
What's the underlying rule for the object size in this case?
Important point; there are no rules: the compiler can do whatever it needs to do.
And a final detail; to make all these class layout diagrams I am compiling with:
cl test.cpp /d1reportSingleClassLayoutXXX
Where XXX is a substring match of the structs/classes you want to see the layout of. Using this you can explore the effects of various inheritance schemes yourself, as well as why/where padding is added, etc.
My question is, what's the rule about the number of vptrs in inheritance?
There are no rules; every compiler vendor is allowed to implement the semantics of inheritance the way they see fit.
class B: public A {}, size = 12. That's pretty normal, one vtable for B that has both virtual methods, vtable pointer + 2*int = 12
class C : public A, public B {}, size = 20. C can arbitrarily extend the vtable of either A or B. 2*vtable pointer + 3*int = 20
Virtual inheritance: that's where you really hit the edges of undocumented behavior. For example, in MSVC the #pragma vtordisp and /vd compile options become relevant. There's some background info in this article. I studied this a few times and decided the compile option acronym was representative for what could happen to my code if I ever used it.
A good way to think about it is to understand what has to be done to handle up-casts. I'll try to answer your questions by showing the memory layout of objects of the classes you describe.
Code sample #2
The memory layout is as follows:
vptr | A::a | B::b
Upcasting a pointer to B to type A will result in the same address, with the same vptr being used. This is why there's no need for additional vptr's here.
Code sample #3
vptr | A::a | vptr | B::b | C::c
As you can see, there are two vptr's here, just like you guessed. Why? Because it's true that if we upcast from C to A we don't need to modify the address, and thus can use the same vptr. But if we upcast from C to B we do need that modification, and correspondingly we need a vptr at the start of the resulting object.
So, any inherited class beyond the first will require an additional vptr (unless that inherited class has no virtual methods, in which case it has no vptr).
Code sample #4 and beyond
When you derive virtually, you need a new pointer, called a base pointer, to point to the location in the memory layout of the derived classes. There can be more than one base pointer, of course.
So how does the memory layout look? That depends on the compiler. In your compiler it's probably something like
vptr | base pointer | B::b | vptr | A::a | C::c | vptr | A::a
\-----------------------------------------^
But other compilers may incorporate base pointers in the virtual table (by using offsets - that deserves another question).
You need a base pointer because when you derive in a virtual fashion, the virtual base class will appear only once in the memory layout (it may appear additional times if it's also derived from normally, as in your example), so all the classes that derive from it virtually must point to the exact same location.
EDIT: clarification - it all really depends on the compiler, the memory layout I showed can be different in different compilers.
All of this is completely implementation defined you realize. You can't count on any of it. There is no 'rule'.
In the inheritance example, here is how the virtual table for classes A and B might look:
class A
+-----------------+
| pointer to A::v |
+-----------------+
class B
+-----------------+
| pointer to A::v |
+-----------------+
| pointer to B::w |
+-----------------+
As you can see, if you have a pointer to class B's virtual table, it is also perfectly valid as class A's virtual table.
In your class C example, if you think about it, there is no way to make a single virtual table that is valid as a table for class C, class A, and class B at the same time. So the compiler makes two. One virtual table is valid for class A and C (most likely) and the other is valid for class B.
This obviously depends on the compiler implementation.
Anyway I think that I can sum up the following rules from the implementation given by a classic paper linked below and which gives the number of bytes you get in your examples (except for class D which would be 36 bytes and not 32!!!):
The size of an object of class T is:
The size of its fields PLUS the sum of the size of every object from which T inherits PLUS 4 bytes for every object from which T virtually inherits PLUS 4 bytes ONLY IF T needs ANOTHER v-table
Pay attention: if a class K is virtually inherited multiple times (at any level) you have to add the size of K only once
So we have to answer another question: When does a class need ANOTHER v-table?
A class that does not inherit from other classes needs a v-table only if it has one or more virtual methods
OTHERWISE, a class needs another v-table ONLY IF NONE of the classes from which it non virtually inherits does have a v-table
The End of the rules (which I think can be applied to match what Terry Mahaffey has explained in his answer) :)
Anyway my suggestion is to read the following paper by Bjarne Stroustrup (the creator of C++) which explains exactly these things: how many virtual tables are needed with virtual or non virtual inheritance... and why!
It's really a good reading:
http://www.hpc.unimelb.edu.au/nec/g1af05e/chap5.html
I am not sure, but I think it is because of the pointer to the virtual method table.