Multiple Inheritance : size of class for virtual pointers? - c++

Given the code:
#include <iostream>
using namespace std;
class A{};
class B : public virtual A{};
class C : public virtual A{};
class D : public B,public C{};
int main(){
cout<<"sizeof(D) "<<sizeof(D);
return 0;
}
Output:
sizeof(D) 8
Every class contains only its own virtual pointer, not those of any of its base classes.
So why is sizeof(D) equal to 8?

It depends on the compiler implementation. My compiler is Visual Studio C++ 2005.
Code like this:
int main(){
cout<<"sizeof(B):"<<sizeof(B) << endl;
cout<<"sizeof(C):"<<sizeof(C) << endl;
cout<<"sizeof(D):"<<sizeof(D) << endl;
return 0;
}
It will output
sizeof(B):4
sizeof(C):4
sizeof(D):8
Class B has only one virtual pointer, so sizeof(B) is 4. The same holds for class C.
But D inherits from both class B and class C. The compiler doesn't merge the two virtual tables, so class D has two virtual pointers, one pointing to each virtual table.
If D inherited from only one class, and not virtually, the compiler would merge their virtual tables.

It depends on the compiler implementation, so you should specify which compiler you're using. Anyway, D derives from two classes, so it contains one pointer for B and one for C: vtable pointers or base class pointers (I don't know a good name for this).
To test this you can declare a pointer to B and a pointer to C, and convert the address of a D object to each base class pointer. Dump those values and you'll see they're different!
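The pointer test described above can be sketched roughly as follows. The helper names (b_and_c_differ, same_virtual_base, dump_addresses) are made up for this illustration, and the addresses you get depend on the compiler and ABI.

```cpp
#include <cassert>
#include <iostream>

class A {};
class B : public virtual A {};
class C : public virtual A {};
class D : public B, public C {};

// True when the B and C subobjects of d start at different addresses.
bool b_and_c_differ(D& d) {
    const void* as_b = static_cast<B*>(&d);
    const void* as_c = static_cast<C*>(&d);
    return as_b != as_c;
}

// True when both paths lead to the same, single virtual base A.
bool same_virtual_base(D& d) {
    A* via_b = static_cast<B*>(&d);
    A* via_c = static_cast<C*>(&d);
    return via_b == via_c;
}

// Prints the addresses so that the layout can be inspected by eye.
void dump_addresses() {
    D d;
    assert(same_virtual_base(d));  // virtual inheritance: only one A
    std::cout << "D at " << static_cast<void*>(&d) << '\n'
              << "B at " << static_cast<void*>(static_cast<B*>(&d)) << '\n'
              << "C at " << static_cast<void*>(static_cast<C*>(&d)) << '\n';
}
```

Calling dump_addresses() from main prints the three addresses. On the compilers discussed here the B and C addresses should differ by the size of one pointer, but that is an implementation detail.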
EDIT
Test made with Visual C++ 10.0, 32 bit.
class Base
{
};
class Derived1 : public virtual Base
{
};
class Derived2 : public virtual Base
{
};
class Derived3 : public virtual Base
{
};
class ReallyDerived1 : public Derived1, public Derived2, public Derived3
{
};
class ReallyDerived2 : public Derived1, public Derived2
{
};
class ReallyDerived3 : public Derived2
{
};
void _tmain(int argc, _TCHAR* argv[])
{
std::cout << "Base: " << sizeof(Base) << std::endl;
std::cout << "Derived1: " << sizeof(Derived1) << std::endl;
std::cout << "ReallyDerived1: " << sizeof(ReallyDerived1) << std::endl;
std::cout << "ReallyDerived2: " << sizeof(ReallyDerived2) << std::endl;
std::cout << "ReallyDerived3: " << sizeof(ReallyDerived3) << std::endl;
}
The output, I guess, is not surprising:
Base: 1 byte (OK, this is a surprise, at least for me).
Derived1: 4 bytes
ReallyDerived1: 12 bytes (4 bytes per base class because of multiple inheritance)
ReallyDerived2: 8 bytes (as guessed)
ReallyDerived3: 4 bytes (just one base class with virtual inheritance in the path, but this one is non-virtual).
Adding a virtual method to the base gives you 4 more bytes for each class. So the extra bytes probably aren't vtable pointers but base class pointers used in multiple inheritance. This behavior does not change when virtual inheritance is removed (but if the inheritance is not virtual, the size doesn't change when more bases are added).

First: without virtual functions, it's probable that there isn't a
vptr at all in the classes. The 8 bytes you're seeing are an artifact
of the way virtual inheritance is implemented.
It's often possible for several classes in a hierarchy to share the same
vptr. For this to occur, it is necessary for their offset in the
final class to be the same, and for the list of vtable entries in the
base class to be an initial sequence of the list of vtable entries in the
derived class.
Both conditions are met in almost all implementations for single
inheritance. No matter how deep the inheritance, there will usually be
only one vptr, shared between all of the classes.
In the case of multiple inheritance, there will always be at least one
class for which these requirements aren't met, since the two base
classes can't have a common start address, and unless they have exactly
the same virtual functions, only one's vtable could possibly be an
initial sequence of the other.
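A small size comparison makes the sharing visible. This is only a sketch: the struct names and the slots helper are invented for the example, and the numbers in the note below are what mainstream ABIs produce, not something the standard promises.

```cpp
#include <cstddef>

// Single inheritance: every level adds a virtual function but no data.
struct S0 { virtual ~S0() {} virtual void f0() {} };
struct S1 : S0 { virtual void f1() {} };
struct S2 : S1 { virtual void f2() {} };

// Multiple inheritance from two unrelated polymorphic bases.
struct P { virtual ~P() {} virtual void p() {} };
struct Q { virtual ~Q() {} virtual void q() {} };
struct M : P, Q { virtual void m() {} };

// Object size expressed in pointer-sized slots.
template <typename T>
constexpr std::size_t slots() { return sizeof(T) / sizeof(void*); }
```

With the usual implementations slots<S2>() is 1, because the whole chain shares one vptr, while slots<M>() is 2, because P and Q cannot both start at the same address.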
Virtual inheritance adds another quirk, since the position of the
virtual base relative to the class inheriting from it will vary
depending on the rest of the hierarchy. Most implementations I've seen
use a separate pointer for this, although it should be possible to put
this information in the vtable as well.
If we take your hierarchy, adding virtual functions so that we are
certain of having a vptr, we notice that B and D can still share a
vtable, but both A and C need separate vtables. This means that
if your classes had virtual functions, you would need at least three
vptrs. (From this I conclude that your implementation is using
separate pointers to the virtual base. With B and D sharing the
same pointer, and C with its own pointer. And of course, A doesn't
have a virtual base, and doesn't need a pointer to itself.)
If you're trying to analyse exactly what is going on, I'd suggest adding
a new virtual function in each class, and adding a pointer sized
integral type that you initialize with a different known value for each
class. (Use constructors to set the value.) Then create an instance of
the class, take its address, then output the address for each base
class. And then dump the class: the known fixed values will help in
identifying where the different elements lie. Something like:
struct VB
{
int vb;
VB() : vb( 1 ) {}
virtual ~VB() {}
virtual void fvb() {}
};
struct Left : virtual VB
{
int left;
Left() : left( 2 ) {}
virtual ~Left() {}
virtual void fvb() {}
virtual void fleft() {}
};
struct Right : virtual VB
{
int right;
Right() : right( 3 ) {}
virtual ~Right() {}
virtual void fvb() {}
virtual void fright() {}
};
struct Derived : Left, Right
{
int derived;
Derived() : derived( 5 ) {}
virtual ~Derived() {}
virtual void fvb() {}
virtual void fleft() {}
virtual void fright() {}
virtual void fderived() {}
};
You might want to add a Derived2, which derives from Derived and see
what happens to the relative addresses between e.g. Left and VB
depending on whether the object has type Derived or Derived2.
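Here is a sketch of the kind of measurement meant above, using a trimmed-down version of the hierarchy so that it fits in one block. The helpers byte_distance, vb_offset_from_left and dump_offsets are hypothetical names, and the offsets printed are whatever your compiler chooses.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>

struct VB { int vb; VB() : vb(1) {} virtual ~VB() {} };
struct Left : virtual VB { int left; Left() : left(2) {} virtual void fleft() {} };
struct Right : virtual VB { int right; Right() : right(3) {} virtual void fright() {} };
struct Derived : Left, Right { int derived; Derived() : derived(5) {} };

// Byte distance between two addresses inside the same complete object.
std::ptrdiff_t byte_distance(const void* from, const void* to) {
    return static_cast<const char*>(to) - static_cast<const char*>(from);
}

// Offset of the VB subobject relative to a Left subobject. The conversion
// to VB* is resolved at run time because VB is a virtual base.
std::ptrdiff_t vb_offset_from_left(Left& l) {
    return byte_distance(&l, static_cast<VB*>(&l));
}

void dump_offsets() {
    Left l;
    Derived d;
    // Both paths must reach the one shared VB subobject.
    assert(static_cast<VB*>(static_cast<Left*>(&d)) ==
           static_cast<VB*>(static_cast<Right*>(&d)));
    std::cout << "VB relative to Left, complete object Left:    "
              << vb_offset_from_left(l) << '\n'
              << "VB relative to Left, complete object Derived: "
              << vb_offset_from_left(d) << '\n';
}
```

If the two printed offsets differ, as they do with the layouts I have looked at, that is the reason code working on a Left cannot hard-wire the position of VB.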

You are making far too many assumptions. This is highly dependent on the ABI, so you should look into the documentation for your platform (my guess is that you are running on a 32bit platform).
The first thing is that there are no virtual functions in your example, and that means that none of the types actually contains a pointer to the virtual table. So where did those 2 pointers come from? (I am assuming you are on a 32bit architecture). Well, virtual inheritance is the answer. When you inherit virtually, the relative location of the virtual base (A) with respect to the extra elements in the derived type (B,C) will change along the inheritance chain. In the case of a B or C object the compiler can lay the types as [A,B'] and [A,C'] (where X' is the extra fields of X not present in A).
Now virtual inheritance means that there will only be one A subobject in the case of D, so the compiler can lay out the D type as [A,B',C',D] or [A,C',B',D] (or any other combination, A might be at the end of the object, etc; this is defined in the ABI). What this implies is that member functions of B and C cannot assume where the A subobject might be (in the case of non-virtual inheritance, the relative location is known), because the complete type might actually be some other type down the chain.
The solution to the problem is that both B and C usually contain an extra pointer-to-base pointer, similar but not equivalent to the virtual pointer. In the same way that the vptr is used to dynamically dispatch to a function, this extra pointer is used to dynamically find the base.
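To see the cost of that extra pointer in isolation, one can compare an empty class derived normally with the same class derived virtually. A minimal sketch, assuming nothing beyond standard C++; the struct names and print_sizes are made up, and the actual sizes are implementation-defined.

```cpp
#include <cstddef>
#include <iostream>

struct A {};
struct PlainB : A {};             // ordinary inheritance, location of A is fixed
struct VirtualB : virtual A {};   // must be able to find A at run time
struct VirtualC : virtual A {};
struct D : VirtualB, VirtualC {}; // one shared A, reached from two subobjects

// Prints the sizes for whatever compiler and target this is built with.
void print_sizes() {
    std::cout << "PlainB:   " << sizeof(PlainB) << '\n'
              << "VirtualB: " << sizeof(VirtualB) << '\n'
              << "D:        " << sizeof(D) << '\n';
}
```

On a typical 32-bit target this gives 1, 4 and 8, which matches the 8 reported in the question.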
If you are interested in all these details, I recommend that you read the Itanium ABI, which is widely used not only on Itanium but also on other Intel 64 architectures (and a modified version on 32-bit architectures) by different compilers.

Related

Itanium C++ ABI primary virtual bases

I was reading here about how primary bases are chosen:
"...2. If C is a dynamic class type:
a. Identify all virtual base classes, direct or indirect, that are primary base classes for some other direct or indirect base class. Call these indirect primary base classes.
b. If C has a dynamic base class, attempt to choose a primary base class B. It is the first (in direct base class order) non-virtual dynamic base class, if one exists. Otherwise, it is a nearly empty virtual base class, the first one in (preorder) inheritance graph order which is not an indirect primary base class if any exist, or just the first one if they are all indirect primaries..."
And after there is this correction:
"Case (2b) above is now considered to be an error in the design. The use of the first indirect primary base class as the derived class' primary base does not save any space in the object, and will cause some duplication of virtual function pointers in the additional copy of the base classes virtual table.
The benefit is that using the derived class virtual pointer as the base class virtual pointer will often save a load, and no adjustment to the this pointer will be required for calls to its virtual functions.
It was thought that 2b would allow the compiler to avoid adjusting this in some cases, but this was incorrect, as the virtual function call algorithm requires that the function be looked up through a pointer to a class that defines the function, not one that just inherits it. Removing that requirement would not be a good idea, as there would then no longer be a way to emit all thunks with the functions they jump to. For instance, consider this example:
struct A { virtual void f(); };
struct B : virtual public A { int i; };
struct C : virtual public A { int j; };
struct D : public B, public C {};
When B and C are declared, A is a primary base in each case, so although vcall offsets are allocated in the A-in-B and A-in-C vtables, no this adjustment is required and no thunk is generated. However, inside D objects, A is no longer a primary base of C, so if we allowed calls to C::f() to use the copy of A's vtable in the C subobject, we would need to adjust this from C* to B::A*, which would require a third-party thunk. Since we require that a call to C::f() first convert to A*, C-in-D's copy of A's vtable is never referenced, so this is not necessary."
Could you please explain with an example what this refers to: "Removing that requirement would not be a good idea, as there would then no longer be a way to emit all thunks with the functions they jump to"?
Also, what are third-party thunks?
I do not understand either what the quoted example tries to show.
A is a nearly empty class, one that contains only a vptr and no visible data members:
struct A { virtual void f(); };
The layout of A is:
A_vtable *vptr
B has a single nearly empty base class used as a "primary":
struct B : virtual public A { int i; };
It means that the layout of B begins with the layout of an A, so that a pointer to a B is a pointer to an A (in assembly language). Layout of B subobject:
B_vtable *A_vptr
int i
A_vptr will point to a B vtable obviously, which is binary compatible with A vtable.
The B_vtable extends the A_vtable, adding all necessary information to navigate to the virtual base class A.
Layout of B complete object:
A base_subobject
int i
And same for C:
C_vtable *A_vptr
int j
Layout of C complete object:
A base_subobject
int j
In D obviously there is only an A subobject, so the layout of a complete object is:
A base_subobject
int i
not(A) not(base_subobject) aka (C::A)_vptr
int j
not(A) is the representation of an A nearly empty base class, that is, a vptr for A, but not a true A subobject: it looks like an A but the visible A is two words above. It's a ghost A!
(C::A)_vptr is the vptr to a vtable with the layout of a vtable for C (so also one with the layout of a vtable for A), but for a C subobject where A is ultimately not a primary base: the C subobject has lost the privilege of hosting the A base class. So obviously the virtual calls through (C::A)_vptr to virtual functions defined in A (there is only one: A::f()) need a this adjustment, with a thunk "C::A::f()" that receives a pointer to not(base_subobject) and adjusts it to the real base_subobject of type A, which is two words above in the example. (Or, if there is an overrider in D, to the D object, which is at the exact same address, two words above in the example.)
So given these definitions:
struct A { virtual void f(); };
struct B : virtual public A { int i; };
struct C : virtual public A { int j; };
struct D : public B, public C {};
should the use of a ghost lvalue of a nonexistent A primary base work?
D d;
C *volatile cp = &d;
A *volatile ghost_ap = reinterpret_cast<A*> (cp);
ghost_ap->f(); // use the vptr of C::A: safe?
(volatile used here to avoid propagation of type knowledge by the compiler)
If, for lvalues of type C, a call to a virtual function inherited from A is made via the C vptr (which is also a C::A vptr, because A is a "static" primary base of C), then the code should work, because a thunk has been generated that goes from C to D.
In practice it doesn't seem to work with GCC, but if you add an overrider in C:
struct C : virtual public A {
int j;
virtual void f()
{
std::cout << "C:f() \n";
}
};
it works because such a concrete function is in the vtable of C::A.
Even with just a pure virtual overrider:
struct C : virtual public A {
int j;
virtual void f() = 0;
};
and a concrete overrider in D, it also works: the pure virtual override is enough to have the proper entry in the vtable of C::A.
Test code: http://codepad.org/AzmN2Xeh
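The visible effect of a this adjustment can be checked without any reinterpret_cast. The sketch below is not the linked test code; it is a separate, well-defined illustration in which D overrides a virtual function (self, a name invented here) that simply returns this. How the adjustment is carried out (thunk, vcall offset, ...) stays hidden inside the ABI.

```cpp
#include <cassert>

struct A {
    virtual ~A() {}
    virtual const void* self() const { return this; }
};
struct B : virtual A { int i; B() : i(0) {} };
struct C : virtual A { int j; C() : j(0) {} };
struct D : B, C {
    // Final overrider: returns the address of the D object it runs on.
    virtual const void* self() const { return this; }
};

// Virtual call through a pointer to the C subobject.
const void* self_via_c(const C* c) { return c->self(); }

// Virtual call through a pointer to the shared A subobject.
const void* self_via_a(const A* a) { return a->self(); }

void check_this_adjustment() {
    D d;
    const C* cp = &d;
    // Whatever address cp holds, D::self() must see the address of d.
    assert(self_via_c(cp) == static_cast<const void*>(&d));
    assert(self_via_a(cp) == static_cast<const void*>(&d));
}
```

With the layouts described above, cp does not hold the same address as &d, yet the override still receives the address of d: the pointer was adjusted somewhere along the call.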

where is the overridden virtual method saved in the vtable c++ in multiple inheritance

In C++, there is no class representation at run-time, but I can always call an overridden virtual method in the derived class. Where is that overridden method saved in the vtable? Here's a piece of code to demonstrate:
struct B1 {
virtual void f() { ... }
};
struct B2 {
virtual void f() { ... }
virtual void g() { ... }
};
struct D : B1, B2 {
void f() { ... }
virtual void h() { ... }
};
What's the memory layout for an object of class D ? Where are B1::f and B2::f saved in that memory layout (if they're saved at all) ?
An object d of Class D will have only a pointer to the VMT of class D, which will contain a pointer to D::f.
Since B1::f and B2::f can be called only statically from the scope of class D, there is no need for object d to keep a dynamic pointer to those overridden methods.
This of course is not defined in the standard; it is just the usual/logical implementation in a compiler.
In fact the picture is more complicated, since the VMT of class D incorporates the VMTs of classes B1 and B2. But anyway, there is no need to dynamically call B1::f until an object of class B1 is created.
When the compiler uses the vtable method of virtual dispatch*, the address of the overridden member function is stored in the vtable of the base class in which the function is defined.
Each class has access to the vtables of all of its base classes. These vtables are stored outside of the memory layout for the class itself. Each class with virtual member functions, declared or inherited, has a single pointer to its own vtable. When you call an overridden member function, you supply the name of the base class whose member function you wish to call. The compiler knows about the vtables of all classes, so it knows how to locate the vtable of your base class, does the lookup at compile time, and calls the member function directly.
Here is a short example:
struct A {
virtual void foo() { cout << "A"; }
};
struct B : public A { }; // No overrides
struct C : public B {
virtual void foo() { cout << "C"; }
void bar() { B::foo(); }
};
Demo.
In the example above the compiler needs to look up B::foo, which is not defined in class B. The compiler consults its symbol table to find out that B::foo is implemented in A, and generates the call to A::foo inside C::bar.
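The same hierarchy can be turned into a small self-checking sketch by returning a character instead of printing it. The functions call_virtually and check_calls are additions for the illustration only.

```cpp
#include <cassert>

struct A { virtual ~A() {} virtual char foo() { return 'A'; } };
struct B : A {};                      // no override
struct C : B {
    virtual char foo() { return 'C'; }
    char bar() { return B::foo(); }   // qualified: bound to A::foo at compile time
};

// Unqualified call through a base reference: dispatched at run time.
char call_virtually(A& a) { return a.foo(); }

void check_calls() {
    C c;
    assert(c.bar() == 'A');           // B::foo names the function inherited from A
    assert(call_virtually(c) == 'C'); // the vtable of C is used here
}
```

The qualified call never consults the vtable of the object, which is why no per-object pointer to the overridden function is needed.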
* Vtables are not the only method of implementing virtual dispatch. The C++ standard does not require vtables to be used.
Although nothing is mandated in the C++ standard, every known C++ implementation uses the same approach: every class with at least a virtual function has a vptr (pointer to vtable).
You didn't mention virtual inheritance which is a different, more subtle inheritance relation; non-virtual inheritance is a simple exclusive relation between a base class subobject and a derived class. I will assume all inheritance relations are not virtual in this answer.
Here I assume we derive from classes with at least a virtual function.
In case of single inheritance, the vptr from the base class is reused. (Not reusing it just wastes space and run time.) The base class is called "primary base class".
In case of multiple inheritance, the layout of the derived class contains the layout of every base class, just like the layout of a struct in C contains the layout of every member. The layout of D is B1 then B2 (in any order actually, but the source code order is usually kept).
The first class is the primary base class: in D the vptr from B1 points to a complete vtable for D, the vtable with all the virtual functions of D. Each vptr from a non-primary base class points to a secondary vtable of D: a vtable with only the virtual functions from this secondary base class.
The constructor of D must initialize every vptr of the class instance to point to the appropriate vtable of D.
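For the hierarchy in the question, the observable consequences look like this. It is a sketch with the elided function bodies replaced by string results so that the behaviour can be checked; f_through_b1, f_through_b2 and check_dispatch are names invented here.

```cpp
#include <cassert>
#include <string>

struct B1 {
    virtual ~B1() {}
    virtual std::string f() { return "B1::f"; }
};
struct B2 {
    virtual ~B2() {}
    virtual std::string f() { return "B2::f"; }
    virtual std::string g() { return "B2::g"; }
};
struct D : B1, B2 {
    std::string f() { return "D::f"; }          // overrides B1::f and B2::f
    virtual std::string h() { return "D::h"; }
};

// Virtual calls through each base: both vptrs lead to D::f.
std::string f_through_b1(B1& b) { return b.f(); }
std::string f_through_b2(B2& b) { return b.f(); }

void check_dispatch() {
    D d;
    assert(f_through_b1(d) == "D::f");
    assert(f_through_b2(d) == "D::f");
    // The overridden versions remain reachable by qualified, static calls.
    assert(d.B1::f() == "B1::f");
    assert(d.B2::f() == "B2::f");
}
```

So the object only needs entries that lead to D::f; B1::f and B2::f are ordinary functions that the compiler can call by name.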

Why does the virtual keyword increase the size of a derived class?

I have two classes - one base class and one derived from it :
#include <iostream>
using namespace std;
class base {
int i ;
public :
virtual ~ base () { }
};
class derived : virtual public base { int j ; };
int main()
{ cout << sizeof ( derived ) ; }
Here the answer is 16. But if I do instead a non-virtual public inheritance or make the base class non-polymorphic , then I get the answer as 12, i.e. if I do :
class base {
int i ;
public :
virtual ~ base () { }
};
class derived : public base { int j ; };
int main()
{ cout << sizeof ( derived ) ; }
OR
class base {
int i ;
public :
~ base () { }
};
class derived : virtual public base { int j ; };
int main()
{ cout << sizeof ( derived ) ; }
In both the cases answer is 12.
Can someone please explain why there is a difference in the size of the derived class in 1st and the other 2 cases ?
( I work on code::blocks 10.05, if someone really need this )
There are two separate things here that cause extra overhead.
Firstly, having virtual functions in the base class increases its size by a pointer size (4 bytes in this case), because it needs to store the pointer to the virtual method table:
normal inheritance with virtual functions:
0 4 8 12
| base |
| vfptr | i | j |
Secondly, in virtual inheritance extra information is needed in derived to be able to locate base. In normal inheritance the offset between derived and base is a compile time constant (0 for single inheritance). In virtual inheritance the offset can depend on the runtime type and actual type hierarchy of the object. Implementations may vary, but for example Visual C++ does it something like this:
virtual inheritance with virtual functions:
0 4 8 12 16
| base |
| xxx | j | vfptr | i |
Where xxx is a pointer to some type information record, that allows to determine the offset to base.
And of course it's possible to have virtual inheritance without virtual functions:
virtual inheritance without virtual functions:
0 4 8 12
| base |
| xxx | j | i |
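The three cases from the question can be put side by side in one translation unit. A rough sketch; the struct names and print_sizes are invented for the comparison, and the absolute numbers depend on the compiler and on pointer size.

```cpp
#include <cstddef>
#include <iostream>

struct PolyBase { int i; virtual ~PolyBase() {} };  // has a vfptr
struct PlainBase { int i; };                        // no virtual functions

struct VirtualFromPoly : virtual PolyBase { int j; };    // case 1 in the question
struct NormalFromPoly : PolyBase { int j; };             // case 2
struct VirtualFromPlain : virtual PlainBase { int j; };  // case 3

// Prints the sizes for whatever compiler and target this is built with.
void print_sizes() {
    std::cout << "virtual base, virtual functions:    " << sizeof(VirtualFromPoly) << '\n'
              << "normal base, virtual functions:     " << sizeof(NormalFromPoly) << '\n'
              << "virtual base, no virtual functions: " << sizeof(VirtualFromPlain) << '\n';
}
```

On a 32-bit target like the one in the question this should print 16, 12 and 12. A 64-bit build gives larger numbers, but the first case still comes out as the biggest.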
If a class has any virtual function, objects of this class need to have a vptr, that is a pointer to the vtable, that is the virtual table from where the address of the correct virtual function can be found. The function called depends on the dynamic type of the object, that it is the most derived class the object is a base subobject of.
Because the derived class inherits virtually from a base class, the location of the base class relative to the derived class is not fixed, it depends on the dynamic type of the object too. With gcc a class with virtual base classes needs a vptr to locate the base classes (even if there is no virtual function).
Also, the base class contains a data member, which is located just after the base class vptr. Base class memory layout is: { vptr, int }
If a base class needs vptr, a class derived from it will need a vptr too, but often the "first" vptr of a base class subobject is reused (this base class with the reused vptr is called the primary base). However this is not possible in this case, because the derived class needs a vptr not only to determine how to call the virtual function, but also where the virtual base is. The derived class cannot locate its virtual base class without using the vptr; if the virtual base class was used as a primary base, the derived class would need to locate its primary base to read the vptr, and would need to read the vptr to locate its primary base.
So the derived cannot have a primary base, and it introduces its own vptr.
The layout of a base class subobject of type derived is thus: { vptr, int } with the vptr pointing to a vtable for derived, containing not only the address of virtual functions, but also the relative location of all its virtual base classes (here just base), represented as an offset.
The layout of a complete object of type derived is: { base class subobject of type derived, base }
So the minimum possible size of derived is (2 int + 2 vptr) or 4 words on common ptr = int = word architectures, or 16 bytes in this case. (And Visual C++ makes bigger objects (when virtual base classes are involved), I believe a derived would have one more pointer.)
So yes, virtual functions have a cost, and virtual inheritance has a cost. The memory cost of virtual inheritance in this case is one more pointer per object.
In designs with many virtual base classes, the memory cost per object might be proportional to the number of virtual base classes, or not; we would need to discuss specific class hierarchies to estimate the cost.
In designs without multiple inheritance or virtual base classes (or even virtual functions), you might have to emulate many things automatically done by the compiler for you, with a bunch of pointers, possibly pointers to functions, possibly offsets... this could get confusing and error prone.
The point of virtual inheritance is to allow sharing of base classes. Here's the problem:
struct base { int member; virtual void method() {} };
struct derived0 : base { int d0; };
struct derived1 : base { int d1; };
struct join : derived0, derived1 {};
join j;
j.method();
j.member;
(base *)&j;
dynamic_cast<base *>(&j);
The last 4 lines are all ambiguous. You have to say explicitly whether you want the base inside derived0, or the base inside derived1.
If you change the second and third line as follows, the problem goes away:
struct derived0 : virtual base { int d0; };
struct derived1 : virtual base { int d1; };
Your j object now only has one copy of base, not two, so the last 4 lines stop being ambiguous.
But think about how that has to be implemented. Normally, in a derived0, the d0 comes right after the member, and in a derived1, the d1 comes right after the member. But with virtual inheritance, they both share the same member, so you can't have both d0 and d1 come right after it. So you're going to need some form of extra indirection. That's where the extra pointer comes from.
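Both variants can be compared directly. In this sketch the plain_* and shared_* names are hypothetical stand-ins for the two versions of derived0, derived1 and join, and has_single_base is a helper written for the illustration.

```cpp
#include <cassert>

struct base { int member; base() : member(0) {} virtual ~base() {} };

// Non-virtual inheritance: the join contains two base subobjects.
struct plain0 : base { int d0; plain0() : d0(0) {} };
struct plain1 : base { int d1; plain1() : d1(0) {} };
struct plain_join : plain0, plain1 {};

// Virtual inheritance: the join contains a single, shared base.
struct shared0 : virtual base { int d0; shared0() : d0(0) {} };
struct shared1 : virtual base { int d1; shared1() : d1(0) {} };
struct shared_join : shared0, shared1 {};

bool has_single_base(plain_join& j) {
    base* via0 = static_cast<plain0*>(&j);
    base* via1 = static_cast<plain1*>(&j);
    return via0 == via1;
}

bool has_single_base(shared_join& j) {
    base* via0 = static_cast<shared0*>(&j);
    base* via1 = static_cast<shared1*>(&j);
    return via0 == via1;
}

void check_joins() {
    plain_join pj;
    shared_join sj;
    assert(!has_single_base(pj));  // two distinct base subobjects
    assert(has_single_base(sj));   // one shared base subobject
    sj.member = 7;                 // unambiguous with virtual inheritance
    base* b = &sj;                 // unambiguous conversion as well
    assert(b->member == 7);
}
```

The last three statements in check_joins would not compile for plain_join without saying which path to take.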
If you want to know exactly what the layout is, it depends on your target platform and compiler. Just "gcc" isn't enough. But for many modern non-Windows targets, the answer is defined by the Itanium C++ ABI, which is documented at http://mentorembedded.github.com/cxx-abi/abi.html#vtable.
What's going on is the extra overhead used to mark a class as having virtual members or involving virtual inheritance. How much extra depends on the compiler.
A word of caution: Making a class derive from a class for which the destructor is not virtual is usually asking for trouble. Big trouble.
Possibly extra 4 bytes are needed to mark class type at runtime.
For example:
#include <iostream>
using namespace std;
class A {
public:
virtual int f() { return 2; }
};
class B : virtual public A {
public:
virtual int f() { return 3; }
};
int call_function( A *a) {
// here we don't know what a really is (A or B)
// because of this to call correct method
// we need some runtime knowledge of type and storage space to put it in (extra 4 bytes).
return a->f();
}
int main() {
B b;
A *a = &b;
cout << call_function(a);
}
The extra size is due to the vtable pointer that is "invisibly" added to your class so that the right member function can be found for a specific object of this class or its descendant/ancestor.
If that isn't clear, you'll need to do much more reading about virtual inheritance in C++.

Ambiguity in multiple inheritance of interfaces in C++

I made a test code as following:
#include <iostream>
using namespace std;
#ifndef interface
#define interface struct
#endif
interface Base
{
virtual void funcBase() = 0;
};
interface Derived1 : public Base
{
virtual void funcDerived1() = 0;
};
interface Derived2 : public Base
{
virtual void funcDerived2() = 0;
};
interface DDerived : public Derived1, public Derived2
{
virtual void funcDDerived() = 0;
};
class Implementation : public DDerived
{
public:
void funcBase() { cout << "base" << endl; }
void funcDerived1() { cout << "derived1" << endl; }
void funcDerived2() { cout << "derived2" << endl; }
void funcDDerived() { cout << "dderived" << endl; }
};
int main()
{
DDerived *pObject = new Implementation;
pObject->funcBase();
return 0;
}
The reason I wrote this code is to test whether the function funcBase() can be called on an instance of DDerived or not. My C++ compiler (Visual Studio 2010) gave me a compile error message when I tried to compile this code. In my opinion, there is no problem in this code because it is certain that the function funcBase() will be implemented (thus overridden) in some derived class of the interface DDerived, because it is pure virtual. In other words, any pointer variable of type Implementation * should be associated with an instance of a class deriving from Implementation and overriding the function funcBase().
My question is, why does the compiler give me such an error message? Why is the C++ syntax defined like that, i.e., to treat this case as an error? How can I make the code run? I want to allow multiple inheritance of interfaces. Of course, if I use "virtual public" or re-declare the function funcBase() in DDerived like
interface DDerived : public Derived1, public Derived2
{
virtual void funcBase() = 0;
virtual void funcDDerived() = 0;
};
then everything runs with no problem.
But I don't want to do that and am looking for a more convenient method, because virtual inheritance may degrade the performance, and re-declaration is tedious if the inheritance relations of the classes are very complex. Are there any methods to enable multiple inheritance of interfaces in C++ other than using virtual inheritance?
As you've defined it, your object structure looks like this:
The important point here is that each instance of Implementation contains two entirely separate instances of Base. You're providing an override of Base::funcBase, but it doesn't know whether you're trying to override funcBase for the Base you inherited through Derived1, or the Base you inherited through Derived2.
Yes, the clean way to deal with this is virtual inheritance. This will change your structure so there's only one instance of Base:
This is almost undoubtedly what you really want. Yes, it got a reputation for performance problems in the days of primitive compilers and 25 MHz 486's and such. With a modern compiler and processor, you're unlikely to encounter a problem.
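For reference, the virtual-inheritance version of the interfaces might look like the sketch below. To keep it checkable, the functions return strings instead of printing, and the interface macro is replaced by plain struct; call_func_base and check_virtual_diamond are names made up for the example.

```cpp
#include <cassert>
#include <string>

struct Base { virtual ~Base() {} virtual std::string funcBase() = 0; };
struct Derived1 : virtual Base { virtual std::string funcDerived1() = 0; };
struct Derived2 : virtual Base { virtual std::string funcDerived2() = 0; };
struct DDerived : Derived1, Derived2 { virtual std::string funcDDerived() = 0; };

class Implementation : public DDerived {
public:
    std::string funcBase()     { return "base"; }
    std::string funcDerived1() { return "derived1"; }
    std::string funcDerived2() { return "derived2"; }
    std::string funcDDerived() { return "dderived"; }
};

// With a single Base subobject this call is no longer ambiguous.
std::string call_func_base(DDerived* p) { return p->funcBase(); }

void check_virtual_diamond() {
    Implementation impl;
    assert(call_func_base(&impl) == "base");
}
```

The only change compared with the question is the word virtual in the two base-specifiers of Derived1 and Derived2.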
Another possibility would be some sort of template-based alternative, but that tends to pervade the rest of your code -- i.e., instead of passing a Base *, you write a template that will work with anything that provides functions A, B, and C, then pass (the equivalent of) Implementation as a template parameter.
The C++ language is designed in such a way that in your first approach without virtual inheritance there will be two parent copies of the method and it can't figure out which one to call.
Virtual inheritance is the C++ solution to inheriting the same function from multiple bases, so I would suggest just using that approach.
Alternately have you considered just not inheriting the same function from multiple bases? Do you really have a derived class that you need to be able to treat as Derived1 or Derived2 OR Base depending on the context?
In this case elaborating on a concrete problem rather than a contrived example may help provide a better design.
DDerived *pObject = new Implementation;
pObject->funcBase();
This creates a pointer of type DDerived to an Implementation. When you are using DDerived you really just have a pointer to an interface.
DDerived does not know about the implementation of funcBase because of the ambiguity of having funcBase being defined in both Derived1 and Derived2.
This has created an inheritance diamond, which is what is really causing the problem.
http://en.wikipedia.org/wiki/Diamond_problem
I also had to check on the interface "keyword" you have in there:
it's an MS-specific extension that's recognised by Visual Studio.
I think C++ Standard 10.1.4 - 10.1.5 can help you to understand the problem in your code.
class L { public: int next; /* ... */ };
class A : public L { /* ... */ };
class B : public L { /* ... */ };
class C : public A, public B { void f(); /* ... */ };
10.1.4 A base class specifier that does not contain the keyword virtual,
specifies a non-virtual base class. A base class specifier that
contains the keyword virtual, specifies a virtual base class. For each
distinct occurrence of a non-virtual base class in the class lattice
of the most derived class, the most derived object (1.8) shall contain
a corresponding distinct base class subobject of that type. For each
distinct base class that is specified virtual, the most derived object
shall contain a single base class subobject of that type. [ Example:
for an object of class type C, each distinct occurrence of a
(non-virtual) base class L in the class lattice of C corresponds
one-to-one with a distinct L subobject within the object of type C.
Given the class C defined above, an object of class C will have two
subobjects of class L as shown below.
10.1.5 In such lattices, explicit qualification can be used to specify which
subobject is meant. The body of function C::f could refer to the
member next of each L subobject: void C::f() { A::next = B::next; } //
well-formed. Without the A:: or B:: qualifiers, the definition of C::f
above would be ill-formed because of ambiguity
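So resolve the ambiguity by choosing one path explicitly when calling pObject->funcBase(). Note that a qualified call such as pObject->Derived1::funcBase() suppresses virtual dispatch and would try to call the pure virtual Base::funcBase directly, so convert the pointer instead:
static_cast<Derived1 *>(pObject)->funcBase();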
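If the non-virtual diamond has to stay, one way to pick a path while keeping the call virtual is to convert the pointer to one of the intermediate interfaces first. This is a sketch under that assumption; the hierarchy is cut down to funcBase only, and call_via_derived1, call_via_derived2 and check_plain_diamond are hypothetical helpers.

```cpp
#include <cassert>
#include <string>

struct Base { virtual ~Base() {} virtual std::string funcBase() = 0; };
struct Derived1 : Base {};
struct Derived2 : Base {};
struct DDerived : Derived1, Derived2 {};  // contains two Base subobjects

class Implementation : public DDerived {
public:
    // Final overrider for funcBase in both Base subobjects.
    std::string funcBase() { return "base"; }
};

// p->funcBase() would be ambiguous here. Going through Derived1* (or
// Derived2*) selects one Base subobject, and the call stays virtual.
std::string call_via_derived1(DDerived* p) {
    return static_cast<Derived1*>(p)->funcBase();
}
std::string call_via_derived2(DDerived* p) {
    return static_cast<Derived2*>(p)->funcBase();
}

void check_plain_diamond() {
    Implementation impl;
    assert(call_via_derived1(&impl) == "base");
    assert(call_via_derived2(&impl) == "base");
}
```

Either path ends in Implementation::funcBase, because that one function overrides the pure virtual in both Base subobjects.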
Updated: Also very helpful reading will be 10.3 Virtual Functions of Standard.
Have a nice weekend :)

How can I simulate interfaces in C++?

Since C++ lacks the interface feature of Java and C#, what is the preferred way to simulate interfaces in C++ classes? My guess would be multiple inheritance of abstract classes.
What are the implications in terms of memory overhead/performance?
Are there any naming conventions for such simulated interfaces, such as SerializableInterface?
Since C++ has multiple inheritance unlike C# and Java, yes you can make a series of abstract classes.
As for convention, it is up to you; however, I like to precede the class names with an I.
class IStringNotifier
{
public:
virtual void sendMessage(std::string &strMessage) = 0;
virtual ~IStringNotifier() { }
};
Performance is nothing to worry about compared with C# and Java. Basically you will just have the overhead of a lookup table for your functions (a vtable), just as any sort of inheritance with virtual methods would give you.
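A class then implements the interface by deriving publicly from it. In this sketch RecordingNotifier and notifyTwice are hypothetical names, chosen only to show an implementer and a caller that knows nothing but the interface.

```cpp
#include <cassert>
#include <string>
#include <vector>

class IStringNotifier
{
public:
    virtual void sendMessage(std::string &strMessage) = 0;
    virtual ~IStringNotifier() { }
};

// Hypothetical implementation that simply stores what it is sent.
class RecordingNotifier : public IStringNotifier
{
public:
    void sendMessage(std::string &strMessage) { messages.push_back(strMessage); }
    std::vector<std::string> messages;
};

// Client code depends on the interface only.
void notifyTwice(IStringNotifier &notifier, std::string &message)
{
    notifier.sendMessage(message);
    notifier.sendMessage(message);
}

void checkNotifier()
{
    RecordingNotifier recorder;
    std::string text = "hello";
    notifyTwice(recorder, text);
    assert(recorder.messages.size() == 2);
}
```

The virtual destructor in the interface is what makes it safe to delete an implementation through an IStringNotifier pointer.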
There's really no need to 'simulate' anything as it is not that C++ is missing anything that Java can do with interfaces.
From a C++ point of view, Java makes an "artificial" distinction between an interface and a class. An interface is just a class all of whose methods are abstract and which cannot contain any data members.
Java makes this restriction as it does not allow unconstrained multiple inheritance, but it does allow a class to implement multiple interfaces.
In C++, a class is a class and an interface is a class. extends is achieved by public inheritance and implements is also achieved by public inheritance.
Inheriting from multiple non-interface classes can result in extra complications but can be useful in some situations. If you restrict yourself to only inheriting classes from at most one non-interface class and any number of completely abstract classes then you aren't going to encounter any other difficulties than you would have in Java (other C++ / Java differences excepted).
In terms of memory and overhead costs, if you are re-creating a Java style class hierarchy then you have probably already paid the virtual function cost on your classes in any case. Given that you are using different runtime environments anyway, there's not going to be any fundamental difference in overhead between the two in terms of cost of the different inheritance models.
"What are the implications in terms of memory overhead/performance?"
Usually none except those of using virtual calls at all, although nothing much is guaranteed by the standard in terms of performance.
On memory overhead, the "empty base class" optimization explicitly permits the compiler to lay out structures such that adding a base class which has no data members does not increase the size of your objects. I think you're unlikely to have to deal with a compiler which does not do this, but I could be wrong.
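A small sketch of the empty base class optimization, using made-up type names. The sizes checked here are what mainstream compilers produce for these simple layouts, not something to rely on for arbitrary classes:

```cpp
#include <cassert>

struct Empty {};                                 // no data members, no virtual functions
struct WithEmptyBase : Empty { int value; };     // Empty as a base class
struct WithEmptyMember { Empty e; int value; };  // Empty as a data member

void checkSizes() {
    // The empty base takes no space of its own...
    assert(sizeof(WithEmptyBase) == sizeof(int));
    // ...whereas an empty member still needs a distinct address, plus padding.
    assert(sizeof(WithEmptyMember) > sizeof(int));
}
```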
Adding the first virtual member function to a class usually increases objects by the size of a pointer, compared with if they had no virtual member functions. Adding further virtual member functions makes no additional difference. Adding virtual base classes might make a further difference, but you don't need that for what you're talking about.
Adding multiple base classes with virtual member functions probably means that in effect you only get the empty base class optimisation once, because in a typical implementation the object will need multiple vtable pointers. So if you need multiple interfaces on each class, you may be adding to the size of the objects.
On performance, a virtual function call has a tiny bit more overhead than a non-virtual function call, and more importantly you can assume that it generally (always?) won't be inlined. Adding an empty base class doesn't usually add any code to construction or destruction, because the empty base constructor and destructor can be inlined into the derived class constructor/destructor code.
There are tricks you can use to avoid virtual functions if you want explicit interfaces, but you don't need dynamic polymorphism. However, if you're trying to emulate Java then I assume that's not the case.
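One such trick is the curiously recurring template pattern (CRTP). The sketch below is only an illustration with hypothetical names (Shape, Square, areaImpl, doubledArea); the call is resolved at compile time, so no vtable is involved:

```cpp
#include <cassert>

// The "interface" is a template base that forwards to the derived class.
template <typename Derived>
struct Shape {
    int area() const {
        return static_cast<const Derived*>(this)->areaImpl();
    }
};

struct Square : Shape<Square> {
    explicit Square(int s) : side(s) {}
    int areaImpl() const { return side * side; }
    int side;
};

// Accepts any type that implements the Shape interface statically.
template <typename T>
int doubledArea(const Shape<T>& shape) {
    return 2 * shape.area();
}

void checkStaticInterface() {
    assert(doubledArea(Square(3)) == 18);
}
```

The trade-off is that different Shape<T> instantiations are unrelated types, so they cannot be stored together behind a single base pointer.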
Example code:
#include <iostream>
// A is an interface
struct A {
virtual ~A() {};
virtual int a(int) = 0;
};
// B is an interface
struct B {
virtual ~B() {};
virtual int b(int) = 0;
};
// C has no interfaces, but does have a virtual member function
struct C {
~C() {}
int c;
virtual int getc(int) { return c; }
};
// D has one interface
struct D : public A {
~D() {}
int d;
int a(int) { return d; }
};
// E has two interfaces
struct E : public A, public B{
~E() {}
int e;
int a(int) { return e; }
int b(int) { return e; }
};
int main() {
E e; D d; C c;
std::cout << "A : " << sizeof(A) << "\n";
std::cout << "B : " << sizeof(B) << "\n";
std::cout << "C : " << sizeof(C) << "\n";
std::cout << "D : " << sizeof(D) << "\n";
std::cout << "E : " << sizeof(E) << "\n";
}
Output (GCC on a 32bit platform):
A : 4
B : 4
C : 8
D : 8
E : 12
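Interfaces in C++ are classes which have only pure virtual functions. For example:
class ISerializable
{
public:
virtual ~ISerializable() = 0;
virtual void serialize( stream& target ) = 0;
};
// A pure virtual destructor must still be defined, or derived classes will not link.
inline ISerializable::~ISerializable() {}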
This is not a simulated interface, it is an interface like the ones in Java, but does not carry the drawbacks.
E.g. you can add methods and members without negative consequences :
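class ISerializable
{
public:
virtual ~ISerializable() = 0;
virtual void serialize( stream& target ) = 0;
protected:
void serialize_atomic( int i, stream& t );
bool serialized;
};
// A pure virtual destructor must still be defined, or derived classes will not link.
inline ISerializable::~ISerializable() {}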
As for naming conventions: the C++ language defines none, so use whichever one your environment already follows.
The overhead is one static table per class, plus a pointer to that table in each object of a derived class that did not already have virtual functions.
In C++ we can go further than the plain behaviour-less interfaces of Java & co.
We can add explicit contracts (as in Design by Contract) with the NVI pattern.
struct Contract1 : noncopyable
{
virtual ~Contract1();
Res f(Param p) {
assert(f_precondition(p) && "C1::f precondition failed");
const Res r = do_f(p);
assert(f_postcondition(p,r) && "C1::f postcondition failed");
return r;
}
private:
virtual Res do_f(Param p) = 0;
};
struct Concrete : virtual Contract1, virtual Contract2
{
...
};
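A compilable sketch of the same NVI idea, with assumptions stated up front: Res and Param are replaced by int, the pre- and postconditions (p >= 0, result >= p) are invented for illustration, and noncopyable and Contract2 are left out:

```cpp
#include <cassert>

class Contract1 {
public:
    virtual ~Contract1() {}
    // Public, non-virtual entry point: checks the contract around the hook.
    int f(int p) {
        assert(p >= 0 && "Contract1::f precondition failed");
        const int r = do_f(p);
        assert(r >= p && "Contract1::f postcondition failed");
        return r;
    }
private:
    // Private virtual hook: the only part implementers override.
    virtual int do_f(int p) = 0;
};

class Concrete : public Contract1 {
private:
    int do_f(int p) override { return p + 1; }
};

void checkContract() {
    Concrete c;
    assert(c.f(2) == 3);
}
```

Callers can only reach do_f through f, so every implementation is checked against the same contract.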
Interfaces in C++ can also occur statically, by documenting the requirements on template type parameters.
Templates pattern match syntax, so you don't have to specify up front that a particular type implements a particular interface, so long as it has the right members. This is in contrast to Java's <? extends Interface> or C#'s where T : IInterface style constraints, which require the substituted type to know about (I)Interface.
A great example of this is the Iterator family, which are implemented by, among other things, pointers.
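For instance, a function template written against the iterator requirements works with raw pointers and container iterators alike, without either type declaring that it implements anything. The name sumRange is hypothetical:

```cpp
#include <cassert>
#include <vector>

// Requires only that Iterator supports !=, ++ and unary *.
template <typename Iterator>
int sumRange(Iterator first, Iterator last) {
    int total = 0;
    for (; first != last; ++first) {
        total += *first;
    }
    return total;
}

void checkSumRange() {
    int raw[] = {1, 2, 3};
    std::vector<int> values{4, 5};
    assert(sumRange(raw, raw + 3) == 6);                   // plain pointers
    assert(sumRange(values.begin(), values.end()) == 9);   // vector iterators
}
```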
If you don't use virtual inheritance, the overhead should be no worse than regular inheritance with at least one virtual function. Each abstract class inherited from will add a pointer to each object.
However, if you do something like the Empty Base Class Optimization, you can minimize that:
struct A
{
virtual void func1() = 0; // "= 0" is only valid on a virtual function
};
struct B: A
{
virtual void func2() = 0;
};
struct C: B
{
int i;
};
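The size of C will be two words on a typical implementation: one vtable pointer shared by the whole A, B, C chain, plus the int (with padding where needed).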
By the way, MSVC 2008 has the __interface keyword.
A Visual C++ interface can be defined as follows:
- Can inherit from zero or more base interfaces.
- Cannot inherit from a base class.
- Can only contain public, pure virtual methods.
- Cannot contain constructors, destructors, or operators.
- Cannot contain static methods.
- Cannot contain data members; properties are allowed.
This feature is Microsoft-specific. Caution: an __interface has no virtual destructor, which you need if you delete objects through interface pointers.
There is no good way to implement an interface the way you're asking. The problem with an approach such as a completely abstract ISerializable base class lies in the way that C++ implements multiple inheritance. Consider the following:
class Base
{
};
class ISerializable
{
public:
virtual string toSerial() const = 0; // const, so it can be called through a const reference
virtual void fromSerial(const string& s) = 0;
};
class Subclass : public Base, public ISerializable
{
};
void someFunc(fstream& out, const ISerializable& o)
{
out << o.toSerial();
}
Clearly the intent is for the function toSerial() to serialize all of the members of Subclass including those that it inherits from Base class. The problem is that there is no path from ISerializable to Base. You can see this graphically if you execute the following:
void fn(Base& b)
{
cout << (void*)&b << endl;
}
void fn(ISerializable& i)
{
cout << (void*)&i << endl;
}
void someFunc(Subclass& s)
{
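fn(static_cast<Base&>(s)); // explicit casts: a bare fn(s) would be an ambiguous call
fn(static_cast<ISerializable&>(s));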
}
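The value output by the first call is generally not the same as the value output by the second call (an empty Base, as written here, may end up sharing an address with the other base; give Base a data member to see the difference reliably). Even though a reference to s is passed in both cases, the compiler adjusts the address passed to match the proper base class type.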
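A compact way to observe the adjustment, assuming Base is given a data member so that it cannot share storage with the other base. The member baseData and the helper functions are hypothetical additions, and toSerial is simplified to return a fixed string:

```cpp
#include <cassert>
#include <string>

struct Base {
    int baseData = 0;  // non-empty, so Base needs storage of its own
};

struct ISerializable {
    virtual ~ISerializable() {}
    virtual std::string toSerial() const = 0;
};

struct Subclass : Base, ISerializable {
    std::string toSerial() const override { return "Subclass"; }
};

// Each conversion yields the address of a different base subobject.
const void* addressAsBase(const Subclass& s) {
    return static_cast<const Base*>(&s);
}
const void* addressAsSerializable(const Subclass& s) {
    return static_cast<const ISerializable*>(&s);
}

void checkAddresses() {
    Subclass s;
    // Two non-empty base subobjects cannot occupy the same address.
    assert(addressAsBase(s) != addressAsSerializable(s));
}
```

Which of the two subobjects sits at the start of the object is up to the implementation; only the fact that they differ is checked here.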