I got this question when I received a code review comment saying virtual functions need not be inline.
I thought inline virtual functions could come in handy in scenarios where functions are called on objects directly. But a counter-argument came to mind: why would one define a function as virtual and then call it on objects directly?
Is it best not to use inline virtual functions, since they're almost never expanded anyway?
Code snippet I used for analysis:
#include <iostream>
using namespace std;
class Temp
{
public:
    virtual ~Temp()
    {
    }
    virtual void myVirtualFunction() const
    {
        cout<<"Temp::myVirtualFunction"<<endl;
    }
};
class TempDerived : public Temp
{
public:
    void myVirtualFunction() const
    {
        cout<<"TempDerived::myVirtualFunction"<<endl;
    }
};
int main(void)
{
    TempDerived aDerivedObj;
    //Compiler thinks it's safe to expand the virtual functions
    aDerivedObj.myVirtualFunction();
    //type of the object pTemp points to is always known here;
    //does compiler still expand virtual functions?
    //I doubt compiler would be this much intelligent!
    Temp* pTemp = &aDerivedObj;
    pTemp->myVirtualFunction();
    return 0;
}
Virtual functions can be inlined sometimes. An excerpt from the excellent C++ FAQ:
"The only time an inline virtual call can be inlined is when the compiler knows the "exact class" of the object which is the target of the virtual function call. This can happen only when the compiler has an actual object rather than a pointer or reference to an object. I.e., either with a local object, a global/static object, or a fully contained object inside a composite."
C++11 has added final. This changes the accepted answer: it's no longer necessary to know the exact class of the object, it's sufficient to know the object has at least the class type in which the function was declared final:
class A {
public:
    virtual void foo() const;
};
class B : public A {
public:
    inline virtual void foo() const final { }
};
class C : public B
{
};
void bar(B const& b) {
    A const& a = b; // Allowed, every B is an A.
    a.foo(); // Call to B::foo() can be inlined, even if b is actually a class C.
}
There is one category of virtual functions where it still makes sense to have them inline. Consider the following case:
class Base {
public:
    inline virtual ~Base () { }
};
class Derived1 : public Base {
public:
    inline virtual ~Derived1 () { } // Implicitly calls Base::~Base ();
};
class Derived2 : public Derived1 {
public:
    inline virtual ~Derived2 () { } // Implicitly calls Derived1::~Derived1 ();
};
void foo (Base * base) {
    delete base; // Virtual call
}
The call to delete base performs a virtual call to reach the correct derived class destructor, and this call is not inlined. However, because each destructor calls its parent destructor (which in these cases is empty), the compiler can inline those calls, since they do not call the base class functions virtually.
The same principle applies to base class constructors, or to any set of functions where the derived implementation also calls the base class's implementation.
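A minimal sketch of that last case, with hypothetical names (Shape, Circle, describe, describeAny, demo): the derived override calls the base implementation through a qualified name, which bypasses virtual dispatch, so the compiler is free to inline it when the definition is visible.

```cpp
#include <cassert>
#include <string>

struct Shape {
    virtual ~Shape() { }
    // Defined in-class, hence implicitly inline.
    virtual std::string describe() const { return "shape"; }
};

struct Circle : Shape {
    std::string describe() const override {
        // Qualified call: bound statically to Shape::describe, no vtable lookup.
        return "circle, a " + Shape::describe();
    }
};

// This call dispatches virtually, but the nested Shape::describe call
// inside Circle::describe is an ordinary direct call.
std::string describeAny(const Shape& s) { return s.describe(); }

// Small usage check (hypothetical helper).
void demo() {
    Circle c;
    Shape s;
    assert(describeAny(c) == "circle, a shape");
    assert(describeAny(s) == "shape");
}
```

Whether the inner call is actually expanded in place remains the optimizer's decision; the qualified name only removes the obstacle of dynamic dispatch.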
I've seen compilers that don't emit any v-table if the class has no non-inline virtual function at all (that is, none defined in an implementation file rather than in a header). They would throw errors like missing vtable-for-class-A or something similar, and you would be confused as hell, as I was.
Indeed, that's not conformant with the Standard, but it happens, so consider putting at least one virtual function outside the header (if only the virtual destructor), so that the compiler can emit a vtable for the class at that place. I know it happens with some versions of gcc.
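A minimal sketch of that advice, shown in one file for brevity; the class name Widget, the widget.h / widget.cpp split and the demo helper are hypothetical. Under the Itanium C++ ABI used by GCC and Clang, the first virtual function that is neither pure nor inline is the "key function", and the vtable is emitted in the translation unit that defines it.

```cpp
#include <cassert>

// --- widget.h (hypothetical header) ---
class Widget {
public:
    virtual ~Widget();                      // declared only, so not inline
    virtual int size() const { return 1; }  // in-class definition: implicitly inline
};

// --- widget.cpp (hypothetical implementation file) ---
// Exactly one translation unit defines the destructor; with GCC and Clang
// this is also where the vtable for Widget is emitted.
Widget::~Widget() { }

// Small usage check (hypothetical helper).
void demo() {
    Widget w;
    const Widget& ref = w;
    assert(ref.size() == 1);
}
```

If the out-of-line destructor is declared but its definition is never linked in, the typical symptom is exactly the "undefined reference to vtable" style of error described above.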
As someone mentioned, inline virtual functions can be a benefit sometimes, but of course most often you will use it when you do not know the dynamic type of the object, because that was the whole reason for virtual in the first place.
The compiler, however, can't completely ignore inline. It has other semantics apart from speeding up a function call. The implicit inline for in-class definitions is the mechanism that allows you to put the definition into the header: only inline functions can be defined in multiple translation units of a program without violating any rules. In the end, it behaves as if you had defined it only once in the whole program, even though you included the header into several different files that are linked together.
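As a sketch of that second meaning (Logger, logger.h and demo are hypothetical names): a virtual function defined in a header but outside the class body needs an explicit inline. Without it, every translation unit that includes the header emits its own definition, and the link typically fails with a duplicate-symbol error.

```cpp
#include <cassert>
#include <string>

// --- logger.h (hypothetical header, included from several .cpp files) ---
class Logger {
public:
    virtual ~Logger() { }                // in-class: implicitly inline
    virtual std::string prefix() const;  // defined below, outside the class
};

// Without `inline` here, two .cpp files including this header would each
// define Logger::prefix, violating the one-definition rule.
inline std::string Logger::prefix() const { return "[log] "; }

// Small usage check (hypothetical helper).
void demo() {
    Logger logger;
    assert(logger.prefix() == "[log] ");
}
```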
Well, actually virtual functions can always be inlined, as long as they're statically linked together: suppose we have an abstract class Base with a virtual function F and derived classes Derived1 and Derived2:
class Base {
virtual void F() = 0;
};
class Derived1 : public Base {
virtual void F();
};
class Derived2 : public Base {
virtual void F();
};
A hypothetical call b->F(); (with b of type Base*) is obviously virtual. But you (or the compiler...) could rewrite it like so (suppose typeof is a typeid-like function that returns a value that can be used in a switch):
switch (typeof(b)) {
case Derived1: b->Derived1::F(); break; // static, inlineable call
case Derived2: b->Derived2::F(); break; // static, inlineable call
case Base: assert(!"pure virtual function call!");
default: b->F(); break; // virtual call (dyn-loaded code)
}
While we still need RTTI for the typeof, the call can effectively be inlined by, basically, embedding the vtable inside the instruction stream and specializing the call for all the involved classes. This could also be generalized by specializing only a few classes (say, just Derived1):
switch (typeof(b)) {
case Derived1: b->Derived1::F(); break; // hot path
default: b->F(); break; // default virtual call, cold path
}
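The pseudo-code above can be approximated in standard C++ with typeid. This is only a sketch under assumptions: F returns an int here (the original returns void) so the effect is observable, callF and demo are hypothetical helpers, and b must not be null because typeid on a dereferenced null pointer throws std::bad_typeid.

```cpp
#include <cassert>
#include <typeinfo>

struct Base {
    virtual ~Base() { }
    virtual int F() const = 0;
};
struct Derived1 : Base {
    int F() const override { return 1; }
};
struct Derived2 : Base {
    int F() const override { return 2; }
};

int callF(const Base* b) {
    // typeid on a polymorphic object yields its dynamic type (requires RTTI).
    if (typeid(*b) == typeid(Derived1)) {
        // Hot path: the qualified call is bound statically and can be inlined.
        return static_cast<const Derived1*>(b)->Derived1::F();
    }
    // Cold path: ordinary virtual dispatch.
    return b->F();
}

// Small usage check (hypothetical helper).
void demo() {
    Derived1 d1;
    Derived2 d2;
    assert(callF(&d1) == 1);
    assert(callF(&d2) == 2);
}
```

The typeid comparison matches the exact type only, so an object of a class derived from Derived1 takes the cold path and still reaches its own override.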
inline really doesn't do anything - it's a hint. The compiler might ignore it, or it might inline a call even without inline if it sees the implementation and likes the idea. If code clarity is at stake, the inline should be removed.
Marking a virtual method inline helps in further optimizing virtual functions in the following two cases:
Curiously recurring template pattern (http://www.codeproject.com/Tips/537606/Cplusplus-Prefer-Curiously-Recurring-Template-Patt)
Replacing virtual methods with templates (http://www.di.unipi.it/~nids/docs/templates_vs_inheritance.html)
Virtual functions declared inline are inlined when called through objects; the inline is ignored when they are called via pointers or references.
With modern compilers, it won't do any harm to inline them. Some ancient compiler/linker combos might have created multiple vtables, but I don't believe that is an issue anymore.
A compiler can only inline a function when the call can be resolved unambiguously at compile time.
Virtual functions, however, are resolved at runtime, so the compiler cannot inline the call, since at compile time the dynamic type (and therefore the function implementation to be called) cannot be determined.
In the cases where the function call is unambiguous and the function a suitable candidate for inlining, the compiler is smart enough to inline the code anyway.
The rest of the time "inline virtual" is nonsense, and indeed some compilers won't compile that code.
It does make sense to make virtual functions and then call them on objects rather than references or pointers. Scott Meyers recommends, in his book "Effective C++", never to redefine an inherited non-virtual function. That makes sense, because when you make a class with a non-virtual function and redefine the function in a derived class, you may be sure to use it correctly yourself, but you can't be sure others will use it correctly. Also, you may at a later date use it incorrectly yourself. So, if you make a function in a base class and you want it to be redefinable, you should make it virtual. If it makes sense to make virtual functions and call them on objects, it also makes sense to inline them.
Actually, in some cases adding "inline" to a virtual final override can make your code fail to build, so there is sometimes a difference (at least with the VS2017 compiler)!
I had a virtual inline final override function, built in VS2017 with the C++17 standard enabled, and for some reason the build failed when I used two projects.
I had a test project and an implementation DLL that I was unit testing. The test project has a "linker_includes.cpp" file that #includes the needed *.cpp files from the other project. I know, I know: I could set up msbuild to use the object files from the DLL, but bear in mind that this is a Microsoft-specific solution, whereas including the .cpp files is independent of the build system, and a .cpp file is much easier to version than XML files, project settings and such.
What was interesting is that I was constantly getting linker errors from the test project, even if I added the definitions of the missing functions by copy-paste rather than through the include! So weird. The other project built fine, and there is no connection between the two other than a project reference, which only sets the build order so that both are always built.
I think it is some kind of bug in the compiler. I have no idea if it exists in the compiler shipped with VS2020, because I am using an older version, as some SDK only works properly with that :-(
I just wanted to add that marking them inline not only can mean something, but might even make your code not build in some rare circumstances! This is weird, yet good to know.
PS: The code I am working on is computer-graphics related, so I prefer inlining, and that is why I used both final and inline. I kept the final specifier in the hope that the release build is smart enough to inline the function when building the DLL even without my hinting at it directly.
PS (Linux): I expect the same does not happen in gcc or clang, as I routinely used to do this kind of thing there. I am not sure where this issue comes from. I prefer doing C++ on Linux, or at least with some gcc, but sometimes a project has different needs.
My understanding is that virtual functions can cause performance problems because of two issues: the extra dereferencing caused by the vtable and the inability of compilers to inline functions in polymorphic code.
What if I downcast a variable pointer to its exact type? Are there still any extra costs then?
class Base { public: virtual void foo() = 0; };
class Derived : public Base { public: void foo() { /* code */} };
int main() {
    Base * pbase = new Derived();
    pbase->foo(); // Can't inline this and have to go through vtable
    Derived * pderived = dynamic_cast<Derived *>(pbase);
    pderived->foo(); // Are there any costs due to the virtual method here?
}
My intuition tells me that since I cast the object to its actual type, the compiler should be able to avoid the disadvantages of using a virtual function (e.g., it should be able to inline the method call if it wants to). Is this correct?
Can the compiler actually know that pderived is of type Derived after I downcast it? In the example above it's trivial to see that pbase is of type Derived, but in actual code it might be unknown at compile time.
Now that I've written this down, I suppose that since the Derived class could itself be inherited by another class, downcasting pbase to a Derived pointer does not actually ensure anything to the compiler and thus it is not able to avoid the costs of having a virtual function?
There's always a gap between what the mythical Sufficiently Smart Compiler can do, and what actual compilers end up doing. In your example, since there is nothing inheriting from Derived, the latest compilers will likely devirtualize the call to foo. However, since successful devirtualization and subsequent inlining is a difficult problem in general, help the compiler out whenever possible by using the final keyword.
class Derived : public Base { void foo() final { /* code */} };
Now, the compiler knows that there's only one possible foo that a Derived* can call.
(For an in-depth discussion of why devirtualization is hard and how GCC 4.9+ tackles it, read Jan Hubička's "Devirtualization in C++" series of posts.)
Pradhan's advice to use final is sound, if changing the Derived class is an option for you and you don't want any further derivation.
Another option directly available to specific call sites is prefixing the function name with Derived::, inhibiting virtual dispatch to any further override:
#include <iostream>
struct Base { virtual ~Base() { } virtual void foo() = 0; };
struct Derived : public Base
{
void foo() override { std::cout << "Derived\n"; }
};
struct FurtherDerived : public Derived
{
void foo() override { std::cout << "FurtherDerived\n"; }
};
int main()
{
Base* pbase = new FurtherDerived();
pbase->foo(); // Can't inline this and have to go through vtable
if (Derived* pderived = dynamic_cast<Derived *>(pbase))
{
pderived->foo(); // still dispatched to FurtherDerived
pderived->Derived::foo(); // static dispatch to Derived
}
}
Output:
FurtherDerived
FurtherDerived
Derived
This can be dangerous: the actual runtime type might depend on its overrides being called to maintain its invariants, so it's a bad idea to use it unless there're pressing performance problems.
Devirtualization is, actually, a very special case of constant propagation, where the constant propagated is the type (physically represented as a v-ptr in general, but the Standard makes no such guarantee).
Total devirtualization
There are multiple situations where a compiler can actually devirtualize a call that you may not think about:
int main() {
Base* base = new Derived();
base->foo();
}
Clang is able to devirtualize the call in the above example simply because it can track the actual type of base as it is created in scope.
In a similar vein:
struct Base { virtual void foo() = 0; };
struct Derived: Base { virtual void foo() override {} };
Base* create() { return new Derived(); }
int main() {
Base* base = create();
base->foo();
}
While this example is slightly more complicated, and the Clang front-end will not realize that base is necessarily of type Derived, the LLVM optimizer which comes afterward will:
inline create in main
store a pointer to the v-table of Derived in base->vptr
realize that base->foo() therefore is base->Derived::foo() (by resolving the indirection through the v-ptr)
and finally optimize everything out because there is nothing to do in Derived::foo
And here is the final result (which I assume needs no comment even for those not initiated to the LLVM IR):
define i32 @main() #0 {
ret i32 0
}
There are multiple instances where a compiler (either front-end or back-end) can devirtualize calls in situations that might not be obvious; in all cases it boils down to its ability to prove the run-time type of the object pointed to.
Partial devirtualization
In his series about improvements to the gcc compiler on the subject of devirtualization, Jan Hubička introduces partial devirtualization.
The latest incarnations of gcc have the ability to short-list a few likely run-time types of the object, and especially produce the following pseudo-code (in this case, two are deemed likely, and not all are known or likely enough to justify a special case):
// Source
void doit(Base* base) { base->foo(); }
// Optimized
void doit(Base* base) {
if (base->vptr == &Derived::VTable) { base->Derived::foo(); }
else if (base->vptr == &Other::VTable) { base->Other::foo(); }
else {
(*base->vptr[Base::VTable::FooIndex])(base);
}
}
While this may seem slightly convoluted, it does offer some performance gains (as you'll see from the series of articles) in case the predictions are correct.
Seems surprising? Well, there are more tests, but base->Derived::foo() and base->Other::foo() can now be inlined, which itself opens up further optimization opportunities:
in this particular case, since Derived::foo() does nothing, the function call can be optimized away; the penalty of the if test is less than that of a function call so it's worth it if the condition matches often enough
in cases where one of the function arguments is known, or known to have some specific properties, the subsequent constant propagation passes can simplify the inlined body of the function
Impressive, right?
Alright, alright, this is rather long-winded but I am coming to talk about dynamic_cast<Derived*>(base)!
First of all, the cost of a dynamic_cast is not to be underestimated; it might well, actually, be more costly than calling base->foo() in the first place. You've been warned.
Secondly, using dynamic_cast<Derived*>(base)->foo() can, indeed, allow devirtualizing the function call if it gives sufficient information to the compiler to do so (it always gives more information, at least). Typically, this can be either:
because Derived::foo is final
because Derived is final
because Derived is defined in an anonymous namespace and has no descendant redefining foo, and thus only accessible in this translation unit (roughly, .cpp file) and so all its descendants are known and can be checked
and plenty of other cases (like pruning the set of potential candidates in the case of partial devirtualization)
If you really wish to ensure devirtualization, though, final applied either on the function or class is your best bet.
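A minimal sketch of the "Derived is final" case; foo returns an int here so the result can be checked, and callThroughBase, callAfterCast and demo are hypothetical helpers. Whether the call is actually devirtualized and inlined still depends on the compiler and optimization level.

```cpp
#include <cassert>

struct Base {
    virtual ~Base() { }
    virtual int foo() const = 0;
};

// `final` on the class: nothing can derive from Derived, so a Derived*
// can only point to an object whose dynamic type is exactly Derived.
struct Derived final : Base {
    int foo() const override { return 42; }
};

int callThroughBase(const Base* b) {
    return b->foo();      // dynamic type unknown here: virtual dispatch in general
}

int callAfterCast(const Base* b) {
    if (const Derived* d = dynamic_cast<const Derived*>(b)) {
        return d->foo();  // may be devirtualized and inlined thanks to `final`
    }
    return b->foo();      // some other derived type: ordinary virtual call
}

// Small usage check (hypothetical helper).
void demo() {
    Derived d;
    assert(callThroughBase(&d) == 42);
    assert(callAfterCast(&d) == 42);
}
```

The dynamic_cast itself is still paid for on every call to callAfterCast, which is the cost the first point above warns about.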
#include <iostream>
using namespace std;
struct A{
    virtual void fun(){cout<<"A";}
};
struct B:public A{
    void fun(){cout<<"B";}
};
struct C:public B{
    void fun(){cout<<"C";}
};
int main()
{
    C c;B b1;
    A *a=&b1;
    a->fun(); //1
    B *b=&c;
    b->fun(); //2
    return 0;
}
In the above code B::fun() is getting converted to virtual function implicitly as I have made A::fun() virtual. Can I stop this conversion?
If not possible what are the alternatives to make the above code print "BB" ?
A virtual function is virtual in all derived classes. There is no way to prevent this.
(§10.3/2 C++11) If a virtual member function vf is declared in a class Base and in a class Derived, derived directly or indirectly from Base, a member function vf with the same name, parameter-type-list (8.3.5), cv-qualification, and ref-qualifier (or absence of same) as Base::vf is declared, then Derived::vf is also virtual (whether or not it is so declared) and it overrides Base::vf. For convenience we say that any virtual function overrides itself.
However, if you'd like to use the function that corresponds to the static, rather than the dynamic, type of a pointer (i.e., in your example, B::fun instead of C::fun, given that the pointer is declared as B*), then you can, at least in C++11, use the alias definition below to get access to the static (=compile-time) type:
#include <type_traits>
template <typename Ptr>
using static_type = typename std::remove_pointer<Ptr>::type;
This is how you'd use this in main() (or anywhere else):
int main()
{
C c; B b1;
A *a = &b1;
a->fun();
B *b = &c;
/* This will output 'B': */
b->static_type<decltype(b)>::fun();
return 0;
}
If you do not want your derived classes to override the function, then there is no reason to mark it virtual in the base class. The very basis of marking a function virtual is to have polymorphic behavior through derived class function overriding.
Good Read:
When to mark a function in C++ as a virtual?
If you want your code to guard you against accidental overriding in derived classes, you can use the final specifier in C++11.
Yes, if you want to explicitly call a function in a specific class you can use a fully qualified name.
b->A::fun();
This will call the version of fun() belonging to A.
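Applied to the code in the question, a sketch that yields "BB" could qualify the second call with B:: instead. The std::ostream& parameter and the run and demo functions are additions made here so that the output can be checked; they are not part of the original code.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>

struct A {
    virtual ~A() { }
    virtual void fun(std::ostream& os) { os << "A"; }
};
struct B : A {
    void fun(std::ostream& os) override { os << "B"; }
};
struct C : B {
    void fun(std::ostream& os) override { os << "C"; }
};

void run(std::ostream& os) {
    C c;
    B b1;
    A* a = &b1;
    a->fun(os);     // virtual dispatch: dynamic type is B, writes "B"
    B* b = &c;
    b->B::fun(os);  // qualified name: B::fun runs even though *b is a C
}

// Small usage check (hypothetical helper).
void demo() {
    std::ostringstream out;
    run(out);
    assert(out.str() == "BB");
}
```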
The following achieves the observable behaviour you're asking for. In A, the non-virtual fun() runs the virtual fun_(), so the behaviour can be customised in B, but anyone calling fun() on a derived class will only see the non-polymorphic version.
#include <iostream>
using namespace std;
struct A{
void fun(){fun_();}
private:
virtual void fun_() { cout << "A\n"; }
};
struct B:public A{
void fun(){cout<<"B\n";}
private:
virtual void fun_() final { fun(); }
};
struct C:public B{
void fun(){cout<<"C\n";}
};
int main()
{
C c;B b1;
A *a=&b1;
a->fun(); //1
B *b=&c;
b->fun(); //2
c.fun(); // notice that this outputs "C" which I think is what you want
}
If using C++03, you can simply leave out the "final" keyword - it's only there to guard against further unwanted overrides of the virtual behaviour in B-derived classes such as C.
(You might find it interesting to contrast this with the "Nonvirtual Interface pattern" - see C++ Coding Standards by Sutter and Alexandrescu, point 39)
Discussion
A having fun virtual implies that overriding it in derived classes is a necessary customisation ability for derived classes, but at some point in the derivation hierarchy the choice of implementation behaviours might have narrowed down to 1 and providing a final implementation's not unreasonable.
My real concern is that you hide A/B's fun() with C::fun... that's troubling as if they do different things then your code could be very hard to reason about or debug. B's decision to finalise the virtual function implies certainty that there's no need for such further customisation. Code working from A*/A&/B*/B& will do one thing, while wherever a C object's type is statically known, the behaviour may differ. Templated code is one place where C::fun may easily be called without the template author or user being very conscious of it. To assess whether this is a genuine hazard for you, it would help to know what the functional purpose of "fun" is and how implementation might differ between A, B and C....
If you declare the function in B like this
void fun(int ignored=0);
it will become an overload which will not take part in resolving virtual calls. Beware that calling a->fun() will call A::fun() though even if a actually refers to a B, so I would strongly advise against this approach as it makes things even more confusing than necessary.
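A small sketch of why this is confusing; fun returns a char here instead of printing, and demo is a hypothetical helper. B::fun(int) hides A::fun() for name lookup but does not override it, so the result depends on the static type used for the call.

```cpp
#include <cassert>

struct A {
    virtual ~A() { }
    virtual char fun() { return 'A'; }
};
struct B : A {
    // Different parameter list: an overload that hides A::fun(), not an override.
    char fun(int = 0) { return 'B'; }
};
struct C : B {
    // Same signature as A::fun(), so this does override it.
    char fun() override { return 'C'; }
};

// Small usage check (hypothetical helper).
void demo() {
    C c;
    B b1;
    A* a = &b1;
    B* b = &c;
    assert(a->fun() == 'A');  // B never overrode A::fun(), so A's version runs
    assert(b->fun() == 'B');  // lookup in B finds fun(int), a non-virtual call
    assert(c.fun() == 'C');   // lookup in C finds C::fun()
}
```

Under these assumptions the two calls from the question would produce "AB" rather than "BB", which is the pitfall described above.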
Question is: What exactly is it that you want to achieve or avoid? Knowing that, people here could suggest a better approach.
My library has two classes, a base class and a derived class. In the current version of the library the base class has a virtual function foo(), and the derived class does not override it. In the next version I'd like the derived class to override it. Does this break ABI? I know that introducing a new virtual function usually does, but this seems like a special case. My intuition is that it should be changing an offset in the vtbl, without actually changing the table's size.
Obviously since the C++ standard doesn't mandate a particular ABI this question is somewhat platform specific, but in practice what breaks and maintains ABI is similar across most compilers. I'm interested in GCC's behavior, but the more compilers people can answer for the more useful this question will be ;)
It might.
You're wrong regarding the offset: the slot's offset in the vtable is already fixed by the declaration in the base class. What changes is the function pointer stored at that offset in Derived's v-table, which the Derived constructor installs by setting the object's v-pointer; that entry will now point to the Derived override. So it is, normally, ABI compatible.
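To make the "same slot, different pointer" idea concrete, here is a hand-rolled emulation. It is only a sketch: real compilers generate these tables themselves and the exact layout is ABI-specific, and every name here (VTable, Object, client_call_foo, the v1/v2 tables, demo) is invented for illustration.

```cpp
#include <cassert>
#include <string>

// Hand-written stand-in for a compiler-generated vtable:
// one function-pointer slot per virtual function.
struct VTable {
    std::string (*foo)();  // slot 0
    std::string (*bar)();  // slot 1
};

std::string base_foo()    { return "Base::foo"; }
std::string base_bar()    { return "Base::bar"; }
std::string derived_foo() { return "Derived::foo"; }

// Library v1: Derived does not override foo, so its table reuses Base's.
const VTable derived_vtable_v1 = { base_foo, base_bar };
// Library v2: Derived overrides foo; same slot, different pointer.
const VTable derived_vtable_v2 = { derived_foo, base_bar };

// An "object" carrying only its v-pointer.
struct Object { const VTable* vptr; };

// What already-compiled client code does for obj->foo():
// load the table, read the fixed slot, call through it.
std::string client_call_foo(const Object& obj) { return obj.vptr->foo(); }

// Hypothetical helper: the same client code works against both tables.
void demo() {
    Object old_obj = { &derived_vtable_v1 };
    Object new_obj = { &derived_vtable_v2 };
    assert(client_call_foo(old_obj) == "Base::foo");
    assert(client_call_foo(new_obj) == "Derived::foo");
}
```

Because client_call_foo depends only on the slot position, it picks up the new override without being recompiled, as long as the call really goes through the table.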
There might be an issue though, because of optimization, and especially the devirtualization of function calls.
Normally, when you call a virtual function, the compiler introduces a lookup in the vtable via the vpointer. However, if it can deduce (statically) what the exact type of the object is, it can also deduce the exact function to call and shave off the virtual lookup.
Example:
struct Base {
virtual void foo();
virtual void bar();
};
struct Derived: Base {
virtual void foo();
};
int main(int argc, char* argv[]) {
Derived d;
d.foo(); // It is necessarily Derived::foo
d.bar(); // It is necessarily Base::bar
}
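In this case, if a later version of the library adds an override Derived::bar, simply linking the already-compiled program against the new library will not pick it up: the call to Base::bar was resolved at compile time.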
This doesn't seem like something that could be particularly relied on in general - as you said C++ ABI is pretty tricky (even down to compiler options).
That said, I think you could use g++ -fdump-class-hierarchy (spelled -fdump-lang-class in GCC 8 and later) before and after making the change to see whether either the parent or child vtables change in structure. If they don't, it's probably "fairly" safe to assume you didn't break ABI.
Yes, in some situations, adding a reimplementation of a virtual function will change the layout of the virtual function table. That is the case if you're reimplementing a virtual function from a base that isn't the first base class (multiple-inheritance):
// V1
struct A { virtual void f(); };
struct B { virtual void g(); };
struct C : A, B { virtual void h(); }; //does not reimplement f or g;
// V2
struct C : A, B {
virtual void h();
virtual void g(); //added reimplementation of g()
};
This changes the layout of C's vtable by adding an entry for g() (thanks to "Gof" for bringing this to my attention in the first place, as a comment in http://marcmutz.wordpress.com/2010/07/25/bcsc-gotcha-reimplementing-a-virtual-function/).
Also, as mentioned elsewhere, you get a problem if the class you're overriding the function in is used by users of your library in a way where the static type is equal to the dynamic type. This can be the case after you new'ed it:
MyClass * c = new MyClass;
c->myVirtualFunction(); // not actually virtual at runtime
or created it on the stack:
MyClass c;
c.myVirtualFunction(); // not actually virtual at runtime
The reason for this is an optimisation called "de-virtualisation". If the compiler can prove, at compile time, what the dynamic type of the object is, it will not emit the indirection through the virtual function table, but instead call the correct function directly.
Now, if users compiled against an old version of your library, the compiler will have inserted a direct call to what was then the most-derived implementation of the virtual method. If, in a newer version of your library, you override this virtual function in a more-derived class, code compiled against the old library will still call the old function, whereas new code, or code where the compiler could not prove the dynamic type of the object at compile time, will go through the virtual function table. So a given instance of the class may be confronted, at runtime, with calls to the base class's function that it cannot intercept, potentially violating class invariants.
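The mismatch can be imitated in a single program. This is a sketch under an assumption: the qualified call d.Base::bar() stands in for the direct call an old client binary would contain after devirtualisation; it is not what a compiler literally emits. Base, Derived and bar follow the earlier example in this thread, while old_client_call, virtual_call and demo are hypothetical helpers.

```cpp
#include <cassert>
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual std::string bar() { return "Base::bar"; }
};

// "New" library version: Derived now overrides bar().
struct Derived : Base {
    std::string bar() override { return "Derived::bar"; }
};

// Stand-in for client code built against the OLD headers, where the
// compiler resolved d.bar() to Base::bar at compile time. The qualified
// call reproduces that baked-in, non-virtual call.
std::string old_client_call(Derived& d) { return d.Base::bar(); }

// Code that dispatches through the vtable sees the new override.
std::string virtual_call(Base& b) { return b.bar(); }

// Hypothetical helper: one object, two different behaviours.
void demo() {
    Derived d;
    assert(old_client_call(d) == "Base::bar");
    assert(virtual_call(d) == "Derived::bar");
}
```

The same object ends up serviced by two different implementations, which is how the invariant violations described above can arise.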
My intuition is that it should be changing an offset in the vtbl, without actually changing the table's size.
Well, your intuition is clearly wrong:
either there is a new entry in the vtable for the overrider, all following entries are moved, and the table grows,
or there is no new entry, and the vtable representation does not change.
Which one is true can depend on many factors.
Anyway: do not count on it.
Caution: see "In C++, does overriding an existing virtual function break ABI?" for a case where this logic doesn't hold true.
In my mind, Mark's suggestion to use g++ -fdump-class-hierarchy would be the winner here, right after having proper regression tests.
Overriding things should not change vtable layout[1]. The vtable entries themselves would be in the data segment of the library, IMHO, so a change to them should not pose a problem.
Of course, the applications need to be relinked; otherwise there is a potential for breakage if the consumer had been using a direct reference to &Derived::overriddenMethod.
I'm not sure whether a compiler would have been allowed to resolve that to &Base::overriddenMethod at all, but better safe than sorry.
[1] spelling it out: this presumes that the method was virtual to begin with!
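On the &Derived::overriddenMethod question, a small sketch of what the language itself says, independent of any particular ABI. The class names DerivedV1 and DerivedV2 and the helpers call_via_base_pointer and demo are hypothetical, and the string return type is an assumption made so the result can be checked.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

struct Base {
    virtual ~Base() = default;
    virtual std::string overriddenMethod() { return "Base"; }
};

// Version without an override in the derived class.
struct DerivedV1 : Base {};

// Version with the override added.
struct DerivedV2 : Base {
    std::string overriddenMethod() override { return "DerivedV2"; }
};

// Without an override, &DerivedV1::overriddenMethod names the inherited
// member, so its type is a pointer to member of Base.
static_assert(std::is_same<decltype(&DerivedV1::overriddenMethod),
                           std::string (Base::*)()>::value,
              "inherited member yields a pointer to member of Base");

// A call through a pointer to a virtual member still dispatches dynamically.
std::string call_via_base_pointer(Base& obj) {
    std::string (Base::*pm)() = &Base::overriddenMethod;
    return (obj.*pm)();
}

// Hypothetical helper checking both versions.
void demo() {
    DerivedV1 v1;
    DerivedV2 v2;
    assert(call_via_base_pointer(v1) == "Base");
    assert(call_via_base_pointer(v2) == "DerivedV2");
}
```

So at the language level the pointer-to-member does name the Base member when Derived has no override, and a call through it is still virtual. How a given compiler encodes such pointers is ABI-specific, so relinking remains the cautious choice.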