If I define a class like this:
class A{
public:
A(){}
virtual ~A(){}
virtual void func(){}
};
Does it mean that the virtual destructor and func are inlined?
Whether the compiler chooses to inline a function which is defined inline is entirely up to the compiler. In general, virtual functions can only be inlined when the compiler can either prove that the static type matches the dynamic type or when the compiler can safely determine the dynamic type. For example, when you use a value of type A the compiler knows that the dynamic type cannot be different and it can inline the function. When using a pointer or a reference the compiler generally cannot prove that the static type is the same as the dynamic type, and virtual functions generally need to follow the usual virtual dispatch. However, even when a pointer is used, the compiler may have enough information from the context to know the exact dynamic type. For example, MatthieuM. gave the following example:
A* a = new B;
a->func();
In this case the compiler can determine that a points to a B object and, thus, call the correct version of func() without dynamic dispatch. Without the need for the dynamic dispatch, func() could then be inlined. Of course, whether a compiler performs the corresponding analysis depends on its implementation.
As hvd correctly pointed out, the virtual dispatch can be circumvented by calling a virtual function with full qualification, e.g., a->A::func(), in which case the virtual function can also be inlined. The main reason virtual functions are generally not inlined is the need to do a virtual dispatch. With the full qualification, however, the function to be called is known.
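A minimal sketch of the two call forms, assuming int return values and the helper names call_virtual, call_qualified and check_dispatch, none of which appear in the original example. An unqualified call through a pointer goes through virtual dispatch unless the compiler can prove the dynamic type, while a qualified call names its target statically:

```cpp
#include <cassert>

struct A {
    virtual ~A() = default;
    virtual int func() { return 1; }
};

struct B : A {
    int func() override { return 2; }
};

// Unqualified call: resolved through virtual dispatch unless the
// compiler can prove the dynamic type of *a.
int call_virtual(A* a) { return a->func(); }

// Qualified call: always A::func, whatever the dynamic type is,
// so the target is known and the call is an inlining candidate.
int call_qualified(A* a) { return a->A::func(); }

// Hypothetical self-check showing the observable difference.
void check_dispatch() {
    B b;
    assert(call_virtual(&b) == 2);    // B::func, selected at run time
    assert(call_qualified(&b) == 1);  // A::func, selected statically
}
```

Note that the qualified form changes behaviour as well as performance: it deliberately skips the override in B.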
Yes, and in multiple ways. You can see some examples of devirtualization in this email I sent to the Clang mailing list about 2 years ago.
Like all optimizations, this depends on the compiler's ability to eliminate alternatives: if it can prove that the virtual call always resolves to Derived::func, then it can call it directly.
There are various situations; let us start with the semantic evidence:
SomeDerived& d where SomeDerived is final allows devirtualization of all method calls
SomeDerived& d, d.foo() where foo is final also allows devirtualization of this particular call
Then, there are situations where you know the dynamic type of the object:
SomeDerived d; => the dynamic type of d is necessarily SomeDerived
SomeDerived d; Base& b = d; => the dynamic type of b is necessarily SomeDerived
Those 4 devirtualization situations are usually solved by the compiler front-end because they require fundamental knowledge about the language semantics. I can attest that all 4 are implemented in Clang, and I would think they are also implemented in gcc.
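The four situations can be sketched as follows. The type names FinalClass and FinalMethod, the int return values and the helper functions are assumptions made for illustration; whether a given compiler actually devirtualizes each call still depends on its implementation and optimization level.

```cpp
#include <cassert>

struct Base {
    virtual ~Base() = default;
    virtual int foo() { return 0; }
    virtual int bar() { return 0; }
};

// Situation 1: the whole class is final, so no further override can exist.
struct FinalClass final : Base {
    int foo() override { return 1; }
};

// Situation 2: only foo is final; bar may still be overridden further down.
struct FinalMethod : Base {
    int foo() final { return 2; }
    int bar() override { return 20; }
};

int via_final_class(FinalClass& d) { return d.foo(); }
int via_final_method(FinalMethod& d) { return d.foo(); }

// Situations 3 and 4: the dynamic type follows from the declaration.
int via_known_object() {
    FinalMethod d;  // dynamic type is exactly FinalMethod
    Base& b = d;    // b refers to d, so the same holds for b
    return d.bar() + b.bar();
}

// Hypothetical self-check of the observable results.
void check_known_types() {
    FinalClass fc;
    FinalMethod fm;
    assert(via_final_class(fc) == 1);
    assert(via_final_method(fm) == 2);
    assert(via_known_object() == 40);
}
```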
However, there are plenty of situations where this breaks down:
struct Base { virtual void foo() = 0; };
struct Derived: Base { virtual void foo() { std::cout << "Hello, World!\n"; } };
void opaque(Base& b);
void print(Base& b) { b.foo(); }
int main() {
Derived d;
opaque(d);
print(d);
}
Even though here it is obvious that the call to foo is resolved to Derived::foo, Clang/LLVM will not optimize it. The issue is that:
Clang (front-end) does not perform inlining, thus it cannot replace print(d) by d.foo() and devirtualize the call
LLVM (back-end) does not know the semantics of the language, thus even after replacing print(d) by d.foo() it assumes that the virtual pointer of d could have been changed by opaque (whose definition is opaque, as the name implies)
I've followed efforts on the Clang and LLVM mailing list as both sets of developers reasoned about the loss of information and how to get Clang to tell LLVM: "it's okay" but unfortunately the issue is non-trivial and has not been solved yet... thus the half-assed devirtualization in the front-end to try and get all obvious cases, and some not so obvious ones (even though, by convention, the front-end is not where you implement them).
For reference, the code for the devirtualization in Clang can be found in CGExprCXX.cpp in a function called canDevirtualizeMemberFunctionCalls. It's only ~64 lines long (right now) and thoroughly commented.
I got this question when I received a code review comment saying virtual functions need not be inline.
I thought inline virtual functions could come in handy in scenarios where functions are called on objects directly. But the counter-argument that came to my mind is: why would one want to define virtual functions and then call them on objects?
Is it best not to use inline virtual functions, since they're almost never expanded anyway?
Code snippet I used for analysis:
class Temp
{
public:
virtual ~Temp()
{
}
virtual void myVirtualFunction() const
{
cout<<"Temp::myVirtualFunction"<<endl;
}
};
class TempDerived : public Temp
{
public:
void myVirtualFunction() const
{
cout<<"TempDerived::myVirtualFunction"<<endl;
}
};
int main(void)
{
TempDerived aDerivedObj;
//Compiler thinks it's safe to expand the virtual functions
aDerivedObj.myVirtualFunction();
//type of object Temp points to is always known;
//does compiler still expand virtual functions?
//I doubt compiler would be this much intelligent!
Temp* pTemp = &aDerivedObj;
pTemp->myVirtualFunction();
return 0;
}
Virtual functions can be inlined sometimes. An excerpt from the excellent C++ faq:
"The only time an inline virtual call
can be inlined is when the compiler
knows the "exact class" of the object
which is the target of the virtual
function call. This can happen only
when the compiler has an actual object
rather than a pointer or reference to
an object. I.e., either with a local
object, a global/static object, or a
fully contained object inside a
composite."
C++11 has added final. This changes the accepted answer: it's no longer necessary to know the exact class of the object, it's sufficient to know the object has at least the class type in which the function was declared final:
class A {
public:
    virtual void foo() const;
};
class B : public A {
public:
    inline virtual void foo() const final { }
};
class C : public B
{
};
void bar(B const& b) {
    A const& a = b; // Allowed, every B is an A.
    a.foo(); // Call to B::foo() can be inlined, even if b is actually a class C.
}
There is one category of virtual functions where it still makes sense to have them inline. Consider the following case:
class Base {
public:
inline virtual ~Base () { }
};
class Derived1 : public Base {
inline virtual ~Derived1 () { } // Implicitly calls Base::~Base ();
};
class Derived2 : public Derived1 {
inline virtual ~Derived2 () { } // Implicitly calls Derived1::~Derived1 ();
};
void foo (Base * base) {
delete base; // Virtual call
}
The call to delete 'base' performs a virtual call to reach the correct derived class destructor; this call is not inlined. However, because each destructor calls its parent destructor (which in these cases is empty), the compiler can inline those calls, since they do not call the base class functions virtually.
The same principle applies to base class constructors and to any set of functions where the derived implementation also calls the base class's implementation.
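To make the call order visible, here is a sketch in which the destructors append to a trace string instead of being empty. The trace helper dtor_trace and the function destroy_through_base are assumptions added for illustration. Only the delete goes through virtual dispatch; each destructor then runs its base class destructor through an ordinary, statically resolved call.

```cpp
#include <cassert>
#include <string>

// Hypothetical trace buffer, only here to make the call order observable.
std::string& dtor_trace() {
    static std::string trace;
    return trace;
}

struct Base {
    virtual ~Base() { dtor_trace() += "B"; }
};
struct Derived1 : Base {
    // Afterwards runs Base::~Base() via a non-virtual call.
    ~Derived1() override { dtor_trace() += "1"; }
};
struct Derived2 : Derived1 {
    // Afterwards runs Derived1::~Derived1() via a non-virtual call.
    ~Derived2() override { dtor_trace() += "2"; }
};

std::string destroy_through_base() {
    dtor_trace().clear();
    Base* base = new Derived2;
    delete base;  // the only virtual call: selects Derived2::~Derived2()
    return dtor_trace();
}

// Hypothetical self-check: most derived destructor body first, base last.
void check_destructor_order() {
    assert(destroy_through_base() == "21B");
}
```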
I've seen compilers that don't emit any v-table if the class has no non-inline virtual function at all (that is, none defined in an implementation file instead of a header). They would throw errors like missing vtable-for-class-A or something similar, and you would be confused as hell, as I was.
Indeed, that's not conformant with the Standard, but it happens, so consider putting at least one virtual function outside the header (if only the virtual destructor), so that the compiler can emit a vtable for the class at that place. I know it happens with some versions of gcc.
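A sketch of that workaround, shown in a single file for brevity. The class name Widget, the function value() and the file names in the comments are assumptions for illustration. In a real project the class definition would sit in a header and the out-of-line destructor in exactly one .cpp file; under the Itanium C++ ABI used by gcc and clang, the vtable is emitted in the translation unit that defines the first virtual function that is neither pure nor inline.

```cpp
#include <cassert>

// --- would go in widget.h (hypothetical file name) ---
struct Widget {
    virtual ~Widget();                        // declared only, so not inline
    virtual int value() const { return 42; }  // defined in class: implicitly inline
};

// --- would go in widget.cpp (hypothetical file name) ---
Widget::~Widget() = default;  // the single out-of-line definition

// Hypothetical usage through a reference to the class.
int use_widget() {
    Widget w;
    const Widget& ref = w;
    return ref.value();
}

void check_widget() {
    assert(use_widget() == 42);
}
```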
As someone mentioned, inline virtual functions can be a benefit sometimes, but of course most often you will use it when you do not know the dynamic type of the object, because that was the whole reason for virtual in the first place.
The compiler, however, can't completely ignore inline. It has other semantics apart from speeding up a function call. The implicit inline for in-class definitions is the mechanism which allows you to put the definition into the header: only inline functions can be defined multiple times throughout the whole program without violating any rules. In the end, it behaves as if you had defined it only once in the whole program, even though you included the header multiple times into different files linked together.
Well, actually virtual functions can always be inlined, as long as they're statically linked together: suppose we have an abstract class Base with a virtual function F and derived classes Derived1 and Derived2:
class Base {
virtual void F() = 0;
};
class Derived1 : public Base {
virtual void F();
};
class Derived2 : public Base {
virtual void F();
};
A hypothetical call b->F(); (with b of type Base*) is obviously virtual. But you (or the compiler...) could rewrite it like so (suppose typeof is a typeid-like function that returns a value that can be used in a switch):
switch (typeof(b)) {
case Derived1: b->Derived1::F(); break; // static, inlineable call
case Derived2: b->Derived2::F(); break; // static, inlineable call
case Base: assert(!"pure virtual function call!");
default: b->F(); break; // virtual call (dyn-loaded code)
}
While we still need RTTI for the typeof, the call can effectively be inlined by, basically, embedding the vtable inside the instruction stream and specializing the call for all the involved classes. This could also be generalized by specializing only a few classes (say, just Derived1):
switch (typeof(b)) {
case Derived1: b->Derived1::F(); break; // hot path
default: b->F(); break; // default virtual call, cold path
}
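The pseudocode above relies on a hypothetical typeof. A compilable approximation of the hot-path variant can be written with the standard typeid operator; the int return values and the function name call_F are assumptions for illustration, and this is a hand-written sketch of the idea, not what any particular compiler emits.

```cpp
#include <cassert>
#include <typeinfo>

struct Base {
    virtual ~Base() = default;
    virtual int F() = 0;
};
struct Derived1 : Base { int F() override { return 1; } };
struct Derived2 : Base { int F() override { return 2; } };

int call_F(Base* b) {
    // typeid on a dereferenced pointer to a polymorphic type yields
    // the dynamic type of the object (requires RTTI to be enabled).
    if (typeid(*b) == typeid(Derived1)) {
        // Hot path: qualified call, no virtual dispatch, inlinable.
        return static_cast<Derived1*>(b)->Derived1::F();
    }
    // Cold path: ordinary virtual call.
    return b->F();
}

// Hypothetical self-check: both paths give the same results as b->F().
void check_call_F() {
    Derived1 d1;
    Derived2 d2;
    assert(call_F(&d1) == 1);
    assert(call_F(&d2) == 2);
}
```

The guard is only worthwhile when the type comparison plus the inlined body is cheaper than the indirect call it replaces.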
inline really doesn't do anything - it's a hint. The compiler might ignore it, or it might inline a call even without inline if it sees the implementation and likes the idea. If code clarity is at stake, the inline should be removed.
Marking a virtual method inline helps in further optimizing virtual functions in the following two cases:
Curiously recurring template pattern (http://www.codeproject.com/Tips/537606/Cplusplus-Prefer-Curiously-Recurring-Template-Patt)
Replacing virtual methods with templates (http://www.di.unipi.it/~nids/docs/templates_vs_inheritance.html)
Virtual functions declared inline are inlined when called through objects, and the inline hint is ignored when they are called via pointers or references.
With modern compilers, it won't do any harm to inline them. Some ancient compiler/linker combos might have created multiple vtables, but I don't believe that is an issue anymore.
A compiler can only inline a function when the call can be resolved unambiguously at compile time.
Virtual functions, however, are resolved at runtime, and so the compiler cannot inline the call, since at compile time the dynamic type (and therefore the function implementation to be called) cannot be determined.
In the cases where the function call is unambiguous and the function a suitable candidate for inlining, the compiler is smart enough to inline the code anyway.
The rest of the time "inline virtual" is a nonsense, and indeed some compilers won't compile that code.
It does make sense to make virtual functions and then call them on objects rather than references or pointers. Scott Meyers recommends, in his book "Effective C++", to never redefine an inherited non-virtual function. That makes sense, because when you make a class with a non-virtual function and redefine the function in a derived class, you may be sure to use it correctly yourself, but you can't be sure others will use it correctly. Also, you may at a later date use it incorrectly yourself. So, if you make a function in a base class and you want it to be redefinable, you should make it virtual. If it makes sense to make virtual functions and call them on objects, it also makes sense to inline them.
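A small sketch of why redefining a non-virtual function is fragile; the member names nonvirt and virt and the check function are assumptions for illustration. The redefined non-virtual function is chosen by the static type of the expression, so the same object behaves differently depending on how it is accessed, whereas the virtual function does not.

```cpp
#include <cassert>

struct Base {
    virtual ~Base() = default;
    int nonvirt() const { return 1; }       // non-virtual
    virtual int virt() const { return 1; }  // virtual
};
struct Derived : Base {
    int nonvirt() const { return 2; }        // hides Base::nonvirt
    int virt() const override { return 2; }  // overrides Base::virt
};

// Hypothetical self-check showing the difference.
void check_hiding() {
    Derived d;
    Base& b = d;               // same object, different static type
    assert(d.nonvirt() == 2);  // chosen by static type Derived
    assert(b.nonvirt() == 1);  // chosen by static type Base
    assert(d.virt() == 2);     // dynamic type is known here: inlinable
    assert(b.virt() == 2);     // same result through the reference
}
```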
Actually, in some cases adding "inline" to a virtual final override can make your code fail to build, so there is sometimes a difference (at least under the VS2017 compiler)!
I had a virtual inline final override function in VS2017, compiled and linked with the C++17 standard, and for some reason the build failed when I used two projects.
I had a test project and an implementation DLL that I was unit testing. In the test project I have a "linker_includes.cpp" file that #includes the needed *.cpp files from the other project. I know... I know I can set up msbuild to use the object files from the DLL, but please bear in mind that this is a Microsoft-specific solution, while including the cpp files is independent of the build system, and it is much easier to version a cpp file than xml files, project settings and such...
What was interesting is that I was constantly getting linker errors from the test project, even if I added the definitions of the missing functions by copy-paste and not through an include! So weird. The other project built fine, and there is no connection between the two other than a project reference, which sets the build order to ensure both are always built...
I think it is some kind of bug in the compiler. I have no idea if it exists in the compiler shipped with VS2020, because I am using an older version because some SDK only works with that properly :-(
I just wanted to add that marking them as inline not only can mean something, but might even make your code fail to build in some rare circumstances! This is weird, yet good to know.
PS.: The code I am working on is computer graphics related, so I prefer inlining, and that is why I used both final and inline. I kept the final specifier in the hope that the release build is smart enough to inline it when building the DLL even without me hinting directly...
PS (Linux).: I expect the same does not happen in gcc or clang, as I routinely used to do this kind of thing there. I am not sure where this issue comes from... I prefer doing C++ on Linux, or at least with some gcc, but sometimes a project has different needs.
I'm using C++ in an embedded environment where the runtime of virtual functions does matter. I have read about the rare cases when virtual functions can be inlined, for example: Are inline virtual functions really a non-sense?
The accepted answer states that inlining is only possible when the exact class is known at compile time, for example when dealing with a local, global, or static object (not a pointer or reference to the base type). I understand the logic behind this, but I wonder if inlining would also be possible in the following case:
#include <iostream>
using namespace std;
class Base {
public:
    inline virtual void x() = 0;
};
class Derived final : public Base {
public:
    inline virtual void x(){
        cout << "inlined?";
    }
};
int main(){
    Base* a;
    Derived* b;
    b = new Derived();
    a = b;
    a->x(); //This can definitely not be inlined.
    b->x(); //Can this be inlined?
}
From my point of view the compiler should know the definitive type of *b at compile time, as Derived is a final class. Is it possible to inline the virtual function in this case? If not, then why? If yes, then does gcc (or avr-gcc, respectively) do so?
Thanks!
The first step is called devirtualization: a function call that does not go through virtual dispatch.
Compilers can and do devirtualize final methods and methods of final classes. That is almost the entire point of final.
Once devirtualized, methods can be inlined.
Some compilers can sometimes prove the dynamic type of *a and even devirtualize that call. This is less reliable. Godbolt's compiler explorer can be useful to understand which specific optimizations can happen and how they can fail.
C++11 added final.
Finally!
I understand final does two things:
Makes a class non-inheritable.
Makes (virtual) functions in a class non-overridable (in a derived class).
Both of these seem independent of each other. But take for example the following:
class Foo
{
public:
virtual void bar()
{
//do something unimportant.
}
};
class Baz final : public Foo
{
public:
void bar() /*final*/ override
{
//do something more important than Foo's bar.
}
};
From the above, I believe that since Baz is final, I should NOT need to specify that its virtual member function bar is also final. Since Baz cannot be inherited from, the question of overriding bar does not arise. However, my compiler (VC++ 2015) is very quiet about this. I have not tested this on any others at the moment.
I would be glad if someone could shed some light on this topic. A quote from the standard (if any) would be extremely appreciated. Also please state any corner cases that I am unaware of, that may cause my logical belief to fail.
So, my question is: Does a final class implicitly imply its virtual functions to be final as well? Should it? Please clarify.
The reason I am asking this is because final functions become qualified for de-virtualization, which is a great optimization. Any help is appreciated.
The reason I am asking this is because final functions become qualified for de-virtualization, which is a great optimization.
Do they? "De-virtualization" is not part of the C++ standard. Or at least, not really.
De-virtualization is merely a consequence of the "as if" rule, which states that the implementation can do whatever it likes so long as the implementation behaves "as if" it is doing what the standard says.
If the compiler can detect at compile-time that a particular call to a virtual member function, through a polymorphic type, will undeniably call a specific version of that function, then it is allowed to avoid using the virtual dispatching logic and call the function statically. That's behaving "as if" it had used the virtual dispatching logic, since the compiler can prove that this is the function that would have been called.
As such, the standard does not define when de-virtualization is allowed/forbidden. A compiler, upon inlining a function that takes a pointer to a base class type, may find that the pointer being passed points to a local stack variable declared in the function that it is being inlined into. Or the compiler may be able to trace a particular inline/call graph back to the point of origin of a particular polymorphic pointer/reference. In those cases, the compiler can de-virtualize calls into that type. But only if it's smart enough to do so.
Will a compiler devirtualize all virtual function calls to a final class, regardless of whether those methods are declared final themselves? It may. It may not. It may not even devirtualize any calls to methods declared final on the polymorphic type. That's a valid (if not particularly bright) implementation.
The question you're asking is implementation specific. It can vary from compiler to compiler.
However, a class being declared final, as you pointed out, ought to be sufficient information for the compiler to devirtualize all calls to pointers/references to the final class type. If a compiler doesn't do so, then that's a quality-of-implementation issue, not a standards one.
To quote the draft C++ standard from here [class.virtual/4]:
If a virtual function f in some class B is marked with the virt-specifier final and in a class D derived from B a function D::f overrides B::f, the program is ill-formed.
And here [class/3]:
If a class is marked with the class-virt-specifier final and it appears as a base-type-specifier in a base-clause (Clause [class.derived]), the program is ill-formed.
So, in answer to the question;
Does a final class implicitly imply its virtual functions to be final as well? Should it? Please clarify.
No, at least not formally. Any attempt to violate either rule will have the same result in both cases; the program is ill-formed and won't compile. A final class means the class cannot be derived from, so as a consequence of this, its virtual methods cannot be overridden.
Should it? At least formally, probably not; they are related but they are not the same thing. There is also no need to formally require the one to imply the other; the effect follows naturally. Any violations have the same result, a failed compilation (hopefully with appropriate error messages to distinguish the two).
To touch on your motivation for the query, the de-virtualization of the virtual calls: this is not always immediately affected by the final of the class or method (although they help); the normal rules of virtual functions and the class hierarchy apply.
If the compiler can determine that at runtime a particular method will always be called (e.g. with an automatic object, i.e. "on the stack"), it could apply such an optimisation anyway, irrespective of the method being marked final or not. These optimisations fall under the "as-if" rule, that allow the compiler to apply any transformation so long as the observable behaviour is as-if the original code had been executed.
Does a final class implicitly imply its virtual functions to be final as well?
[...]
I am asking this is because final functions become qualified for de-virtualization, which is a great optimization.
Yes, it does, for the purposes of de-virtualization, in all major compilers (including MSVC):
struct B { virtual void f() = 0; };
struct D1 : public B { void f(); };
struct D2 : public B { void f() final; };
struct D3 final : public B { void f(); };
void f1(D1& x) { x.f(); } // Not de-virtualized
void f2(D2& x) { x.f(); } // De-virtualized
void f3(D3& x) { x.f(); } // De-virtualized
My library has two classes, a base class and a derived class. In the current version of the library the base class has a virtual function foo(), and the derived class does not override it. In the next version I'd like the derived class to override it. Does this break ABI? I know that introducing a new virtual function usually does, but this seems like a special case. My intuition is that it should be changing an offset in the vtbl, without actually changing the table's size.
Obviously since the C++ standard doesn't mandate a particular ABI this question is somewhat platform specific, but in practice what breaks and maintains ABI is similar across most compilers. I'm interested in GCC's behavior, but the more compilers people can answer for the more useful this question will be ;)
It might.
You're wrong regarding the offset. The offset in the vtable is determined already. What will happen is that the Derived class constructor will replace the function pointer at that offset with the Derived override (by switching the in-class v-pointer to a new v-table). So it is, normally, ABI compatible.
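A minimal sketch of that mechanism, using hypothetical Base and Derived types (the member seen_in_ctor and the string return values are invented for illustration). While the Base constructor runs, the object still uses Base's table, so a virtual call made there reaches Base::foo. Once construction of Derived has finished, the same call through a Base reference reaches the override:

```cpp
#include <cassert>
#include <string>

struct Base {
    std::string seen_in_ctor;  // records what foo() resolved to during Base()

    // The Derived part does not exist yet here, so this call reaches Base::foo.
    Base() { seen_in_ctor = foo(); }
    virtual ~Base() = default;
    virtual std::string foo() { return "Base::foo"; }
};

struct Derived : Base {
    std::string foo() override { return "Derived::foo"; }
};

// Calls foo() through a Base reference, i.e. through ordinary virtual dispatch.
std::string call_through_base(Base& b) { return b.foo(); }

// Small self-check that can be called from main().
void demo() {
    Derived d;
    assert(d.seen_in_ctor == "Base::foo");           // during Base(): Base's table
    assert(call_through_base(d) == "Derived::foo");  // after Derived(): the override
}
```

The slot that foo() occupies is the same in both cases; only the table the object refers to differs, which is why adding an override does not by itself move anything.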
There might be an issue though, because of optimization, and especially the devirtualization of function calls.
Normally, when you call a virtual function, the compiler introduces a lookup in the vtable via the vpointer. However, if it can deduce (statically) what the exact type of the object is, it can also deduce the exact function to call and shave off the virtual lookup.
Example:
struct Base {
    virtual void foo();
    virtual void bar();
};
struct Derived: Base {
    virtual void foo();
};
int main(int argc, char* argv[]) {
    Derived d;
    d.foo(); // It is necessarily Derived::foo
    d.bar(); // It is necessarily Base::bar
}
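And in this case, if a later version of the library adds Derived::bar, code that was already compiled this way keeps calling Base::bar directly: simply linking with your new library will not pick up Derived::bar.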
This doesn't seem like something that could be particularly relied on in general - as you said C++ ABI is pretty tricky (even down to compiler options).
That said I think you could use g++ -fdump-class-hierarchy before and after you made the change to see if either the parent or child vtables change in structure. If they don't it's probably "fairly" safe to assume you didn't break ABI.
Yes, in some situations, adding a reimplementation of a virtual function will change the layout of the virtual function table. That is the case if you're reimplementing a virtual function from a base that isn't the first base class (multiple-inheritance):
// V1
struct A { virtual void f(); };
struct B { virtual void g(); };
struct C : A, B { virtual void h(); }; //does not reimplement f or g;
// V2
struct C : A, B {
    virtual void h();
    virtual void g(); //added reimplementation of g()
};
This changes the layout of C's vtable by adding an entry for g() (thanks to "Gof" for bringing this to my attention in the first place, as a comment in http://marcmutz.wordpress.com/2010/07/25/bcsc-gotcha-reimplementing-a-virtual-function/).
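As a language-level illustration of why the second base is special, here is a small sketch with the same A, B and C shape (the string return values and the call_g helper are invented for this example). The new override has to be reachable through a B reference, and on common ABIs the B subobject sits at an offset inside C, so the call must adjust the object pointer before it reaches C::g. The code only shows the observable dispatch; the actual table layout is implementation-specific and is better inspected with a compiler dump:

```cpp
#include <cassert>
#include <string>

struct A { virtual ~A() = default; virtual std::string f() { return "A::f"; } };
struct B { virtual ~B() = default; virtual std::string g() { return "B::g"; } };

// Corresponds to "V2" above: C reimplements g(), which comes from the second base.
struct C : A, B {
    virtual std::string h() { return "C::h"; }
    std::string g() override { return "C::g"; }
};

// The caller only knows about B; dispatch must still find C::g.
std::string call_g(B& b) { return b.g(); }

// Small self-check that can be called from main().
void demo() {
    C c;
    B plain;
    assert(call_g(c) == "C::g");      // reached through the B subobject of C
    assert(call_g(plain) == "B::g");  // a plain B is unaffected
}
```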
Also, as mentioned elsewhere, you get a problem if the class you're overriding the function in is used by users of your library in a way where the static type is equal to the dynamic type. This can be the case after you new'ed it:
MyClass * c = new MyClass;
c->myVirtualFunction(); // not actually virtual at runtime
or created it on the stack:
MyClass c;
c.myVirtualFunction(); // not actually virtual at runtime
The reason for this is an optimisation called "de-virtualisation". If the compiler can prove, at compile time, what the dynamic type of the object is, it may skip the indirection through the virtual function table and instead call the correct function directly.
Now, if users compiled against an old version of your library, the compiler will have inserted a call to what was then the most-derived reimplementation of the virtual method. If, in a newer version of your library, you override this virtual function in a more-derived class, code compiled against the old library will still call the old function, whereas new code, or code where the compiler could not prove the dynamic type of the object at compile time, will go through the virtual function table. So a given instance of the class may be confronted, at runtime, with calls to the base class's function that it cannot intercept, potentially creating violations of class invariants.
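The effect can be imitated inside a single program. In this sketch (all names are hypothetical), an explicitly qualified call d.Base::set(v) stands in for the direct call that an old, devirtualized client binary would contain, while the call through a Base reference stands in for code that goes through the table. It is a simulation under those assumptions, not a reproduction of a real library upgrade:

```cpp
#include <cassert>

struct Base {
    int value = 0;
    virtual ~Base() = default;
    virtual void set(int v) { value = v; }
};

// Plays the role of the newer library version: Derived now overrides set()
// in order to keep `doubled` in sync with `value`.
struct Derived : Base {
    int doubled = 0;
    void set(int v) override { value = v; doubled = 2 * v; }
    bool invariant_holds() const { return doubled == 2 * value; }
};

// Stand-in for an old client: the call was bound directly to Base::set.
void stale_client_call(Derived& d, int v) { d.Base::set(v); }

// Stand-in for a client that goes through the virtual function table.
void fresh_client_call(Base& b, int v) { b.set(v); }

// Small self-check that can be called from main().
void demo() {
    Derived d;
    fresh_client_call(d, 3);
    assert(d.invariant_holds());   // the override ran: doubled == 6
    stale_client_call(d, 5);
    assert(!d.invariant_holds());  // value == 5, doubled is still 6
}
```

The same object receives both kinds of call, which is exactly the mixed situation described above.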
My intuition is that it should be changing an offset in the vtbl, without actually changing the table's size.
Well, your intuition is clearly wrong:
either there is a new entry in the vtable for the overrider, all following entries are moved, and the table grows,
or there is no new entry, and the vtable representation does not change.
Which one is true can depend on many factors.
Anyway: do not count on it.
Caution: see "In C++, does overriding an existing virtual function break ABI?" for a case where this logic doesn't hold true.
In my mind Mark's suggestion to use g++ -fdump-class-hierarchy would be the winner here, right after having proper regression tests
Overriding things should not change vtable layout[1]. The vtable entries themselves would be in the data segment of the library, IMHO, so a change to them should not pose a problem.
Of course, the applications need to be relinked, otherwise there is a potential for breakage if the consumer had been using a direct reference to &Derived::overriddenMethod.
I'm not sure whether a compiler would have been allowed to resolve that to &Base::overriddenMethod at all, but better safe than sorry.
[1] spelling it out: this presumes that the method was virtual to begin with!
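On the pointer-to-member point, the language-level behaviour can be checked with a short sketch (DerivedV1, DerivedV2 and the helper function are hypothetical names standing in for the two library versions). When the derived class merely inherits the method, &DerivedV1::method names a member of Base and has the type int (Base::*)(). Calling a virtual function through such a pointer still dispatches virtually. This says nothing about what a particular compiler emits for a given binary, so relinking remains the cautious choice:

```cpp
#include <cassert>
#include <type_traits>

struct Base {
    virtual ~Base() = default;
    virtual int method() { return 1; }
};

// Hypothetical "v1": inherits method() without overriding it.
struct DerivedV1 : Base {};

// Hypothetical "v2": the override has been added.
struct DerivedV2 : Base {
    int method() override { return 2; }
};

// Without an override, the expression names Base's member.
static_assert(std::is_same<decltype(&DerivedV1::method), int (Base::*)()>::value,
              "inherited member: pointer to member of Base");
// With an override, it names DerivedV2's own member.
static_assert(std::is_same<decltype(&DerivedV2::method), int (DerivedV2::*)()>::value,
              "overridden member: pointer to member of DerivedV2");

// A pointer taken in the "v1" way, then used on whatever object it is given.
int call_through_member_pointer(Base& b) {
    int (Base::*pm)() = &DerivedV1::method;  // same as &Base::method
    return (b.*pm)();                        // virtual dispatch still applies
}

// Small self-check that can be called from main().
void demo() {
    Base b;
    DerivedV2 d;
    assert(call_through_member_pointer(b) == 1);
    assert(call_through_member_pointer(d) == 2);  // reaches the new override
}
```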