Why does this compile, but not link? - c++

#include <iostream>
using namespace std;
class C
{
public:
virtual void a();
};
class D : public C
{
public:
void a() { cout<<"D::a\n"; }
void b() { cout<<"D::b\n"; }
};
int main()
{
D a;
a.b();
return 0;
}
I am getting a link error about undefined reference to 'vtable for C'. What does this mean and why is it?
I know the problem is obviously that the base class has a non-pure virtual function that is never defined, but why does this bother the linker if I am never calling it? Why is it different then any other function that I declare and don't define, that if I never call it I am fine?
I am interested in the nitty-gritty details.

Most implementations of C++ compilers generate a vtable for each class, which is a table of function pointers for the virtual functions. Like any other data item, there can only be one definition of the vtable. Some C++ compilers generate this vtable when compiling the implementation of the first declared virtual function within a type (this guarantees that there is only one definition of the vtable). If you fail to provide an implementation for the first virtual function, your compiler does not generate the vtable and the linker complains about the missing vtable with a link error.
As you can see, the precise details of this depends on the implementation of your chosen compiler and linker. Not all toolchains are the same.

Q: I know the problem is obviously that the base class has a non-pure virtual function that is never defined
A: That's the answer to your question :)
Q: Why is it different then any other function that I declare and don't define, that if I never call it I am fine?
A: Because it's not just a "function". It's a virtual class method.
SUGGESTION:
Declare three different classes:
1) a simple method
2) a virtual method (as your "C" above)
3) an abstract virtual method ( = 0)
Generate assembly output (e.g. "-S" for GCC)
Compare the three cases. Carefully note if a *constructor" is created :)

If you want C to be a pure virtual base-class, you need to tell the compiler "there will be no implementation of a()", you do this by usring ... a() = 0; to the declaration of the class. The compiler is trying to fing a base-class vtable, but there isn't one!

The need for actual function definition can arise in several circumstances. You already named one of such circumstances: the function has to be defined if you call it. However, this is not all. Consider this piece of code
void foo();
int main() {
void (*p)() = &foo;
}
This code never calls foo. However, if you try to compile it, a typical compiler will complain about missing definition of foo at linking stage. So, taking the address of a function (even if you never call it) happens to be one of the situations when the function has to be defined.
And this is where the matter of virtual functions come in. Virtual functions are typically implemented through a virtual table: a table that contains pointers to the definitions of the virtual functions. This table is formed and initialized by the compiler in advance, unconditionally, regardless of whether you actually call the functions or not. The compiler will essentially implicitly take the address of each non-pure virtual function in your program and place it into the corresponding table. For this reason, each non-pure virtual function will require a definition even if you never call it.
The exact virtual mechanism is an implementation detail, so you might end up with different specific errors is your situation (like a missing function definition or missing virtual table itself), but the fundamental reason for these error is what I described above.

Related

How can a destructor be both virtual and inline in C++? [duplicate]

I got this question when I received a code review comment saying virtual functions need not be inline.
I thought inline virtual functions could come in handy in scenarios where functions are called on objects directly. But the counter-argument came to my mind is -- why would one want to define virtual and then use objects to call methods?
Is it best not to use inline virtual functions, since they're almost never expanded anyway?
Code snippet I used for analysis:
class Temp
{
public:
virtual ~Temp()
{
}
virtual void myVirtualFunction() const
{
cout<<"Temp::myVirtualFunction"<<endl;
}
};
class TempDerived : public Temp
{
public:
void myVirtualFunction() const
{
cout<<"TempDerived::myVirtualFunction"<<endl;
}
};
int main(void)
{
TempDerived aDerivedObj;
//Compiler thinks it's safe to expand the virtual functions
aDerivedObj.myVirtualFunction();
//type of object Temp points to is always known;
//does compiler still expand virtual functions?
//I doubt compiler would be this much intelligent!
Temp* pTemp = &aDerivedObj;
pTemp->myVirtualFunction();
return 0;
}
Virtual functions can be inlined sometimes. An excerpt from the excellent C++ faq:
"The only time an inline virtual call
can be inlined is when the compiler
knows the "exact class" of the object
which is the target of the virtual
function call. This can happen only
when the compiler has an actual object
rather than a pointer or reference to
an object. I.e., either with a local
object, a global/static object, or a
fully contained object inside a
composite."
C++11 has added final. This changes the accepted answer: it's no longer necessary to know the exact class of the object, it's sufficient to know the object has at least the class type in which the function was declared final:
class A {
virtual void foo();
};
class B : public A {
inline virtual void foo() final { }
};
class C : public B
{
};
void bar(B const& b) {
A const& a = b; // Allowed, every B is an A.
a.foo(); // Call to B::foo() can be inlined, even if b is actually a class C.
}
There is one category of virtual functions where it still makes sense to have them inline. Consider the following case:
class Base {
public:
inline virtual ~Base () { }
};
class Derived1 : public Base {
inline virtual ~Derived1 () { } // Implicitly calls Base::~Base ();
};
class Derived2 : public Derived1 {
inline virtual ~Derived2 () { } // Implicitly calls Derived1::~Derived1 ();
};
void foo (Base * base) {
delete base; // Virtual call
}
The call to delete 'base', will perform a virtual call to call correct derived class destructor, this call is not inlined. However because each destructor calls it's parent destructor (which in these cases are empty), the compiler can inline those calls, since they do not call the base class functions virtually.
The same principle exists for base class constructors or for any set of functions where the derived implementation also calls the base classes implementation.
I've seen compilers that don't emit any v-table if no non-inline function at all exists (and defined in one implementation file instead of a header then). They would throw errors like missing vtable-for-class-A or something similar, and you would be confused as hell, as i was.
Indeed, that's not conformant with the Standard, but it happens so consider putting at least one virtual function not in the header (if only the virtual destructor), so that the compiler could emit a vtable for the class at that place. I know it happens with some versions of gcc.
As someone mentioned, inline virtual functions can be a benefit sometimes, but of course most often you will use it when you do not know the dynamic type of the object, because that was the whole reason for virtual in the first place.
The compiler however can't completely ignore inline. It has other semantics apart from speeding up a function-call. The implicit inline for in-class definitions is the mechanism which allows you to put the definition into the header: Only inline functions can be defined multiple times throughout the whole program without a violation any rules. In the end, it behaves as you would have defined it only once in the whole program, even though you included the header multiple times into different files linked together.
Well, actually virtual functions can always be inlined, as long they're statically linked together: suppose we have an abstract class Base with a virtual function F and derived classes Derived1 and Derived2:
class Base {
virtual void F() = 0;
};
class Derived1 : public Base {
virtual void F();
};
class Derived2 : public Base {
virtual void F();
};
An hypotetical call b->F(); (with b of type Base*) is obviously virtual. But you (or the compiler...) could rewrite it like so (suppose typeof is a typeid-like function that returns a value that can be used in a switch)
switch (typeof(b)) {
case Derived1: b->Derived1::F(); break; // static, inlineable call
case Derived2: b->Derived2::F(); break; // static, inlineable call
case Base: assert(!"pure virtual function call!");
default: b->F(); break; // virtual call (dyn-loaded code)
}
while we still need RTTI for the typeof, the call can effectively be inlined by, basically, embedding the vtable inside the instruction stream and specializing the call for all the involved classes. This could be also generalized by specializing only a few classes (say, just Derived1):
switch (typeof(b)) {
case Derived1: b->Derived1::F(); break; // hot path
default: b->F(); break; // default virtual call, cold path
}
inline really doesn't do anything - it's a hint. The compiler might ignore it or it might inline a call event without inline if it sees the implementation and likes this idea. If code clarity is at stake the inline should be removed.
Marking a virtual method inline, helps in further optimizing virtual functions in following two cases:
Curiously recurring template pattern (http://www.codeproject.com/Tips/537606/Cplusplus-Prefer-Curiously-Recurring-Template-Patt)
Replacing virtual methods with templates (http://www.di.unipi.it/~nids/docs/templates_vs_inheritance.html)
Inlined declared Virtual functions are inlined when called through objects and ignored when called via pointer or references.
With modern compilers, it won't do any harm to inlibe them. Some ancient compiler/linker combos might have created multiple vtables, but I don't believe that is an issue anymore.
A compiler can only inline a function when the call can be resolved unambiguously at compile time.
Virtual functions, however are resolved at runtime, and so the compiler cannot inline the call, since at compile type the dynamic type (and therefore the function implementation to be called) cannot be determined.
In the cases where the function call is unambiguous and the function a suitable candidate for inlining, the compiler is smart enough to inline the code anyway.
The rest of the time "inline virtual" is a nonsense, and indeed some compilers won't compile that code.
It does make sense to make virtual functions and then call them on objects rather than references or pointers. Scott Meyer recommends, in his book "effective c++", to never redefine an inherited non-virtual function. That makes sense, because when you make a class with a non-virtual function and redefine the function in a derived class, you may be sure to use it correctly yourself, but you can't be sure others will use it correctly. Also, you may at a later date use it incorrectly yoruself. So, if you make a function in a base class and you want it to be redifinable, you should make it virtual. If it makes sense to make virtual functions and call them on objects, it also makes sense to inline them.
Actually in some cases adding "inline" to a virtual final override can make your code not compile so there is sometimes a difference (at least under VS2017s compiler)!
Actually I was doing a virtual inline final override function in VS2017 adding c++17 standard to compile and link and for some reason it failed when I am using two projects.
I had a test project and an implementation DLL that I am unit testing. In the test project I am having a "linker_includes.cpp" file that #include the *.cpp files from the other project that are needed. I know... I know I can set up msbuild to use the object files from the DLL, but please bear in mind that it is a microsoft specific solution while including the cpp files is unrelated to build-system and much more easier to version a cpp file than xml files and project settings and such...
What was interesting is that I was constantly getting linker error from the test project. Even if I added the definition of the missing functions by copy paste and not through include! So weird. The other project have built and there are no connection between the two other than marking a project reference so there is a build order to ensure both is always built...
I think it is some kind of bug in the compiler. I have no idea if it exists in the compiler shipped with VS2020, because I am using an older version because some SDK only works with that properly :-(
I just wanted to add that not only marking them as inline can mean something, but might even make your code not build in some rare circumstances! This is weird, yet good to know.
PS.: The code I am working on is computer graphics related so I prefer inlining and that is why I used both final and inline. I kept the final specifier to hope the release build is smart enough to build the DLL by inlining it even without me directly hinting so...
PS (Linux).: I expect the same does not happen in gcc or clang as I routinely used to do these kind of things. I am not sure where this issue comes from... I prefer doing c++ on Linux or at least with some gcc, but sometimes project is different in needs.

Performance: Inheritance and inlining? [duplicate]

I got this question when I received a code review comment saying virtual functions need not be inline.
I thought inline virtual functions could come in handy in scenarios where functions are called on objects directly. But the counter-argument came to my mind is -- why would one want to define virtual and then use objects to call methods?
Is it best not to use inline virtual functions, since they're almost never expanded anyway?
Code snippet I used for analysis:
class Temp
{
public:
virtual ~Temp()
{
}
virtual void myVirtualFunction() const
{
cout<<"Temp::myVirtualFunction"<<endl;
}
};
class TempDerived : public Temp
{
public:
void myVirtualFunction() const
{
cout<<"TempDerived::myVirtualFunction"<<endl;
}
};
int main(void)
{
TempDerived aDerivedObj;
//Compiler thinks it's safe to expand the virtual functions
aDerivedObj.myVirtualFunction();
//type of object Temp points to is always known;
//does compiler still expand virtual functions?
//I doubt compiler would be this much intelligent!
Temp* pTemp = &aDerivedObj;
pTemp->myVirtualFunction();
return 0;
}
Virtual functions can be inlined sometimes. An excerpt from the excellent C++ faq:
"The only time an inline virtual call
can be inlined is when the compiler
knows the "exact class" of the object
which is the target of the virtual
function call. This can happen only
when the compiler has an actual object
rather than a pointer or reference to
an object. I.e., either with a local
object, a global/static object, or a
fully contained object inside a
composite."
C++11 has added final. This changes the accepted answer: it's no longer necessary to know the exact class of the object, it's sufficient to know the object has at least the class type in which the function was declared final:
class A {
virtual void foo();
};
class B : public A {
inline virtual void foo() final { }
};
class C : public B
{
};
void bar(B const& b) {
A const& a = b; // Allowed, every B is an A.
a.foo(); // Call to B::foo() can be inlined, even if b is actually a class C.
}
There is one category of virtual functions where it still makes sense to have them inline. Consider the following case:
class Base {
public:
inline virtual ~Base () { }
};
class Derived1 : public Base {
inline virtual ~Derived1 () { } // Implicitly calls Base::~Base ();
};
class Derived2 : public Derived1 {
inline virtual ~Derived2 () { } // Implicitly calls Derived1::~Derived1 ();
};
void foo (Base * base) {
delete base; // Virtual call
}
The call to delete 'base', will perform a virtual call to call correct derived class destructor, this call is not inlined. However because each destructor calls it's parent destructor (which in these cases are empty), the compiler can inline those calls, since they do not call the base class functions virtually.
The same principle exists for base class constructors or for any set of functions where the derived implementation also calls the base classes implementation.
I've seen compilers that don't emit any v-table if no non-inline function at all exists (and defined in one implementation file instead of a header then). They would throw errors like missing vtable-for-class-A or something similar, and you would be confused as hell, as i was.
Indeed, that's not conformant with the Standard, but it happens so consider putting at least one virtual function not in the header (if only the virtual destructor), so that the compiler could emit a vtable for the class at that place. I know it happens with some versions of gcc.
As someone mentioned, inline virtual functions can be a benefit sometimes, but of course most often you will use it when you do not know the dynamic type of the object, because that was the whole reason for virtual in the first place.
The compiler however can't completely ignore inline. It has other semantics apart from speeding up a function-call. The implicit inline for in-class definitions is the mechanism which allows you to put the definition into the header: Only inline functions can be defined multiple times throughout the whole program without a violation any rules. In the end, it behaves as you would have defined it only once in the whole program, even though you included the header multiple times into different files linked together.
Well, actually virtual functions can always be inlined, as long they're statically linked together: suppose we have an abstract class Base with a virtual function F and derived classes Derived1 and Derived2:
class Base {
virtual void F() = 0;
};
class Derived1 : public Base {
virtual void F();
};
class Derived2 : public Base {
virtual void F();
};
An hypotetical call b->F(); (with b of type Base*) is obviously virtual. But you (or the compiler...) could rewrite it like so (suppose typeof is a typeid-like function that returns a value that can be used in a switch)
switch (typeof(b)) {
case Derived1: b->Derived1::F(); break; // static, inlineable call
case Derived2: b->Derived2::F(); break; // static, inlineable call
case Base: assert(!"pure virtual function call!");
default: b->F(); break; // virtual call (dyn-loaded code)
}
while we still need RTTI for the typeof, the call can effectively be inlined by, basically, embedding the vtable inside the instruction stream and specializing the call for all the involved classes. This could be also generalized by specializing only a few classes (say, just Derived1):
switch (typeof(b)) {
case Derived1: b->Derived1::F(); break; // hot path
default: b->F(); break; // default virtual call, cold path
}
inline really doesn't do anything - it's a hint. The compiler might ignore it or it might inline a call event without inline if it sees the implementation and likes this idea. If code clarity is at stake the inline should be removed.
Marking a virtual method inline, helps in further optimizing virtual functions in following two cases:
Curiously recurring template pattern (http://www.codeproject.com/Tips/537606/Cplusplus-Prefer-Curiously-Recurring-Template-Patt)
Replacing virtual methods with templates (http://www.di.unipi.it/~nids/docs/templates_vs_inheritance.html)
Inlined declared Virtual functions are inlined when called through objects and ignored when called via pointer or references.
With modern compilers, it won't do any harm to inlibe them. Some ancient compiler/linker combos might have created multiple vtables, but I don't believe that is an issue anymore.
A compiler can only inline a function when the call can be resolved unambiguously at compile time.
Virtual functions, however are resolved at runtime, and so the compiler cannot inline the call, since at compile type the dynamic type (and therefore the function implementation to be called) cannot be determined.
In the cases where the function call is unambiguous and the function a suitable candidate for inlining, the compiler is smart enough to inline the code anyway.
The rest of the time "inline virtual" is a nonsense, and indeed some compilers won't compile that code.
It does make sense to make virtual functions and then call them on objects rather than references or pointers. Scott Meyer recommends, in his book "effective c++", to never redefine an inherited non-virtual function. That makes sense, because when you make a class with a non-virtual function and redefine the function in a derived class, you may be sure to use it correctly yourself, but you can't be sure others will use it correctly. Also, you may at a later date use it incorrectly yoruself. So, if you make a function in a base class and you want it to be redifinable, you should make it virtual. If it makes sense to make virtual functions and call them on objects, it also makes sense to inline them.
Actually in some cases adding "inline" to a virtual final override can make your code not compile so there is sometimes a difference (at least under VS2017s compiler)!
Actually I was doing a virtual inline final override function in VS2017 adding c++17 standard to compile and link and for some reason it failed when I am using two projects.
I had a test project and an implementation DLL that I am unit testing. In the test project I am having a "linker_includes.cpp" file that #include the *.cpp files from the other project that are needed. I know... I know I can set up msbuild to use the object files from the DLL, but please bear in mind that it is a microsoft specific solution while including the cpp files is unrelated to build-system and much more easier to version a cpp file than xml files and project settings and such...
What was interesting is that I was constantly getting linker error from the test project. Even if I added the definition of the missing functions by copy paste and not through include! So weird. The other project have built and there are no connection between the two other than marking a project reference so there is a build order to ensure both is always built...
I think it is some kind of bug in the compiler. I have no idea if it exists in the compiler shipped with VS2020, because I am using an older version because some SDK only works with that properly :-(
I just wanted to add that not only marking them as inline can mean something, but might even make your code not build in some rare circumstances! This is weird, yet good to know.
PS.: The code I am working on is computer graphics related so I prefer inlining and that is why I used both final and inline. I kept the final specifier to hope the release build is smart enough to build the DLL by inlining it even without me directly hinting so...
PS (Linux).: I expect the same does not happen in gcc or clang as I routinely used to do these kind of things. I am not sure where this issue comes from... I prefer doing c++ on Linux or at least with some gcc, but sometimes project is different in needs.

Understanding linker errors

I wrote the following program with virtual functions:
struct A
{
virtual void foo() = 0;
A(){ init(); }
void init(){ foo(); }
};
struct B : A
{
virtual void foo(){ }
};
B a;
int main(){
return 0;
}
DEMO
I thought some linker-error should be ccured becuase there's no implementation of the foo was found. We got runtime error instead. Why? Why not the linker error?
The first thing you have to understand here is that a call to foo() made while the constructor of class A is active is dispatched to A::foo(), even if the full object under construction has type B and B overrides foo(). The presence of B::foo() is simply ignored.
This means that your code attempts to call A::foo(). Since A::foo() is a pure virtual function, the behavior of your code is undefined.
C++ language make no guarantees of what kind of "error" should occur in such cases. Which means that your expectations of "linker error" are completely unfounded. If a programs makes an attempt to perform a virtual call to a pure virtual function the behavior is simply undefined. That is the only thing that can be said here from the C++ language point of view.
How this undefined behavior will manifest itself in practical implementations depends on the implementation. Undefined behavior is allowed to manifest itself through compile-time errors, for example.
In your case, your program attempts to make a virtual call to pure virtual function A::foo(). In general case the compiler dispatches virtual calls dynamically, through a run-time mechanism that implements polymorphism (so called virtual method table is the most popular one). In some cases, when compiler can determine the exact type of the object used in the call, it optimizes the code and makes an ordinary direct (non-dynamic) call to a virtual function.
In practice, if a function pure virtual, its virtual method table entry contains a null pointer. A dynamic call to such function typically leads to run-time error. Meanwhile, a direct (optimized) call to such function typically leads to a compiler or linker error.
In your example the compiler did not optimize the call. It made a full-fledged dynamic call to A::foo() through the virtual method table. The null pointer in that table triggered the run-time error.
If you call your pure virtual function directly from the constructor
A() { foo(); }
a typical compiler will normally make a direct (optimized) call to foo(), which will typically lead to a linker error.
B does have an implementation of foo so there's no problem for the linker.
As far as I know, the fact that A is calling foo at a bad time is something the compiler/linker isn't required to figure out. (And although it might be simple to do such a check in this case, I'm sure we could come up with much more complicated cases that would be harder or perhaps impossible to catch.)
Your error is result of calling virtual functions from within constructors. The function called is the function in A, not more derived functions. C++ standard, section 12.7.4 states,
Member functions, including virtual functions (10.3), can be called
during construction or destruction (12.6.2). When a virtual function
is called directly or indirectly from a constructor or from a
destructor, including during the construction or destruction of the
classes non-static data members, and the object to which the call
applies is the object (call it x) under construction or destruction,
the function called is the final overrider in the constructor's or
destructor's class and not one overriding it in a more-derived class.
If the virtual function call uses an explicit class member access
(5.2.5) and the object expression refers to the complete object of x
or one of that object's base class subobjects but not x or one of its
base class subobjects, the behavior is undefined.
Now, you are cheating in your example. You are calling a normal function from your constructor and then a virtual function from your normal function. Change your code to,
struct A
{
virtual void foo() = 0;
A(){ foo(); }
};
and you'll get your error,
warning: pure virtual ‘virtual void A::foo()’ called from constructor [enabled by default]
_ZN1AC2Ev[_ZN1AC5Ev]+0x1f): undefined reference to `A::foo()'

C++ : How does Virtual Function resolve "this" pointer scope issue?

(C++,MinGW 4.4.0,Windows OS)
All that is commented in the code, except labels <1> and <2>, is my guess. Please correct me in case you think I'm wrong somewhere:
class A {
public:
virtual void disp(); //not necessary to define as placeholder in vtable entry will be
//overwritten when derived class's vtable entry is prepared after
//invoking Base ctor (unless we do new A instead of new B in main() below)
};
class B :public A {
public:
B() : x(100) {}
void disp() {std::printf("%d",x);}
int x;
};
int main() {
A* aptr=new B; //memory model and vtable of B (say vtbl_B) is assigned to aptr
aptr->disp(); //<1> no error
std::printf("%d",aptr->x); //<2> error -> A knows nothing about x
}
<2> is an error and is obvious. Why <1> is not an error? What I think is happening for this invocation is: aptr->disp(); --> (*aptr->*(vtbl_B + offset to disp))(aptr) aptr in the parameter being the implicit this pointer to the member function. Inside disp() we would have std::printf("%d",x); --> std::printf("%d",aptr->x); SAME AS std::printf("%d",this->x); So why does <1> give no error while <2> does?
(I know vtables are implementation specific and stuff but I still think it's worth asking the question)
this is not the same as aptr inside B::disp. The B::disp implementation takes this as B*, just like any other method of B. When you invoke virtual method via A* pointer, it is converted to B* first (which may even change its value so it is not necessarily equal to aptr during the call).
I.e. what really happens is something like
typedef void (A::*disp_fn_t)();
disp_fn_t methodPtr = aptr->vtable[index_of_disp]; // methodPtr == &B::disp
B* b = static_cast<B*>(aptr);
(b->*methodPtr)(); // same as b->disp()
For more complicated example, check this post http://blogs.msdn.com/b/oldnewthing/archive/2004/02/06/68695.aspx. Here, if there are multiple A bases which may invoke the same B::disp, MSVC generates different entry points with each one shifting A* pointer by different offset. This is implementation-specific, of course; other compilers may choose to store the offset somewhere in vtable for example.
The rule is:
In C++ dynamic dispatch only works for member functions functions not for member variables.
For a member variable the compiler only looksup for the symbol name in that particular class or its base classes.
In case 1, the appropriate method to be called is decided by fetching the vpt, fetching the address of the appropriate method and then calling the appropiate member function.
Thus dynamic dispatch is essentially a fetch-fetch-call instead of a normal call in case of static binding.
In Case 2: The compiler only looks for x in the scope of this Obviously, it cannot find it and reports the error.
You are confused, and it seems to me that you come from more dynamic languages.
In C++, compilation and runtime are clearly isolated. A program must first be compiled and then can be run (and any of those steps may fail).
So, going backward:
<2> fails at compilation, because compilation is about static information. aptr is of type A*, thus all methods and attributes of A are accessible through this pointer. Since you declared disp() but no x, then the call to disp() compiles but there is no x.
Therefore, <2>'s failure is about semantics, and those are defined in the C++ Standard.
Getting to <1>, it works because there is a declaration of disp() in A. This guarantees the existence of the function (I would remark that you actually lie here, because you did not defined it in A).
What happens at runtime is semantically defined by the C++ Standard, but the Standard provides no implementation guidance. Most (if not all) C++ compilers will use a virtual table per class + virtual pointer per instance strategy, and your description looks correct in this case.
However this is pure runtime implementation, and the fact that it runs does not retroactively impact the fact that the program compiled.
virtual void disp(); //not necessary to define as placeholder in vtable entry will be
//overwritten when derived class's vtable entry is prepared after
//invoking Base ctor (unless we do new A instead of new B in main() below)
Your comment is not strictly correct. A virtual function is odr-used unless it is pure (the converse does not necessarily hold) which means that you must provide a definition for it. If you don't want to provide a definition for it you must make it a pure virtual function.
If you make one of these modifications then aptr->disp(); works and calls the derived class disp() because disp() in the derived class overrides the base class function. The base class function still has to exist as you are calling it through a pointer to base. x is not a member of the base class so aptr->x is not a valid expression.

In C++, does overriding an existing virtual function break ABI?

My library has two classes, a base class and a derived class. In the current version of the library the base class has a virtual function foo(), and the derived class does not override it. In the next version I'd like the derived class to override it. Does this break ABI? I know that introducing a new virtual function usually does, but this seems like a special case. My intuition is that it should be changing an offset in the vtbl, without actually changing the table's size.
Obviously since the C++ standard doesn't mandate a particular ABI this question is somewhat platform specific, but in practice what breaks and maintains ABI is similar across most compilers. I'm interested in GCC's behavior, but the more compilers people can answer for the more useful this question will be ;)
It might.
You're wrong regarding the offset. The offset in the vtable is determined already. What will happen is that the Derived class constructor will replace the function pointer at that offset with the Derived override (by switching the in-class v-pointer to a new v-table). So it is, normally, ABI compatible.
There might be an issue though, because of optimization, and especially the devirtualization of function calls.
Normally, when you call a virtual function, the compiler introduces a lookup in the vtable via the vpointer. However, if it can deduce (statically) what the exact type of the object is, it can also deduce the exact function to call and shave off the virtual lookup.
Example:
struct Base {
virtual void foo();
virtual void bar();
};
struct Derived: Base {
virtual void foo();
};
int main(int argc, char* argv[]) {
Derived d;
d.foo(); // It is necessarily Derived::foo
d.bar(); // It is necessarily Base::bar
}
And in this case... simply linking with your new library will not pick up Derived::bar.
This doesn't seem like something that could be particularly relied on in general - as you said C++ ABI is pretty tricky (even down to compiler options).
That said I think you could use g++ -fdump-class-hierarchy before and after you made the change to see if either the parent or child vtables change in structure. If they don't it's probably "fairly" safe to assume you didn't break ABI.
Yes, in some situations, adding a reimplementation of a virtual function will change the layout of the virtual function table. That is the case if you're reimplementing a virtual function from a base that isn't the first base class (multiple-inheritance):
// V1
struct A { virtual void f(); };
struct B { virtual void g(); };
struct C : A, B { virtual void h(); }; //does not reimplement f or g;
// V2
struct C : A, B {
virtual void h();
virtual void g(); //added reimplementation of g()
};
This changes the layout of C's vtable by adding an entry for g() (thanks to "Gof" for bringing this to my attention in the first place, as a comment in http://marcmutz.wordpress.com/2010/07/25/bcsc-gotcha-reimplementing-a-virtual-function/).
Also, as mentioned elsewhere, you get a problem if the class you're overriding the function in is used by users of your library in a way where the static type is equal to the dynamic type. This can be the case after you new'ed it:
MyClass * c = new MyClass;
c->myVirtualFunction(); // not actually virtual at runtime
or created it on the stack:
MyClass c;
c.myVirtualFunction(); // not actually virtual at runtime
The reason for this is an optimisation called "de-virtualisation". If the compiler can prove, at compile time, what the dynamic type of the object is, it will not emit the indirection through the virtual function table, but instead call the correct function directly.
Now, if users compiled against an old version of you library, the compiler will have inserted a call to the most-derived reimplementation of the virtual method. If, in a newer version of your library, you override this virtual function in a more-derived class, code compiled against the old library will still call the old function, whereas new code or code where the compiler could not prove the dynamic type of the object at compile time, will go through the virtual function table. So, a given instance of the class may be confronted, at runtime, with calls to the base class' function that it cannot intercept, potentially creating violations of class invariants.
My intuition is that it should be changing an offset in the vtbl, without actually changing the table's size.
Well, your intuition is clearly wrong:
either there is a new entry in the vtable for the overrider, all following entries are moved, and the table grows,
or there is no new entry, and the vtable representation does not change.
Which one is true can depends on many factors.
Anyway: do not count on it.
Caution: see In C++, does overriding an existing virtual function break ABI? for a case where this logic doesn't hold true;
In my mind Mark's suggestion to use g++ -fdump-class-hierarchy would be the winner here, right after having proper regression tests
Overriding things should not change vtable layout[1]. The vtable entries itself would be in the datasegment of the library, IMHO, so a change to it should not pose a problem.
Of course, the applications need to be relinked, otherwise there is a potential for breakage if the consumer had been using direct reference to &Derived::overriddenMethod;
I'm not sure whether a compiler would have been allowed to resolve that to &Base::overriddenMethod at all, but better safe than sorry.
[1] spelling it out: this presumes that the method was virtual to begin with!