When do we break binary compatibility - C++

I was under the impression that doing any one of these:
Add a new public virtual method virtual void aMethod();
Add a new public non-virtual method void aMethod();
Implement a public pure-virtual method from an interface virtual void aMethod() override;
was actually breaking binary compatibility, meaning that if a project had been built against a previous version of the DLL, it would not be able to load it now that there are new methods available.
From what I have tested using Visual Studio 2012, none of these break anything. Dependency Walker reports no error and my test application was calling the appropriate method.
DLL:
class EXPORT_LIB MyClass {
public:
    void saySomething();
};
Executable:
int _tmain(int argc, _TCHAR* argv[])
{
    MyClass wTest;
    wTest.saySomething();
    return 0;
}
The only undefined behavior I found was when MyClass implemented a pure-virtual interface: from my executable I was calling one of the pure-virtual methods, and then I added a new pure-virtual method before the one used by my executable. In this case, Dependency Walker did not report any error, but at runtime the executable was actually calling the wrong method.
class IMyInterface {
public:
    virtual void foo() = 0;
};
In the executable
IMyInterface* wTest = new MyClass();
wTest->foo();
Then I change the interface without rebuilding my executable
class IMyInterface {
public:
    virtual void bar() = 0;
    virtual void foo() = 0;
};
It is now quietly calling bar() instead of foo().
So, are all three of my assumptions actually safe?
EDIT:
Doing this
class EXPORT_LIB MyClass {
public:
    virtual void saySomething();
};
Exec
MyClass wTest;
wTest.saySomething();
Then rebuild DLL with this:
class EXPORT_LIB MyClass {
public:
    virtual void saySomething2();
    virtual void saySomething();
    virtual void saySomething3();
};
It still calls the appropriate saySomething().

Breaking binary compatibility doesn't always result in the DLL not loading, in many cases you'll end up with memory corruption which may or may not be immediately obvious. It depends a lot on the specifics of what you've changed and how things were and now are laid out in memory.
Binary compatibility between DLLs is a complex subject. Let's start by looking at your three examples:
Add a new public virtual method virtual void aMethod();
This will almost certainly result in undefined behaviour. It is very much compiler dependent, but most compilers use some form of vtable for virtual methods, so adding new ones changes the layout of that table.
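As a rough sketch (the exact layout is implementation specific, and the class and method names here are invented purely for illustration), most compilers assign vtable slots in declaration order, so inserting a virtual method shifts every slot after it:

namespace v1 {                      // what the executable was built against
    class Widget {
    public:
        virtual void draw();        // vtable slot 0
        virtual void resize();      // vtable slot 1
    };
}

namespace v2 {                      // the rebuilt DLL
    class Widget {
    public:
        virtual void draw();        // slot 0
        virtual void refresh();     // slot 1 (new)
        virtual void resize();      // slot 2 (shifted)
    };
}

// An executable compiled against v1 calls resize() through slot 1, so against
// the rebuilt DLL it silently lands on refresh() instead - the same symptom
// you observed with your IMyInterface experiment.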
Add a new public non-virtual method void aMethod();
This is fine for a global function or a member function. A member function is essentially just a global function with a hidden 'this' argument. It doesn't change the memory layout of anything.
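Conceptually (this is only a sketch of the common calling convention, not something the standard guarantees), a non-virtual member function is resolved by its mangled name at link time, much like a free function that takes the object as a hidden first parameter, so adding one doesn't touch the object's layout at all:

class MyClass {
public:
    void saySomething();                 // found by name in the DLL's export table
};

// Roughly what the compiler generates: a plain function with 'this' made explicit.
// No vtable is involved, and sizeof(MyClass) is unchanged.
void MyClass_saySomething(MyClass* self);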
Implement a public pure-virtual method from an interface virtual void aMethod override;
This won't exactly cause any undefined behaviour, but as you've found, it won't do what you expect. Code that was compiled against the previous version of the library doesn't know this function has been overridden, so it will not call the new implementation; it will carry on calling the old one. This may or may not be a problem depending on your use case, and it shouldn't cause any other side effects. However, your mileage may vary here depending on which compiler you're using, so it's probably best to avoid this.
What will stop a DLL from being loaded is changing the signature of an exported function in any way (including changing its parameters or its scope) or removing a function, since the dynamic linker will then no longer be able to find it. This only applies if the function in question is actually used, because the linker only imports functions that are referenced in the code.
There are also many more ways to break binary compatibility between DLLs, which are beyond the scope of this answer. In my experience they usually follow a theme of changing the size or layout of something in memory.
Edit: I just remembered that there is an excellent article on the KDE wiki about binary compatibility in C++, including a very good list of dos and don'ts with explanations and workarounds.

C++ doesn't say.
Visual Studio generally follows COM rules, allowing you to add virtual methods to the end of your most derived class unless they are overloads.
Any non-static data member will change the binary layout as well.
Non-virtual functions don't affect binary compatibility.
Templates make a huge mess because of name mangling.
Your best bet for retaining binary compatibility is to use both the pimpl idiom and the NVI idiom quite liberally.
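For example, a bare-bones pimpl sketch (the names are invented for illustration): the exported class holds a single pointer to a hidden implementation, so you can add data members and helpers to the implementation without changing the exported class's size or vtable.

// Stable.h -- shipped to users, rarely changes
#include <memory>

class Stable {                        // add your DLL export macro as needed
public:
    Stable();
    ~Stable();
    void doWork();                    // non-virtual, stable entry point
private:
    struct Impl;                      // defined only in the .cpp
    std::unique_ptr<Impl> pimpl_;     // layout of Stable never changes
};

// Stable.cpp -- free to evolve between releases
struct Stable::Impl {
    int state = 0;                    // add members here without breaking the ABI
    void doWorkImpl() { ++state; }
};

Stable::Stable() : pimpl_(new Impl) {}
Stable::~Stable() = default;
void Stable::doWork() { pimpl_->doWorkImpl(); }

The NVI part means callers go through non-virtual, named entry points and any virtual dispatch happens inside the library, which keeps client code from binding directly to vtable slots of the exported class.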

Related

Overriding C++ pure virtual functions

I have the following classes:
Mode class
h-file:
#pragma once
class Mode
{
public:
virtual int recv() = 0;
};
Mode class
cpp-file:
-> empty
LocalMode class
h-file:
#pragma once
#include "Mode.h"
class LocalMode: public Mode
{
private:
public:
LocalMode();
int recv();
};
LocalMode class
cpp-file:
#include "LocalMode.h"
int LocalMode::recv(){
return 0;
}
Here are my questions:
Is the keyword "override" always necessary? If not, what are the best practices?
The main question:
I know the code above works for me. But I have the problem that I basically have to "copy" the function signature of the pure virtual function from the base class into my derived class. What happens if I don't know which pure virtual functions the base class has?
My implementation above implies that I have to know all the pure virtual functions available in the base class.
I tried accessing the pure virtual function via the Mode:: and LocalMode:: scopes, but in Visual Studio I simply got some error messages (I deem those error messages to be rather irrelevant to this question).
Some plug-ins/Intellisense?
I remember that in Java, IntelliSense generally helped me out and added the needed functions from the abstract class. Although I am aware that Java is a bit different in that sense (inheriting from abstract classes) than C++, I would also like to know if there are any tools which help me add those automatically.
Skimming through the internet, I could not find any examples. They all assume that all the pure virtual functions of the base class are known ...
I am just imagining that if I had an abstract class with a lot of pure virtual functions and I forgot to copy just one of them, I would get an error when instantiating ...
Thank you in advance.
Is the keyword "override" always necessary? If not, what are the best practices?
It is never "necessary": overriding works with or without this keyword. It is only there to help you prevent issues like typos etc.
struct A
{
virtual int foo();
};
struct B: public A
{
int fooo(); //whoops, not overriding, no compiler error
};
struct C: public A
{
int fooo() override; //compiler error, compiler noticed my typo
};
So, the best practice is to always use override keyword when you want to override a virtual function.
The main question: I know the code above works for me. But I have the problem, that I basically have to "copy" the function signature of the pure virtual function out of the base class into my derived class. What happens, if I don't know what pure virtual functions the base class has?
You cannot not know that. To derive from a class, the compiler requires the full definition of that class, which typically means you have #included it (and therefore have access to that definition).
My implementation above implies that I have to know all the pure virtual functions available in the base class. I tried accessing the pure virtual function by the Mode:: scope and LocalMode:: scope but in Visual Studio I simply got some error messages (I deem those error messages to be rather irrelevant to this question).
Are you looking for a reflection mechanism? There is none in C++, so you cannot, for example, get a list of the functions of a given class. If you want to call a pure virtual function from another function, that cannot work because, well, it is pure virtual.
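Note that forgetting one of the overrides is caught at compile time rather than at run time: a class that leaves any pure virtual function unimplemented is itself abstract and cannot be instantiated. A small sketch, reusing the names from your question plus a hypothetical send() you might have overlooked:

class Mode {
public:
    virtual ~Mode() = default;
    virtual int recv() = 0;
    virtual int send() = 0;     // hypothetical: the one you forgot about
};

class LocalMode : public Mode {
public:
    int recv() override { return 0; }
    // send() not overridden, so LocalMode is still abstract
};

int main() {
    // LocalMode m;             // does not compile: cannot declare a variable
    return 0;                   // of abstract type 'LocalMode'
}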
Some plug-ins/Intellisense?
That is explicitly off-topic for Stack Overflow, but there are many IDEs for C++; you shouldn't have any trouble finding them.

C++ Inheritance and dynamic libraries

The idea is the following. I have a library Version 1 with a class that looks as follows:
class MY_EXPORT MyClass
{
public:
    virtual void myMethod(int p1);
};
In Version 2, the class was modified to this:
class MY_EXPORT MyClass
{
public:
    virtual void myMethod(int p1);
    virtual void myMethod2(int p1, int p2);
};
//implementation of myMethod2 in cpp file
void MyClass::myMethod2(int p1, int p2)
{
myMethod(p1);
//...
}
Now imagine a user compiled against Version 1 of the library and extended MyClass by overriding myMethod. Now he updates the library to Version 2 without recompiling. Let's further assume the dynamic linker still successfully finds and loads the library.
The question is: if I call instance->myMethod2(1, 2); somewhere inside the library, will it work, or will the application crash? In both cases the class has no data members and is therefore the same size.
I don't think there is any point in guessing whether that app will crash or not; the behavior is undefined. The application has to be recompiled, since there was an ABI change in the library.
When the library calls instance->myMethod2(1, 2); it has to go through the virtual table that was created in the application code under the assumption that there is only one virtual method, myMethod. From that point on, you get undefined behavior. In short, you have to recompile your application when the library's ABI changes.
The KDE C++ ABI guidelines specifically prohibit such a change. Virtual tables of derived classes will not contain addresses for the new methods, so virtual calls to those methods on objects of derived classes will crash.
By changing the definition of the class without recompiling, you've violated the One Definition Rule. The user who did not recompile is using the old definition, while your library is using the new definition. This results in undefined behavior.
To see how this might manifest, consider the typical implementation of virtual functions which uses a VTable to dispatch function calls. The library user has derived a class, and this derived class has only one function in the VTable. If a pointer or reference to this class is passed into the library, and the library tries to call the second function, it will attempt to access a VTable entry that doesn't exist. This will almost always result in a crash, although nothing is guaranteed when it comes to undefined behavior.
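To make that failure mode concrete, here is a sketch of the two translation units involved (the slot-number commentary assumes a typical vtable implementation; none of it is guaranteed by the standard):

// --- Application, compiled against the Version 1 header ---
class MyClass {
public:
    virtual void myMethod(int p1);
};

class UserClass : public MyClass {
public:
    void myMethod(int p1) override { /* user's code */ }
};
// The vtable the application emits for UserClass has exactly one slot.

// --- Library, rebuilt against the Version 2 header (which adds myMethod2) ---
void useIt(MyClass* instance) {
    instance->myMethod2(1, 2);   // compiled as "call through slot 1";
                                 // a UserClass object built by the old
                                 // application has no such slot
}

If a UserClass object from the old application reaches useIt(), the dispatch reads past the end of its vtable, which typically crashes.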

Is the C++ linker smart about virtual methods used only from one class in a program?

I work on a project with extremely low unit test culture. We have almost zero unit testing and every API is static.
To be able to unit test some of my code I create wrappers like
class ApiWrapper {
public:
    virtual int Call(int foo, int bar) {   // parameter types assumed for the example
        return ApiCall(foo, bar);
    }
};
Now in my functions instead:
int myfunc() {
    return ApiCall(foo, bar);
}
I do:
int myfunc(ApiWrapper* wrapper) {
    return wrapper->Call(foo, bar);
}
This way I am able to mock that functionality. The problem is that some colleagues complain that production code should not be affected by testability needs - nonsense, I know, but that's the reality.
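For reference, the test double this enables is just another subclass (a sketch; I'm assuming Call takes and returns int as above, and EXPECT_EQ stands in for whatever assertion your test framework provides):

// Test double: overrides Call() so the real API is never touched.
class FakeApi : public ApiWrapper {
public:
    int Call(int foo, int bar) override { return 42; }   // canned result
};

// In a test:
//   FakeApi fake;
//   EXPECT_EQ(42, myfunc(&fake));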
Anyway, I believe that I read somewhere at some point that compilers are actually smart about replacing unused polymorphic behavior with a direct call ... or that if there is no class that overrides a virtual method, it becomes "normal".
I experimented, and on GCC 4.8 it does not inline or directly call the virtual method, but instead creates a vtable.
I tried to google this, but I did not find anything about it. Is this a real thing, or do I misremember ... or do I have to do something to explain it to the linker, an optimization flag or something?
Note that while in production this class is final, in the test environment it is not. This is exactly what the linker has to be smart about and detect it.
The C++ compiler will only replace a polymorphic call with a direct call if it knows for certain what the actual type is.
So in the following snippet, it will be optimized:
void f() {
    ApiWrapper x;
    x.Call(1, 2); // Can be replaced with a direct call
}
But in the general case, it can't:
void f(ApiWrapper* wrapper) {
    wrapper->Call(1, 2); // Cannot be replaced
}
You also added two conditions to your question:
if there is no class that overrides a virtual method it becomes "normal".
This will not help. Neither the C++ compiler nor the linker will look at the totality of classes to find out whether any inheritor exists. It would be futile anyway, since you can always dynamically load an instance of a new class.
By the way, this optimization (called devirtualization) is indeed performed by some JVMs, since in Java land there is a class loader which knows which classes are currently loaded.
in production this class is final
That will help! Clang, for example, will convert virtual calls to non-virtual calls if the method / method's class is marked final.
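A minimal sketch of what that looks like (whether the call is actually devirtualized still depends on the compiler and optimization level, so check the generated code if it matters):

class ApiWrapper final {            // 'final' on the class, or on the method
public:
    virtual int Call(int foo, int bar);
};

int use(ApiWrapper* w) {
    // No override of Call can exist anywhere, so the compiler is free
    // to turn this into a direct (non-virtual) call.
    return w->Call(1, 2);
}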

How are C++ vtable methods ordered *In Practice*

In theory, C++ does not have a binary interface, and the order of methods in the vtable is undefined. Change anything about a class's definition and you need to recompile every class that depends on it, in every DLL, etc.
But what I would like to know is how compilers work in practice. I would hope that they just use the order in which the methods are defined in the header/class, which would make appending additional methods safe. But they could also use a hash of the mangled names to make the layout order-independent, which would then also make it completely non-upgradable.
If people have specific knowledge of how specific versions of specific compilers work in different operating systems etc. then that would be most helpful.
Added: Ideally, linker symbols would be created for the virtual method offsets, so that the offsets would never be hard-coded into calling functions. But my understanding is that this is never done. Correct?
It appears that with Microsoft's compiler the vtable may be reordered.
The following is copied from https://marc.info/?l=kde-core-devel&m=139744177410091&w=2
I (Nicolas Alvarez) can confirm this behavior happens.
I compiled this class:
struct Testobj {
virtual void func1();
virtual void func2();
virtual void func3();
};
And a program that calls func1(); func2(); func3();
Then I added a func2(int) overload to the end:
struct Testobj {
virtual void func1();
virtual void func2();
virtual void func3();
virtual void func2(int);
};
and recompiled the class but not the program using the class.
Output of calling func1(); func2(); func3(); was
This is func1
This is func2 taking int
This is func2
This shows that if I declare func1() func2() func3() func2(int), the
vtable is laid out as func1() func2(int) func2() func3().
Tested with MSVC2010.
In MSVC 2010 they are in the order you declare them. I can't think of any rationale for another compiler doing it differently although it is an arbitrary choice. It only needs to be consistent. They are just arrays of pointers so don't worry about hashes or mangling.
No matter the order, additional virtual functions added in derived classes must come after those in the base or polymorphic casts would not work.
As far as I know, they are always in the order of declaration. This way you can always add declarations of new virtual methods at the end (or below all previous declarations of virtual methods). If you remove any virtual method or add a new one somewhere in the middle, you do need to recompile and relink everything.
I know that for sure - I already made that mistake. From my experience these rules apply to both MSVC and GCC.
Any compiler must at least place all the vtable entries for a specific class together, with those for derived classes coming either before or afterwards, and also together.
The easiest way to accomplish that is to use the header order. It is difficult to see why any compiler would do anything different, given that it would require more code, more testing, etc., and would just provide another way for mistakes to occur, with no identifiable benefit that I can see.
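As a sketch of the typical layout that falls out of header order (again, nothing here is guaranteed by the standard; the slot numbers are just the conventional result):

class Base {
public:
    virtual void a();   // typically slot 0
    virtual void b();   // typically slot 1
};

class Derived : public Base {
public:
    void b() override;  // reuses slot 1
    virtual void c();   // appended after the Base slots, typically slot 2
};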

Splitting long method maintaining class interface

In my library there's a class like this:
class Foo {
public:
void doSomething();
};
Now, the implementation of doSomething() has grown a lot and I want to split it into two methods:
class Foo {
public:
void doSomething();
private:
void doSomething1();
void doSomething2();
};
Where doSomething() implementation is this:
void Foo::doSomething() {
this->doSomething1();
this->doSomething2();
}
But now the class interface has changed. If I compile this library, all existing applications using it won't work; the external linkage has changed.
How can I avoid breaking of binary compatibility?
I guess inlining solves this problem. Is that right? And is it portable? What happens if compiler optimization un-inlines these methods?
class Foo {
public:
void doSomething();
private:
inline void doSomething1();
inline void doSomething2();
};
void Foo::doSomething1() {
/* some code here */
}
void Foo::doSomething2() {
/* some code here */
}
void Foo::doSomething() {
this->doSomething1();
this->doSomething2();
}
EDIT:
I tested this code before and after the method splitting and it seems to maintain binary compatibility. But I'm not sure this would work on every OS, with every compiler, and with more complex classes (with virtual methods, inheritance...). Sometimes I have had binary compatibility break after adding private methods like these, but I no longer remember in which particular situation. Maybe it was because the symbol table was looked up by index (as Steve Jessop notes in his answer).
Strictly speaking, changing the class definition at all (in either of the ways you show) is a violation of the One Definition Rule and leads to undefined behavior.
In practice, adding non-virtual member functions to a class maintains binary compatibility in every implementation out there, because if it didn't then you'd lose most of the benefits of dynamic libraries. But the C++ standard doesn't say much (anything?) about dynamic libraries or binary compatibility, so it doesn't guarantee what changes you can make.
So in practice, changing the symbol table doesn't matter provided that the dynamic linker looks up entries in the symbol table by name. There are more entries in the symbol table than before, but that's OK because all the old ones still have the same mangled names. It may be that with your implementation, private and/or inline functions (or any functions you specify) aren't dll-exported, but you don't need to rely on that.
I have used one system (Symbian) where entries in the symbol table were not looked up by name, they were looked up by index. On that system, when you added anything to a dynamic library you had to ensure that any new functions were added to the end of the symbol table, which you did by listing the required order in a special config file. You could ensure that binary compatibility wasn't broken, but it was fairly tedious.
So, you could check your C++ ABI or compiler/linker documentation to be absolutely sure, or just take my word for it and go ahead.
There is no problem here. The name mangling of Foo::doSomething() is always the same regardless of its implementation.
I think the ABI of the class won't change if you add non-virtual methods, because non-virtual methods are not stored in the class object but rather exist as functions with mangled names. You can add as many functions as you like as long as you don't add data members.
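For example, with GCC or Clang (the Itanium C++ ABI) the exported symbol for the public method is derived only from its qualified name and parameter types, so it is identical before and after the split; the private helpers merely add new symbols. A sketch (you can confirm the names with nm on the object file):

class Foo {
public:
    void doSomething();      // mangled as _ZN3Foo11doSomethingEv, unchanged
private:
    void doSomething1();     // new symbol: _ZN3Foo12doSomething1Ev
    void doSomething2();     // new symbol: _ZN3Foo12doSomething2Ev
};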