How are C++ vtable methods ordered *in practice*?

In theory, C++ does not define a binary interface, and the order of methods in the vtable is unspecified: change anything about a class's definition and you need to recompile every class that depends on it, in every DLL and so on.
But what I would like to know is how compilers work in practice. I would hope that they simply use the order in which the methods are declared in the header/class, which would make appending additional methods safe. They could instead use a hash of the mangled names to make the layout order-independent, but that would also make it completely non-upgradable.
If people have specific knowledge of how specific versions of specific compilers work in different operating systems etc. then that would be most helpful.
Added: Ideally, linker symbols would be created for the virtual method offsets, so that the offsets would never be hard-coded into calling functions. But my understanding is that this is never done. Correct?

It appears that with Microsoft's compiler the vtable may be reordered.
The following is copied from https://marc.info/?l=kde-core-devel&m=139744177410091&w=2
I (Nicolas Alvarez) can confirm this behavior happens.
I compiled this class:
struct Testobj {
virtual void func1();
virtual void func2();
virtual void func3();
};
And a program that calls func1(); func2(); func3();
Then I added a func2(int) overload to the end:
struct Testobj {
virtual void func1();
virtual void func2();
virtual void func3();
virtual void func2(int);
};
and recompiled the class but not the program using the class.
Output of calling func1(); func2(); func3(); was
This is func1
This is func2 taking int
This is func2
This shows that if I declare func1() func2() func3() func2(int), the
vtable is laid out as func1() func2(int) func2() func3().
Tested with MSVC2010.
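For reference, a hedged sketch of the implementation and driver that would reproduce the test above; the file names and message strings are assumptions (the original post does not show them), and the calls go through a reference so that the vtable is actually used:
// testobj.cpp - built as the library, recompiled after func2(int) is added
#include <cstdio>
#include "testobj.h"   // assumed header holding the Testobj declaration above

void Testobj::func1()    { std::printf("This is func1\n"); }
void Testobj::func2()    { std::printf("This is func2\n"); }
void Testobj::func3()    { std::printf("This is func3\n"); }
void Testobj::func2(int) { std::printf("This is func2 taking int\n"); }

// main.cpp - built once against the three-method header and NOT recompiled
void callAll(Testobj& t) {
    t.func1();   // virtual calls through a reference use the vtable slots
    t.func2();   // baked in when main.cpp was compiled
    t.func3();
}

int main() {
    Testobj t;
    callAll(t);
    return 0;
}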

In MSVC 2010 they are in the order you declare them. I can't think of any rationale for another compiler doing it differently, although it is an arbitrary choice; it only needs to be consistent. The vtable is just an array of function pointers, so don't worry about hashes or mangling.
No matter the order, additional virtual functions added in derived classes must come after those in the base, or polymorphic casts would not work.

As far as I know they are always in the order of declaration. That way you can always add declarations of new virtual methods at the end (or after all previous declarations of virtual methods). If you remove a virtual method, or add a new one somewhere in the middle, you do need to recompile and relink everything.
I know that for sure; I already made that mistake once. In my experience these rules apply to both MSVC and GCC.

Any compiler must at least place all the vtable entries for a specific class together, with those for derived classes coming either before or afterwards, and also together.
The easiest way to accomplish that is to use the header order. It is difficult to see why any compiler would do anything different, given that another scheme requires more code, more testing, etc., and just provides another way for mistakes to occur, with no identifiable benefit that I can see.
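As an illustration of that layout rule, here is a hedged sketch (class names invented, slot details simplified) of how compilers such as GCC and MSVC lay out a derived class's vtable in practice:
struct Base {
    virtual ~Base() {}
    virtual void f() {}   // occupies the slot right after the destructor slot(s)
    virtual void g() {}
};

struct Derived : Base {
    virtual void f() {}   // overrides Base::f and reuses its slot
    virtual void h() {}   // new virtual: appended after all of Base's slots
};

// Conceptually, Derived's vtable is laid out as
//   [ dtor slot(s), Derived::f, Base::g, Derived::h ]
// so the leading slots line up with Base's table and a Derived object can be
// used through a Base pointer.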

Related

When do we break binary compatibility

I was under the impression that whenever you do one of these:
Add a new public virtual method virtual void aMethod();
Add a new public non-virtual method void aMethod();
Implement a public pure-virtual method from an interface virtual void aMethod() override;
was actually breaking binary compatibility, meaning that if a project had been built against a previous version of the DLL, it would not be able to load the new one now that new methods are available.
From what I have tested using Visual Studio 2012, none of these break anything. Dependency Walker reports no error and my test application was calling the appropriate method.
DLL:
class EXPORT_LIB MyClass {
public:
void saySomething();
};
Executable:
int _tmain(int argc, _TCHAR* argv[])
{
MyClass wTest;
wTest.saySomething();
return 0;
}
The only undefined behavior I found was when MyClass implemented a pure-virtual interface and, from my executable, I was calling one of the pure-virtual methods, and I then added a new pure-virtual method before the one used by my executable. In this case, Dependency Walker did not report any error, but at runtime it was actually calling the wrong method.
class IMyInterface {
public:
virtual void foo();
};
In the executable
IMyInterface* wTest = new MyClass();
wTest->foo();
Then I change the interface without rebuilding my executable
class IMyInterface {
public:
virtual void bar();
virtual void foo();
};
It is now quietly calling bar() instead of foo().
Is it safe to rely on all three of my assumptions?
EDIT:
Doing this
class EXPORT_LIB MyClass {
public:
virtual void saySomething();
};
Exec
MyClass wTest;
wTest.saySomething();
Then rebuild the DLL with this:
class EXPORT_LIB MyClass {
public:
virtual void saySomething2();
virtual void saySomething();
virtual void saySomething3();
};
It still calls the appropriate saySomething().
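A plausible explanation for that last observation (my assumption, not something the post verifies): a call on an object whose dynamic type is statically known is typically devirtualized into a direct call by mangled name, so the reordered vtable is never consulted:
MyClass wTest;
wTest.saySomething();   // type fully known: usually compiled as a direct call
                        // to MyClass::saySomething(), bypassing the vtable

MyClass& ref = wTest;
ref.saySomething();     // calls through references or pointers are the ones
                        // that normally go through a vtable slot, which is
                        // where reordering bites (as in the IMyInterface case)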
Breaking binary compatibility doesn't always result in the DLL failing to load; in many cases you'll end up with memory corruption, which may or may not be immediately obvious. It depends a lot on the specifics of what you've changed and how things were, and now are, laid out in memory.
Binary compatibility between DLLs is a complex subject. Let's start by looking at your three examples:
Add a new public virtual method virtual void aMethod();
This almost certainly will result in undefined behaviour. It's very much compiler dependent, but most compilers use some form of vtable for virtual methods, so adding new ones will change the layout of that table.
Add a new public non-virtual method void aMethod();
This is fine for a global function or a member function. A member function is essentially just a global function with a hidden 'this' argument. It doesn't change the memory layout of anything.
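A hedged sketch of that equivalence (the names are invented; the real compiler-generated form and mangled name differ):
struct Widget {
    int value;
    int scaled(int factor) const { return value * factor; }  // member function
};

// Conceptually, the compiler treats Widget::scaled much like a free function
// taking an explicit 'this' parameter:
int Widget_scaled(const Widget* self, int factor) {
    return self->value * factor;
}
// Adding more functions of this kind changes no object layout and removes no
// existing symbol, which is why previously built callers keep working.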
Implement a public pure-virtual method from an interface virtual void aMethod() override;
This won't exactly cause any undefined behaviour, but as you've found, it won't do what you expect. Code that was compiled against the previous version of the library won't know this function has been overridden, so it will not call the new implementation; it will carry on calling the old one. This may or may not be a problem depending on your use case, but it shouldn't cause any other side effects. However, your mileage could vary here depending on what compiler you're using, so it's probably best to avoid this.
What will stop a DLL from being loaded is if you change the signature of an exported function in any way (including changing parameters and scope) or if you remove a function, since then the dynamic linker won't be able to find it. This only applies if the function in question is actually used, as the linker only imports functions that are referenced in the code.
There are also many more ways to break binary compatibility between dlls, which are beyond the scope of this answer. In my experience they usually follow a theme of changing the size or layout of something in memory.
Edit: I just remembered that there is an excellent article on the KDE wiki on binary compatibility in C++, including a very good list of dos and don'ts with explanations and workarounds.
C++ doesn't say.
Visual Studio generally follows COM rules, allowing you to add virtual methods to the end of your most derived class unless they are overloads.
Any non-static data member will change the binary layout as well.
Non-virtual functions don't affect binary compatibility.
Templates make a huge mess because of name mangling.
Your best bet to retain binary compatibility is to use both the pimpl idiom and the NVI (non-virtual interface) idiom quite liberally.
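A minimal pimpl sketch along those lines (class names invented): the public class keeps a fixed size and a fixed set of non-virtual exported functions, so fields and helpers can be added to the hidden struct without disturbing old binaries:
// widget.h - the only header clients compile against
class Widget {
public:
    Widget();
    ~Widget();
    void doWork();        // non-virtual, forwards to the implementation
private:
    struct Impl;          // defined only in widget.cpp
    Impl* impl_;          // object size is always one pointer
};

// widget.cpp - free to change between library versions
struct Widget::Impl {
    int state;
    Impl() : state(0) {}
    void doWorkImpl() { ++state; }
};

Widget::Widget() : impl_(new Impl) {}
Widget::~Widget() { delete impl_; }
void Widget::doWork() { impl_->doWorkImpl(); }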

Splitting a long method while maintaining the class interface

In my library there's a class like this:
class Foo {
public:
void doSomething();
};
Now the implementation of doSomething() has grown a lot and I want to split it into two methods:
class Foo {
public:
void doSomething();
private:
void doSomething1();
void doSomething2();
};
where the implementation of doSomething() is:
void Foo::doSomething() {
this->doSomething1();
this->doSomething2();
}
But now the class interface has changed. If I compile this library, all existing applications using it won't work: the external linkage has changed.
How can I avoid breaking binary compatibility?
I guess inlining solves this problem. Is that right? And is it portable? What happens if compiler optimization un-inlines these methods?
class Foo {
public:
void doSomething();
private:
inline void doSomething1();
inline void doSomething2();
};
void Foo::doSomething1() {
/* some code here */
}
void Foo::doSomething2() {
/* some code here */
}
void Foo::doSomething() {
this->doSomething1();
this->doSomething2();
}
EDIT:
I tested this code before and after the method splitting and it seems to maintain binary compatibility. But I'm not sure this would work on every OS and with every compiler, or with more complex classes (with virtual methods, inheritance...). Sometimes I have had binary compatibility break after adding private methods like these, but now I don't remember in which particular situation. Maybe it was because the symbol table was looked up by index (like Steve Jessop notes in his answer).
Strictly speaking, changing the class definition at all (in either of the ways you show) is a violation of the One Definition Rule and leads to undefined behavior.
In practice, adding non-virtual member functions to a class maintains binary compatibility in every implementation out there, because if it didn't then you'd lose most of the benefits of dynamic libraries. But the C++ standard doesn't say much (anything?) about dynamic libraries or binary compatibility, so it doesn't guarantee what changes you can make.
So in practice, changing the symbol table doesn't matter provided that the dynamic linker looks up entries in the symbol table by name. There are more entries in the symbol table than before, but that's OK because all the old ones still have the same mangled names. It may be that with your implementation, private and/or inline functions (or any functions you specify) aren't dll-exported, but you don't need to rely on that.
I have used one system (Symbian) where entries in the symbol table were not looked up by name, they were looked up by index. On that system, when you added anything to a dynamic library you had to ensure that any new functions were added to the end of the symbol table, which you did by listing the required order in a special config file. You could ensure that binary compatibility wasn't broken, but it was fairly tedious.
So, you could check your C++ ABI or compiler/linker documentation to be absolutely sure, or just take my word for it and go ahead.
There is no problem here. The name mangling of Foo::doSomething() is always the same regardless of its implementation.
I think the ABI of the class won't change if you add non-virtual methods, because non-virtual methods are not stored in the class object but rather as free functions with mangled names. You can add as many functions as you like as long as you don't add data members.
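To make that concrete, a hedged illustration using the Itanium C++ ABI mangling that GCC and Clang use (MSVC mangles differently, but the principle is the same):
// Before the split, the library exports (among other things):
//   _ZN3Foo11doSomethingEv    -> Foo::doSomething()
//
// After the split it exports:
//   _ZN3Foo11doSomethingEv    -> Foo::doSomething()   (unchanged)
//   _ZN3Foo12doSomething1Ev   -> Foo::doSomething1()  (new)
//   _ZN3Foo12doSomething2Ev   -> Foo::doSomething2()  (new)
//
// Old executables reference only the first symbol, which still exists under
// the same mangled name, so name-based dynamic linking keeps working.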

Change pure virtual to virtual and stay binary compatible

Can I change a pure-virtual function (in a base class) to become non-pure without running into any binary compatibility issues? (Linux, GCC 4.1)
thanks
There are no compatibility issues when you switch from pure virtual to virtual and then re-compile the code. (However, going from virtual to pure virtual may cause problems.)
The only thing you need to take care of is that the non-pure virtual method must have a body; it cannot remain unimplemented. I.e.
class A {
public:
virtual int foo ()
{
return 0; //put some content
}
};
You cannot simply write:
virtual int foo();
It will cause a linker error, even if you don't use it.
What does it mean to maintain binary compatibility to you?
The object layout will be the same, but you will be breaking the One Definition Rule unless you recompile all code, at which point binary compatibility is basically useless. Without recompiling, the ODR is broken, and while it might be the case that it works, it might also not.
In particular if all of the virtual methods in the class are either pure or defined inline, then the compiler might generate the vtable in each translation unit that includes the header and mark it as a weak symbol. Then the linker will pick one of them and discard all the others. In this situation the linker is not required to verify that all of the vtables are exactly the same and will pick one at random (or deterministically in an undefined way), and it might pick one such vtable where the method is pure virtual, which in turn might end up crashing the application if the method is called on an object of the base class.
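A hedged sketch of that situation under the GCC/Itanium ABI rules: when a class has no out-of-line, non-pure virtual function (no "key function"), every translation unit that needs the vtable emits its own weak copy and the linker keeps one of them:
// base.h - shipped to clients
struct Base {
    virtual ~Base() {}                  // defined inline
    virtual int value() { return 0; }   // was '= 0' in the old header
};
// With no out-of-line virtual function, there is no single home for Base's
// vtable: every object file that uses Base carries a weak vtable symbol.
// If some object files were built against the old header (value() pure) and
// some against the new one, the linker may keep the stale vtable, whose slot
// for value() still points at __cxa_pure_virtual and aborts if called.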

Inline and Virtual

I have read several threads about the interaction between inline and virtual when they co-exist on one function. In most cases, compilers won't consider such a function for inlining. However, does that principle also apply when a non-virtual inline member function calls a virtual function? Say:
class ABC{
public:
void callVirtual(){IAmVirtual();}
protected:
virtual void IAmVirtual();
};
What principle? I would expect the compiler to generate a call to the virtual function. The call (in effect a jump-to-function-pointer) may be inlined but the IAmVirtual function is not.
The virtual function itself is not inline, and it is not called with the qualification needed to inline it even if it were, so it can't be inlined.
The whole point of virtual functions is that the compiler generally doesn't know which of the derived class implementations will be needed at run-time, or even if extra derived classes will be dynamically loaded from shared libraries. So, in general, it's impossible to inline. The one case that the compiler can inline is when it happens to know for sure which type it's dealing with because it can see the concrete type in the code and soon afterwards - with no chance of the type having changed - see the call to the virtual function. Even then, it's not required to try to optimise or inline, it's just the only case where it's even possible.
You shouldn't try to fight this unless the profiler's proven the virtual calls are killing you. Then, first try to group a bunch of operations so one virtual call can do more work for you. If virtual dispatch is still just too slow, consider maintaining some kind of discriminated union: it's a lot less flexible and cleanly extensible, but can avoid the virtual function call overheads and allow inlining.
All that assumes you really need dynamic dispatch: some programmers and systems over-use virtual functions just because OO was the in thing 20 years ago, or they've used an OO-only language like Java. C++ has a rich selection of compile-time polymorphic mechanisms, including templates.
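A hedged sketch of the one case described above where the compiler can see the concrete type and may devirtualize and inline (names invented):
struct Shape {
    virtual double area() const { return 0.0; }
};

struct Circle : Shape {
    double r;
    Circle() : r(1.0) {}
    virtual double area() const { return 3.14159265 * r * r; }
};

double known() {
    Circle c;
    return c.area();    // static and dynamic type both known: the compiler
                        // may devirtualize and inline Circle::area()
}

double unknown(const Shape& s) {
    return s.area();    // dynamic type unknown: a vtable call in general,
                        // so no inlining (barring whole-program optimisation)
}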
In your case callVirtual() will be inlined. Any non-virtual function is a good candidate for inlining (though the final decision is obviously up to the compiler).
Virtual functions have to be looked up in the virtual method table, and as a result the compiler cannot simply inline them; the lookup is generally done at runtime. An inline function, however, may call a virtual one, and the compiler can put that call (the code that looks up the target in the VMT) inline.

Does adding a virtual function to the end of the class declaration avoid binary incompatibility?

Could someone explain to me why adding a virtual function to the end of a class declaration avoids binary incompatibility?
If I have:
class A
{
public:
virtual ~A();
virtual void someFuncA() = 0;
virtual void someFuncB() = 0;
virtual void other1() = 0;
private:
int someVal;
};
And later modify this class declaration to:
class A
{
public:
virtual ~A();
virtual void someFuncA() = 0;
virtual void someFuncB() = 0;
virtual void someFuncC() = 0;
virtual void other1() = 0;
private:
int someVal;
};
I get a coredump from another .so compiled against the previous declaration. But if I put someFuncC() at the end of the class declaration (after "int someVal"):
class A
{
public:
virtual ~A();
virtual void someFuncA() = 0;
virtual void someFuncB() = 0;
virtual void other1() = 0;
private:
int someVal;
public:
virtual void someFuncC() = 0;
};
I don't see a core dump anymore. Could someone tell me why this is? And does this trick always work?
PS. The compiler is gcc; does this work with other compilers?
From another answer:
Whether this leads to a memory leak, wipes your hard disk, gets you pregnant, makes nasty Nasal Demons chase you around your apartment, or lets everything work fine with no apparent problems, is undefined. It might be this way with one compiler and change with another, change with a new compiler version, with each new compilation, with the moon phases, your mood, or depending on the number of neutrinos that passed through the processor on the last sunny afternoon. Or it might not.
All that, and an infinite number of other possibilities, are put into one term: undefined behavior.
Just stay away from it.
If you know how a particular compiler version implements its features, you might get this to work. Or you might not. Or you might think it works, only for it to break when whoever sits in front of the computer has just eaten a yogurt.
Is there any reason why you want this to work anyway?
I suppose it works (or doesn't) the way it does because your compiler creates virtual table entries in the order the virtual functions are declared. If you mess up the order by putting a virtual function in between the others, then when someone calls other1(), someFuncC() is called instead, possibly with the wrong arguments, and definitely at the wrong moment. (Be glad it crashes immediately.)
This is, however, just a guess, and even if it is a right one, unless your gcc version comes with a document somewhere describing this, there's no guarantee it will work this way tomorrow even with the same compiler version.
I'm a bit surprised that this particular rearrangement helps at all. It's certainly not guaranteed to work.
The class you give above will normally be translated to something on this order:
typedef void (*vfunc)(void);

struct __A__impl {
    vfunc *__vtable_ptr;   /* hidden vtable pointer */
    int someVal;
};

void __A__impl__init(__A__impl *object) {
    static vfunc virtual_functions[] =
        { __A__dtor, __A__someFuncA, __A__someFuncB, __A__other1 };
    object->__vtable_ptr = virtual_functions;
}
When/if you add someFuncC, you should normally get another entry added to the class's virtual function table. If the compiler arranges it before any of the other functions, you'll run into a problem where attempting to invoke one function actually invokes another. As long as its address is at the end of the virtual function table, things should still work. C++ doesn't guarantee anything about how vtables are arranged, though (or even that there is a vtable).
With respect to normal data, (non-static) members are required to be arranged in ascending order as long as there isn't an intervening access specifier (public:, protected: or private:).
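Continuing that sketch, adding someFuncC would be expected to change the table roughly as follows (the exact destructor entries and name decoration are compiler details):
// someFuncC declared in the middle (before other1):
//   { __A__dtor, __A__someFuncA, __A__someFuncB, __A__someFuncC, __A__other1 }
//   -> old code calling other1() through its old slot now lands on someFuncC
//
// someFuncC declared after everything else:
//   { __A__dtor, __A__someFuncA, __A__someFuncB, __A__other1, __A__someFuncC }
//   -> every old slot keeps its old meaning; only a new slot is appended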
If the compiler followed the same rules when mapping virtual function declarations to vtable positions, your first attempt should work, but your second could break. Obviously enough, there's no guarantee of that though -- as long as it works consistently, the compiler can arrange the vtable about any way it wants to.
Maybe G++ puts its "private" virtual table in a different place than its public one.
At any rate, if you can somehow trick the compiler into thinking an object looks different than it really does and then use that object, you are going to invoke nasal demons. You can't depend on anything they do. This is basically what you are doing.
I suggest avoiding anything related to binary compatibility or serialization with a class. This opens up too many cans of worms, and there are not enough fish to eat them all.
When transferring data, prefer XML or an ASCII-based protocol with field definitions over a binary protocol. A flexible protocol will help people (including you) debug and maintain it. ASCII and XML protocols offer higher readability than binary ones.
Classes, objects, structs and the like should never be compared as a whole by binary image. The preferred method is for each class to implement its own comparison operators or functions. After all, the class is the expert on how to compare its data members. Some data members, such as pointers and containers, can really screw up binary comparisons.
Since processors are fast and memory is cheap, change your priorities from saving space to correctness and robustness. Optimize for speed or space only after the program is bug-free.
There are too many horror stories posted to Stack Overflow and the newsgroups about binary protocols and binary comparisons (and assignments) of objects. Let's reduce the number of new horror stories by learning from the existing ones.
Binary compatibility is a bad thing. You should use a plaintext-based system, perhaps XML.
Edit: I misread the intent of your question a little. Windows provides many native ways to share data, for example GetProcAddress, __declspec(dllexport) and __declspec(dllimport). GCC must offer something similar. Binary serialization is a bad thing.
Edit again: The question did not actually mention executables at all, nor what the binary compatibility was to be used for.