C++ ABI changes [duplicate] - c++

As far as I've understood, it is not possible to link libraries that use different versions of GCC's Application Binary Interface (ABI). Are there ABI changes to every version of GCC? Is it possible to link a library built with 4.3.1 if I use, say, GCC 4.3.2? Is there a matrix of some sort that lists all the ways I can combine GCC versions?

Since gcc-3.4.0, the ABI is forward compatible. I.E. a library made using an older release can be linked with a newer one and it should work (the reverse doesn't). Obviously, there could be bugs, but there is only one mentionned in the documentation: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33678

The official ABI page points to an ABIcheck. This tool may do, what you want.

Ugh, yikes.
How can you tell which gcc compiled a given binary? Here is the
death notice from gcc-4.7.2-1-mingw32.README.txt :
Binary incompatibility notice!
The C and C++ ABI changed in GCC 4.7.0, which means in general you can't
link together binaries compiled with this version of the compiler and
with versions before GCC 4.7.0. In particular:
The option -mms-bitfields is enabled by default, which means the bitfield layout
follows the convention of the Microsoft compiler.
C++ class-member functions now follow the __thiscall calling convention.
The compiler now assumes that the caller pops the stack for the
implicit arguments pointing to an aggregate return value. This affects
functions returning structs by value, like the complex math type.

Related

GCC ABI compatibility between versions 10 and 11 [duplicate]

As far as I've understood, it is not possible to link libraries that use different versions of GCC's Application Binary Interface (ABI). Are there ABI changes to every version of GCC? Is it possible to link a library built with 4.3.1 if I use, say, GCC 4.3.2? Is there a matrix of some sort that lists all the ways I can combine GCC versions?
Since gcc-3.4.0, the ABI is forward compatible. I.E. a library made using an older release can be linked with a newer one and it should work (the reverse doesn't). Obviously, there could be bugs, but there is only one mentionned in the documentation: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33678
The official ABI page points to an ABIcheck. This tool may do, what you want.
Ugh, yikes.
How can you tell which gcc compiled a given binary? Here is the
death notice from gcc-4.7.2-1-mingw32.README.txt :
Binary incompatibility notice!
The C and C++ ABI changed in GCC 4.7.0, which means in general you can't
link together binaries compiled with this version of the compiler and
with versions before GCC 4.7.0. In particular:
The option -mms-bitfields is enabled by default, which means the bitfield layout
follows the convention of the Microsoft compiler.
C++ class-member functions now follow the __thiscall calling convention.
The compiler now assumes that the caller pops the stack for the
implicit arguments pointing to an aggregate return value. This affects
functions returning structs by value, like the complex math type.

Please explain the C++ ABI

The common explanation for not fixing some issues with C++ is that it would break the ABI and require recompilation, but on the other hand I encounter statements like this:
Honestly, this is true for pretty much all C++ non-POD types, not just exceptions. It is possible to use C++ objects across library boundaries but generally only so long as all of the code is compiled and linked using the same tools and standard libraries. This is why, for example, there are boost binaries for all of the major versions of MSVC.
(from this SO answer)
So does C++ have a stable ABI or not?
If it does, can I mix and match executables and libraries compiled with different toolsets on the same platform (for example VC++ and GCC on Windows)? And if it does not, is there any way to do that?
And more importantly, if there is no stable ABI in C++, why are people so concerned about breaking it?
Although the C++ Standard doesn't prescribe any ABI, some actual implementations try hard to preserve ABI compatibility between versions of the toolchain. E.g. with GCC 4.x, it was possible to use a library linked against an older version of libstdc++, from a program that's compiled by a newer toolchain with a newer libstdc++. The older versions of the symbols expected by the library are provided by the newer libstdc++.so, and layouts of the classes defined in the C++ Standard Library are the same.
But when C++11 introduced the new requirements to std::string and std::list, these couldn't be implemented in libstdc++ without changing the layout of these classes. This means that, if you don't use the _GLIBCXX_USE_CXX11_ABI=0 kludge with GCC 5 and higher, you can't pass e.g. std::string objects between a GCC4-compiled library and a GCC5-compiled program. So the ABI was broken.
Some C++ implementations don't try that hard to have compatible ABI: e.g. MSVC++ doesn't provide such compatibility between major compiler releases (see this question), so one has to provide different versions of library to use with different versions of MSVC++.
So, in general, you can't mix and match libraries and executables compiled with different versions even of the same toolchain.
C++ does not have an ABI standard as of yet. They are attempts to have it in the standard; You can read following it explains it in details:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p2028r0.pdf

If clang++ and g++ are ABI incompatible, what is used for shared libraries in binary?

clang++ and g++ are ABI incompatible, even for things as core as standard containers, according to, e.g., the clang++ website.
Debian ships with C++ shared libraries, i.e. libboost, etc... that are compiled with ~something and user programs using both compiler generally work, and the library names aren't mangled with the compiler that was used for them. When you install clang, debian doesn't go and pull in duplicate versions of every C++ library installed on your system.
What's the deal? Is the ability of clang to link against distro-provided C++ libraries just way stronger than the (thankfully cautious) compiler devs describe it to be?
even for things as core as standard containers
Standard containers are not all that "core". (For typical implementations) they are implemented entirely in valid C++ in headers, and if you compile the same headers with G++ and Clang++ you'll get ABI compatible output. You should only get incompatibilities "even for things as core as standard containers" if you use different versions of the container headers, not just by using Clang instead of GCC.
Both GCC and Clang conform to a cross-vendor, cross-platform C++ ABI (originally developed for the Itanium architecture, but also used for x86, x86_64, SPARC etc.) The really core things such as class layout, name mangling, exception handling, vtables etc. are specified by that ABI and Clang and GCC both follow it.
So in other words, if you compile the same source with GCC and Clang you'll get ABI-compatible binaries.
If you want to understand this stuff better see my What's an ABI and why is it so complicated? slides.
G++ and Clang are for the vast majority completely ABI compatible. Furthermore, ABI incompatibilities for Standard containers are properties of the standard library implementation (libstdc++ or libc++), not the compiler. Therefore, there is no need for any re-compilation.
Clang could never have gotten off the ground if it was not ABI compatible with g++, as it would be basically unusable without a pre-existing large following. In fact, Clang is so compatible with GCC, they ape virtually all of g++'s command-line interface, compiler intrinsics, bugs, etc, so that you can literally just drop in Clang instead of G++ and the vast majority of the time, everything will just work.
This probably will not answer the exact question correctly:
Some time ago I tried to compile some object files wih gcc, another object files with clang. Finally I linked everything together and it worked correctly.
I believe Linux distributions uses gcc, because I examined some Makefile's of Ubuntu and CentOS and they used gcc.

Can you mix c++ compiled with different versions of the same compiler

For example could I mix a set of libraries that have been compiled in say GCC-4.6 with GCC-4.9.
I'm aware different compilers "breeds" such as VS cannot be with MinGW but can different generations of the same compiler? Are issues likely to occur? If so what?
Different generations of the same compiler sometimes can be compatible with each other, but not always. For example, GCC 4.7.0 changed its C/C++ ABI, meaning libraries compiled with 4.7.0+ and 4.7.0- are not likely to be compatible with each other (so in your example, the library compiled with 4.6 will not be compatible with the library compiled with 4.9). There can also be ABI bugs within a given compiler release, as happened in GCC 4.7.0/4.7.1:
GCC versions 4.7.0 and 4.7.1 had changes to the C++ standard library which affected the ABI in C++11 mode: a data member was added to std::list changing its size and altering the definitions of some member functions, and std::pair's move constructor was non-trivial which altered the calling convention for functions with std::pair arguments or return types. The ABI incompatibilities have been fixed for GCC version 4.7.2 but as a result C++11 code compiled with GCC 4.7.0 or 4.7.1 may be incompatible with C++11 code compiled with different GCC versions and with C++98/C++03 code compiled with any version.
The GCC ABI Policy and Guidelines page indicates they try to maintain forward compatibility, but not backward compatibility:
Versioning gives subsequent releases of library binaries the ability to add new symbols and add functionality, all the while retaining compatibility with the previous releases in the series. Thus, program binaries linked with the initial release of a library binary will still run correctly if the library binary is replaced by carefully-managed subsequent library binaries. This is called forward compatibility.
The reverse (backwards compatibility) is not true. It is not possible to take program binaries linked with the latest version of a library binary in a release series (with additional symbols added), substitute in the initial release of the library binary, and remain link compatible.
That page also has some fairly lengthy explanations on the versioning system GCC uses to mark different versions of given components, as well as an explanation for the versioning behind GCC itself:
Allowed Changes
The following will cause the library minor version number to increase, say from "libstdc++.so.3.0.4" to "libstdc++.so.3.0.5".
Adding an exported global or static data member
Adding an exported function, static or non-virtual member function
Adding an exported symbol or symbols by additional instantiations
Other allowed changes are possible.
Prohibited Changes
The following non-exhaustive list will cause the library major version number to increase, say from "libstdc++.so.3.0.4" to "libstdc++.so.4.0.0".
Changes in the gcc/g++ compiler ABI
Changing size of an exported symbol
Changing alignment of an exported symbol
Changing the layout of an exported symbol
Changing mangling on an exported symbol
Deleting an exported symbol
Changing the inheritance properties of a type by adding or removing base classes
Changing the size, alignment, or layout of types specified in the C++ standard. These may not necessarily be instantiated or otherwise exported in the library binary, and include all the required locale facets, as well as things like std::basic_streambuf, et al.
Adding an explicit copy constructor or destructor to a class that would otherwise have implicit versions. This will change the way the compiler deals with this class in by-value return statements or parameters: instead of passing instances of this class in registers, the compiler will be forced to use memory. See the section on Function Calling Conventions and APIs of the C++ ABI documentation for further details.
Note the bolded bit. In a perfect world, GCC versions with the same major release number would be binary-compatible. This isn't a perfect world, so test very very carefully before you go mixing compiler versions like this, but in general you'll probably be okay.
You can only mix generated binary files from different compilers or different versions of the same compiler if they are ABI (Application Binary Interface) compatible.
Things like:
Calling procedure
Name mangling
Thread local storage handling
are all part of the ABI.
If one of these things change, you will find that you either get linker errors, crashes or other forms of unexpected behaviour.
As a general rule, compiler vendors will often try maintaining at least backwards compatibility with older version, but there is no guarantee of this. As other have said you must either read the documentation or just recompile everything.

Are g++ and clang++ 100% binary compatible? [duplicate]

If I build a static library with llvm-gcc, then link it with a program compiled using mingw gcc, will the result work?
The same for other combinations of llvm-gcc, clang and normal gcc. I'm interested in how this works out on Linux (using normal non-mingw gcc, of course) and other platforms as well, but the emphasis is on Windows.
I'm also interested in all languages, but with a strong emphasis on C and C++ - obviously clang doesn't support Fortran etc, but I believe llvm-gcc does.
I assume they all use the ELF file format, but what about call conventions, virtual table layouts etc?
Yes, for C code Clang and GCC are compatible (they both use the GNU Toolchain for linking, in fact.) You just have to make sure that you tell clang to create compiled objects and not intermediate bitcode objects. C ABI is well-defined, so the only issue is storage format.
C++ is not portable between compilers in the slightest; different compilers use different virtual table calls, constructors, destruction, name mangling, template implementations, etc. As a rule you should assume objects from one C++ compiler will not work with another.
However yes, at the time of writing Clang++ is able to use GCC/C++ compiled libraries as well; I recently set up a rig to compile C++ programs with clang using G++'s standard runtime library and it compiles+links just fine.
I don't know the answer, but slide 10 in this presentation seems to imply that the ".o" files produced by llvmgcc contain LLVM bytecode (.bc) instead of the usual target-specific object code, so that link-time optimization is possible. However, the LLVM linker should be able to link LLVM code with code produced by "normal" GCC, as the next slide says "link in native .o files and libraries here".
LLVM is a Linux tool, I have sometimes found that Linux compilers don't work quite right on Windows. I would be curious whether you get it to work or not.
I use -m i386pep when linking clang's .o files by ld. llvm's devotion to integrating with gcc is seen openly at http://dragonegg.llvm.org/ so its very intuitive to guess llvm family will greatly be cross-compatible with gcc tool-chain.
Sorry - I was coming back to llvm after a break, and have never done much more than the tutorial. First time around, I kind of burned out after the struggle getting LLVM 2.6 to build on MinGW GCC - thankfully not a problem with LLVM 2.7.
Going through the tutorial again today I noticed in Chapter 5 of the tutorial not only a clear statement that LLVM uses the ABI (Application Binary Interface) of the platform, but also that the tutorial compiler depends on this to allow access to external functions such as sin and cos.
I still don't know whether the compatible ABI extends to C++, though. That's not an issue of call conventions so much as name mangling, struct layout and vtable layout.
Being able to make C function calls is enough for most things, there's still a few issues where I care about C++.
Hopefully they fixed it but I avoid llvm-gcc because I (also) use llvm as a cross compiler and when you use llvm-gcc -m32 on a 64 bit machine the -m32 is ignored and you get 64 bit ints which have to be faked on your 32 bit target machine. Clang does not have that bug nor does gcc. Also the more I use clang the more I like. As to your direct question, dont know, in theory these days targets have well known or used calling conventions. And you would hope both gcc and llvm conform to the same but you never know. the simplest way to find this out is to write a couple of simple functions, compile and disassemble using both tool sets and see how they pass operands to the functions.