Are g++ and clang++ 100% binary compatible?

If I build a static library with llvm-gcc, then link it with a program compiled using mingw gcc, will the result work?
The same for other combinations of llvm-gcc, clang and normal gcc. I'm interested in how this works out on Linux (using normal non-mingw gcc, of course) and other platforms as well, but the emphasis is on Windows.
I'm also interested in all languages, but with a strong emphasis on C and C++ - obviously clang doesn't support Fortran etc, but I believe llvm-gcc does.
I assume they all use the ELF file format, but what about call conventions, virtual table layouts etc?

Yes, for C code Clang and GCC are compatible (they both use the GNU Toolchain for linking, in fact.) You just have to make sure that you tell clang to create compiled objects and not intermediate bitcode objects. C ABI is well-defined, so the only issue is storage format.
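To illustrate the point about the C ABI, here is a minimal sketch of a library function exposed with C linkage, which is the usual way to keep a boundary usable from objects built by a different compiler. The function name add_checked and its error convention are hypothetical and chosen only for this example.

```cpp
#include <climits>

// Hypothetical library entry point (illustrative name, not from a real library).
// extern "C" disables C++ name mangling, so the exported symbol is simply
// `add_checked` and the call follows the platform's C calling convention.
extern "C" int add_checked(int a, int b, int* out) {
    // Signal overflow through the return value rather than a C++ exception,
    // since exceptions should not propagate across a C boundary.
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return -1;
    }
    *out = a + b;
    return 0;
}
```

A caller built with another compiler would only need the declaration extern "C" int add_checked(int, int, int*); and the compiled object or library.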
C++ is not portable between compilers in the slightest; different compilers use different virtual table calls, constructors, destruction, name mangling, template implementations, etc. As a rule you should assume objects from one C++ compiler will not work with another.
However yes, at the time of writing Clang++ is able to use GCC/C++ compiled libraries as well; I recently set up a rig to compile C++ programs with clang using G++'s standard runtime library and it compiles+links just fine.

I don't know the answer, but slide 10 in this presentation seems to imply that the ".o" files produced by llvmgcc contain LLVM bytecode (.bc) instead of the usual target-specific object code, so that link-time optimization is possible. However, the LLVM linker should be able to link LLVM code with code produced by "normal" GCC, as the next slide says "link in native .o files and libraries here".
LLVM is a Linux tool, and I have sometimes found that Linux compilers don't work quite right on Windows. I would be curious whether you get it to work or not.

I use -m i386pep when linking clang's .o files with ld. LLVM's commitment to integrating with gcc is openly visible at http://dragonegg.llvm.org/, so it's reasonable to guess that the LLVM family will be largely cross-compatible with the gcc toolchain.

Sorry - I was coming back to llvm after a break, and have never done much more than the tutorial. First time around, I kind of burned out after the struggle getting LLVM 2.6 to build on MinGW GCC - thankfully not a problem with LLVM 2.7.
Going through the tutorial again today I noticed in Chapter 5 of the tutorial not only a clear statement that LLVM uses the ABI (Application Binary Interface) of the platform, but also that the tutorial compiler depends on this to allow access to external functions such as sin and cos.
I still don't know whether the compatible ABI extends to C++, though. That's not an issue of call conventions so much as name mangling, struct layout and vtable layout.
Being able to make C function calls is enough for most things, but there are still a few issues where I care about C++.
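One way to probe the struct and vtable layout question empirically is to record sizes and offsets in a small report and compare the numbers from each compiler. This is only a sketch: the types Packet and Shape and the function describe_layout are made up for the example, and matching numbers are a hint of compatibility, not proof.

```cpp
#include <cstddef>

// Hypothetical plain struct used to inspect padding and member offsets.
struct Packet {
    char tag;
    int value;
    double weight;
};

// Hypothetical polymorphic class; its size includes the hidden vtable pointer.
struct Shape {
    virtual ~Shape() {}
    virtual double area() const { return 0.0; }
    int id;
};

// Values to compare between builds made with different compilers.
struct LayoutReport {
    std::size_t packet_size;
    std::size_t value_offset;
    std::size_t weight_offset;
    std::size_t shape_size;
};

LayoutReport describe_layout() {
    LayoutReport report = {
        sizeof(Packet),
        offsetof(Packet, value),
        offsetof(Packet, weight),
        sizeof(Shape)
    };
    return report;
}
```

Building this once with g++ and once with clang++ for the same target and printing the four fields gives a quick first comparison.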

Hopefully they have fixed it, but I avoid llvm-gcc because I (also) use LLVM as a cross compiler, and when you use llvm-gcc -m32 on a 64-bit machine the -m32 is ignored and you get 64-bit ints, which have to be faked on your 32-bit target machine. Clang does not have that bug, nor does gcc. Also, the more I use clang the more I like it. As to your direct question: I don't know. In theory, targets these days have well-known or widely used calling conventions, and you would hope both gcc and LLVM conform to the same ones, but you never know. The simplest way to find out is to write a couple of simple functions, compile and disassemble them using both toolsets, and see how they pass operands to the functions.
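The experiment suggested above could start from a file like the following sketch. The file name probe.cpp, the struct Pair and the function names are all hypothetical; the functions are chosen to exercise integer arguments, a floating-point argument and a struct passed by value.

```cpp
// probe.cpp (hypothetical file name)
// Compile to assembly with each toolchain and compare the results, e.g.
//   g++ -O1 -S probe.cpp -o probe_gcc.s
//   clang++ -O1 -S probe.cpp -o probe_clang.s

struct Pair {
    long first;
    long second;
};

// Seven integer arguments: on many targets some go in registers, the rest on the stack.
int sum_ints(int a, int b, int c, int d, int e, int f, int g) {
    return a + b + c + d + e + f + g;
}

// Mixed floating-point and integer arguments.
double scale(double x, int factor) {
    return x * factor;
}

// A small struct passed by value.
long sum_pair(Pair p) {
    return p.first + p.second;
}
```

If both assembly listings read the arguments from the same registers and stack slots, the two compilers agree on the calling convention for these cases.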

Related

If clang++ and g++ are ABI incompatible, what is used for shared libraries in binary?

clang++ and g++ are ABI incompatible, even for things as core as standard containers, according to, e.g., the clang++ website.
Debian ships with C++ shared libraries, i.e. libboost, etc... that are compiled with ~something and user programs using both compiler generally work, and the library names aren't mangled with the compiler that was used for them. When you install clang, debian doesn't go and pull in duplicate versions of every C++ library installed on your system.
What's the deal? Is the ability of clang to link against distro-provided C++ libraries just way stronger than the (thankfully cautious) compiler devs describe it to be?
"even for things as core as standard containers"
Standard containers are not all that "core". (For typical implementations) they are implemented entirely in valid C++ in headers, and if you compile the same headers with G++ and Clang++ you'll get ABI compatible output. You should only get incompatibilities "even for things as core as standard containers" if you use different versions of the container headers, not just by using Clang instead of GCC.
Both GCC and Clang conform to a cross-vendor, cross-platform C++ ABI (originally developed for the Itanium architecture, but also used for x86, x86_64, SPARC etc.) The really core things such as class layout, name mangling, exception handling, vtables etc. are specified by that ABI and Clang and GCC both follow it.
So in other words, if you compile the same source with GCC and Clang you'll get ABI-compatible binaries.
If you want to understand this stuff better see my What's an ABI and why is it so complicated? slides.
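Name mangling under that shared ABI can be inspected with abi::__cxa_demangle from <cxxabi.h>, which is available with both GCC and Clang on platforms that follow the Itanium C++ ABI. The wrapper name demangle below is just a convenience for this sketch.

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Convert an Itanium-ABI mangled symbol into readable form.
// Falls back to the input string if demangling fails.
std::string demangle(const char* mangled) {
    int status = 0;
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return mangled;
    }
    std::string result(text);
    std::free(text);  // the buffer is allocated with malloc
    return result;
}
```

For example, demangle("_Z3fooi") yields "foo(int)" regardless of which of the two compilers produced the symbol.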
G++ and Clang are, for the most part, completely ABI compatible. Furthermore, ABI incompatibilities for standard containers are properties of the standard library implementation (libstdc++ or libc++), not of the compiler. Therefore, there is no need for any recompilation.
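Since the standard library is what matters here, a quick check is to see which implementation a translation unit was actually built against. The sketch below relies on the macros _LIBCPP_VERSION (defined by libc++ headers) and __GLIBCXX__ (defined by libstdc++ headers); the function name is hypothetical.

```cpp
#include <string>  // any standard header pulls in the library's version macros

// Report which C++ standard library implementation this file was compiled against.
std::string standard_library_name() {
#if defined(_LIBCPP_VERSION)
    return "libc++";
#elif defined(__GLIBCXX__)
    return "libstdc++";
#else
    return "unknown";
#endif
}
```

Objects that report the same library (and the same library version) are the ones that can be expected to exchange standard containers safely.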
Clang could never have gotten off the ground if it were not ABI compatible with g++, as it would have been basically unusable without a large pre-existing following. In fact, Clang is so compatible with GCC that it apes virtually all of g++'s command-line interface, compiler intrinsics, bugs, etc., so that you can literally just drop in Clang instead of G++ and, the vast majority of the time, everything will just work.
This probably will not answer the exact question correctly:
Some time ago I compiled some object files with gcc and other object files with clang. Finally I linked everything together and it worked correctly.
I believe Linux distributions use gcc, because I examined some Makefiles from Ubuntu and CentOS and they used gcc.

How is clang able to steer C/C++ code optimization?

I was told that clang is a driver that works like gcc to do preprocessing, compilation and linkage work. During the compilation and linkage, as far as I know, it's actually llvm that does the optimization ("-O1", "-O2", "-O3", "-Os", "-flto").
But I just cannot understand how llvm is involved.
It seems that compiling source code doesn't even need a static library such as libLLVMCore.a; instead, on Debian the clang package depends on another package called libllvm-3.4 (the clang version is 3.4), which contains libLLVM-3.4.so(.1). Does clang use this shared library for optimization?
I've checked clang source code for a while and found that include/clang/Driver/Options.td contains the related options, but unfortunately I failed to find the source files that include that file, so I'm still not aware of the mechanism.
I hope someone might give me some hints.
(TL;DontWannaRead - skip to the end of this answer)
To answer your question properly you first need to understand the difference between a compiler's front-end and back-end (especially the first one).
Clang is a compiler front-end (http://en.wikipedia.org/wiki/Clang) for C, C++, Objective C and Objective C++ languages.
Clang's duty is translating from C++ source code (or C, or Objective C, etc.) to LLVM IR, a textual lower-level representation of what that code should do. In order to do this Clang employs a number of sub-modules whose descriptions you could find in any decent compiler construction book: lexer, parser, a semantic analyzer (Sema), etc.
LLVM is a set of libraries whose primary task is the following: suppose we have the LLVM IR representation of the following C++ function
int double_this_number(int num) {
    int result = 0;
    result = num;
    result = result * 2;
    return result;
}
then the core of the LLVM passes should optimize that LLVM IR code, for example by removing the redundant intermediate assignments.
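As a rough source-level picture of that transformation (written in C++ rather than LLVM IR, and only an assumption about what typical optimization passes would produce), the function can be compared with a hand-simplified equivalent. The name double_this_number_simplified is made up for this sketch.

```cpp
// The function exactly as given in the text.
int double_this_number(int num) {
    int result = 0;
    result = num;
    result = result * 2;
    return result;
}

// Hand-written equivalent of what an optimizer would be expected to reduce it to:
// the dead store of 0 and the temporary variable disappear.
int double_this_number_simplified(int num) {
    return num * 2;
}
```

Both functions return the same value for every input that does not overflow, which is the property an optimization pass has to preserve.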
What to do with the optimized LLVM IR code is entirely up to you: you can translate it to x86_64 executable code or modify it and then spit it out as ARM executable code or GPU executable code. It depends on the goal of your project.
The term "back-end" is often confusing since there are many papers that would define the LLVM libraries a "middle end" in a compiler chain and define the "back end" as the final module which does the code generation (LLVM IR to executable code or something else which no longer needs processing by the compiler). Other sources refer to LLVM as a back end to Clang. Either way, their role is clear and they offer a powerful mechanism: whatever the language you're targeting (C++, C, Objective C, Python, etc..) if you have a front-end which translates it to LLVM IR, you can use the same set of LLVM libraries to optimize it and, as long as you have a back-end for your target architecture, you can generate optimized executable code.
Recalling that LLVM is a set of libraries (not just optimization passes but also data structures, utility modules, diagnostic modules, etc..), Clang also leverages many LLVM libraries during its front-ending process. You can't really tear every LLVM module away from Clang since the latter is built on the former set.
As for the reason why Clang is said to be a "compilation driver": Clang interprets the command-line parameters (descriptions and many declarations are TableGen'd, so they might require a bit more than a simple grep to swim through the sources), decides which Jobs and phases are to be executed, sets up the CodeGenOptions according to the desired/possible optimization and transformation levels, and invokes the appropriate modules (clangCodeGen in BackendUtil.cpp is the one that populates a module pass manager with the optimizations to apply) and tools (e.g. the Windows ld linker). It steers the compilation process from the very beginning to the end.
Finally, I would suggest reading the Clang and LLVM documentation. It is pretty clear, and it is the first place you should look for answers to most of your questions.
It's not exactly like GCC, so don't spend too much time trying to match the two precisely.
The LLVM compiler is a compiler for one specific language, LLVM. What Clang does is compile C++ code to LLVM, without optimizations. Clang can then invoke the LLVM compiler to compile that LLVM code to optimized assembly.

Does using -std=c++11 break binary compatibility?

I've looked hard for this question - it seems an obvious one to ask - but I haven't found it: Is a module compiled with "-std=c++11" (g++) binary compatible with modules that are not compiled with the option? (That is, can I link them together safely?) Both compilations would use the exact same version of g++.
To be more precise, using gcc 4.9.0, can I use "-std=c++11" only on specific compilation units and let the others compile without the option?
An authoritative reference can be found in gcc's C++11 ABI Compatibility page.
The short summary is: there are no language reasons for the ABI to break, but there are a number of mandated changes which cause the standard C++ library shipping with gcc to change.
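When mixing translation units like this, it can help to record which language standard and which library ABI each one was built with. The sketch below uses __cplusplus and the libstdc++ macro _GLIBCXX_USE_CXX11_ABI; note that this macro only exists from GCC 5 onwards, which is newer than the gcc 4.9.0 in the question, so on older versions the second function simply returns -1. Both function names are hypothetical.

```cpp
#include <string>  // pulls in the library configuration macros

// Language standard this translation unit was compiled with (e.g. 201103L for C++11).
long language_standard() {
    return __cplusplus;
}

// libstdc++ dual-ABI setting: 1 = new (C++11-conforming) ABI, 0 = old ABI,
// -1 = macro not defined (older libstdc++ or a different standard library).
int libstdcxx_cxx11_abi() {
#if defined(_GLIBCXX_USE_CXX11_ABI)
    return _GLIBCXX_USE_CXX11_ABI;
#else
    return -1;
#endif
}
```

Two objects that report different values from libstdcxx_cxx11_abi should not pass types such as std::string or std::list to each other.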

Compiling and linking with different versions of gcc on Linux

I am planning to compile a static library (mylib.a) with gcc 4.7.1. I want to take advantage of C++11, so -std=c++11 is used. The platform where I compile this library is x86_64 SLES 11 with glibc-2.8.
Then I want to link this static library on a legacy platform with legacy code, so I must use gcc 4.1.2 for compiling the legacy code and for linking. In my library headers I will therefore not use any C++11-specific code. I will also link libstdc++.a from gcc 4.7.1. The platform where I want to link mylib.a, libstdc++.a (gcc 4.7.1) and the legacy object files is x86_64 SLES 10 with glibc-2.4.
I tried all of this mess with some dummy C++11 code (std::async()) in mylib.a and it worked. I think this is possible only because of the ELF requirements. Am I thinking correctly, or does ELF have nothing to do with it? What kind of errors should I expect if mylib.a contains some truly complex logic?
Linux has a C++ Application Binary Interface (ABI), which has been around for a while. This means that the calling conventions and name mangling across compilers on Linux are fixed. Therefore, as long as the libraries are compatible, you should be able to compile with different compilers (or different versions of the same compiler) and have code which correctly and reliably links together.
Not entirely the ELF requirements per se...
GCC guarantees binary compatibility all the way back to some ancient version of 3. As long as the libstdc++ you're linking to has the new library features, there's no reason you can't use them. You will just have to stay away from the new language and library features in code compiled with GCC 4.1.2.
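The split described above could look like the following sketch: the declaration uses only constructs the old compiler understands, while the definition, built with the newer compiler and -std=c++11, is free to use C++11 internally. The function sum_of_squares is hypothetical, and in a real project the declaration would sit in a header shared by both sides.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Interface part: C++98-compatible, so code compiled with the old compiler can include it.
long sum_of_squares(const int* data, std::size_t count);

// Implementation part: would be compiled with the newer compiler and -std=c++11.
// The lambda never appears in the interface, so callers do not need C++11 support.
long sum_of_squares(const int* data, std::size_t count) {
    std::vector<int> values(data, data + count);
    return std::accumulate(values.begin(), values.end(), 0L,
                           [](long total, int v) {
                               return total + static_cast<long>(v) * v;
                           });
}
```

Keeping the interface to built-in types and pointers also avoids passing standard library objects between code built against different libstdc++ headers.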

Clang vs. LLVMC -- what's the difference?

What's the difference between llvmc.exe and clang.exe? Which one do I use for compiling C or C++ code?
llvmc is a frontend for various programs in the LLVM toolchain, in particular the llvm-* ones, ie by default it will try to use llvm-gcc and llvm-g++ to compile C and C++ files.
You can pass -clang to llvmc if that's what you want to use, and it's probably possible to configure llvmc so clang will be used by default, but I have no idea how to do that.
I'd recommend to just use clang and clang++ directly, which can be used as drop-in replacements for gcc and g++.
llvmc was an experimental driver that was intended to support multiple different source languages. Clang and Clang++ have always been the preferred way to drive the (C / C++ / Objective-C) compiler. In fact, llvmc has been removed from mainline.
In short, you should definitely use "clang" and never "llvmc".
LLVM originally stood for Low-Level Virtual Machine, and today it is mostly used either as a backend optimizer/compiler or as a JIT compiler.
On the other hand, Clang is a collection of libraries for dealing with the C language family that notably contains a compiler (clang) which acts as a front-end for C, C++, Objective-C and Objective-C++ on top of the LLVM libraries.
So, in your case, you will want to use clang and clang++ to compile C and C++ respectively, and don't worry about the fact that LLVM is used behind the scenes to optimize your code and deal with generation of machine instructions adapted to your architecture.