How does C++ linkage work in this case?

I built a C++ library for making games on most well-known platforms, and it works fine. However, I ran into a minor problem with the linker that I can't stop thinking about until I figure out the answer.
I have two separate classes in two separate files: KinesisWorld in KinesisWorld.h and ASEngine in ASEngine.h. Both follow a similar implementation pattern: each encapsulates functionality provided by another library. KinesisWorld inherits from and calls Box2D, and ASEngine calls AngelScript. So far so good.
Now, when I build an application that uses my library, as long as I don't include or use KinesisWorld, the linker won't need the Box2D library. As soon as I include KinesisWorld.h anywhere in the final executable's source, it immediately complains about undefined references until I add Box2D to the linker command. This is the behavior I have always known and observed with other libraries as well.
With AngelScript, however, even without including ASEngine.h at all, the linker always complains about undefined references to AngelScript, even though the final executable doesn't reference it anywhere.
Any idea what could cause this? I tested it on 32-bit Linux and 32-bit Windows, with GCC and Visual Studio respectively, and the behavior is the same.

The linker doesn't care about a library until the offending (declared but undefined) functions are actually referenced somewhere.
In this case, you aren't using any Box2D functions anywhere except in KinesisWorld, but you ARE using AngelScript somewhere else in your library. Double check your ASEngine facade, and in particular everything outside it (a project-wide "find" for AngelScript calls might help).
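As a rough sketch of the kind of thing to look for (the file and function names here are made up, not taken from your library): any translation unit outside the facade that touches AngelScript carries undefined references to it, and if that object file also defines something your executables do use, the AngelScript dependency comes along for the ride.

// Utils.cpp -- hypothetical file in the library, not part of ASEngine
#include <angelscript.h>   // dragged in by accident, e.g. via a shared utility header

void PrintScriptEngineVersion()   // something the rest of the library/executable does use
{
    // Any direct AngelScript call here makes this object file depend on the
    // AngelScript library, even for executables that never include ASEngine.h.
    asIScriptEngine* engine = asCreateScriptEngine(ANGELSCRIPT_VERSION);
    engine->Release();
}

Running nm -u (or dumpbin /symbols with MSVC) on the library's object files shows which of them carry undefined AngelScript symbols, which narrows the search quickly.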

Related

Why can some libraries built by older compilers link against modern code, and others cannot?

We have a lot of prebuilt libraries (via CMake mostly), built using Visual Studio 2017 v141. When we try to use these against a project using Visual Studio 2019 v142, we see errors like:
Error C1047 The object or library file
‘boost_chrono-vc141-mt-gd-x32-1_68.lib’ was created by a different
version of the compiler than other objects...
On the other hand, we also use pre-compiled .libs from 3rd-party vendors which are over a decade old and these have worked just fine when linked against our codebase.
What determines whether a library needs to be rebuilt, and why can some ancient libraries still be used when others that are only one version behind cannot?
ABI incompatibilities could cause some issues. Even though the C++ standard requires types such as std::vector and std::mutex to expose specific public/protected members, how these classes are laid out internally is left to the implementation.
In practice, this means nothing prevents the GNU standard library from laying out its data fields in a different order than the LLVM standard library, or from having completely different private members.
As such, if you call a function from a library built against LLVM's libc++ and pass it a GNU libstdc++ vector, you get undefined behavior. Even within the same standard library, different versions may have changed something, and that can be a problem too.
To avoid these issues, many popular C++ libraries expose only C data structures in their ABIs, since (at least for now) every compiler produces the same memory layout for a char*, an int, or a plain struct.
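A minimal sketch of what such a C-only boundary tends to look like (the names are invented for illustration): only opaque pointers and plain C types cross the ABI, so the caller never needs to know how the implementation lays anything out.

/* plot.h -- hypothetical public header with a C-compatible ABI */
#ifdef __cplusplus
extern "C" {
#endif

typedef struct plot plot;                 /* opaque handle: layout stays inside the library */

plot* plot_create(int width, int height); /* only C types in the exported signatures */
int   plot_add_point(plot* p, double x, double y);
void  plot_destroy(plot* p);

#ifdef __cplusplus
}
#endif

Internally the library is free to use std::vector and whatever else it likes; those types just never appear in the exported signatures.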
These ABI issues can appear in two places:
When you use dynamic libraries (.so and .dll files) your compiler probably won't say anything and you'll get undefined behavior when you call a function of the library using incompatible C++ objects.
When you use static libraries (.a and .lib files), I'm not really sure: my guess is that the linker could either report an error if it detects the problem, or successfully produce some Frankenstein's monster of a binary that behaves like the point above.
I will try to answer some integral parts, but be aware this answer could be incomplete. With more information from peers, we may be able to construct a full answer!
The simplest kind of linking is linking against a C library. Since C has no concept of classes or overloaded function names, compiler vendors can create entry points for functions using their plain names. This seems to be quasi-standardized in practice: I myself haven't encountered a pure C library that wasn't at least linkable to my projects. You can select this behaviour in C++ code by prepending a function declaration with extern "C" (this also makes it easy to link against the library from C# code); here is a detailed explanation about extern "C". As far as I am aware this behaviour is not formally standardized either; it is just so simple that, it seems, there is only one sane solution.
Going into C++, we start to encounter repeated function, variable and struct names. Let's just talk about overloaded functions here. Compiler vendors have to come up with some mapping between void a(); void a(int x); void a(char x); ... and their respective representations in the library. Since this process is not standardized either (see this thread), and it is far more complex than the 1-to-1 mapping of C, the ABIs of different compilers, or even different versions of the same compiler, can differ arbitrarily.
Now, given two compilers (or linkers; I couldn't find a resource which specifies which one exactly is responsible for the mangling, but since this process is not standardized it could also be outsourced to Cthulhu) with different name mangling schemes, they might create the following function entry points (simplified):
compiler1:
_a_
_a_int_
_a_char_

compiler2:
_a_NULL_
_a_++INT++_
_a_++CHAR++_
A linker handed the output of the other compiler will not understand it: linker1 will search for _a_int_ in a library that contains only _a_++INT++_. Since linkers can't use fuzzy string comparison (that could lead to an apocalypse, imho), it won't find your function in the library. Also, don't be fooled by the simplicity of this example: for every feature like namespaces, classes, methods etc., there has to be a scheme for mapping the name to an entry point or memory structure.
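For a concrete (if simplified) version of the toy example above, these are roughly the names the two mainstream schemes produce today; exact output can vary by compiler version and target:

void a();        // GCC/Clang (Itanium ABI): _Z1av     MSVC: ?a@@YAXXZ
void a(int x);   // GCC/Clang:               _Z1ai     MSVC: ?a@@YAXH@Z
void a(char x);  // GCC/Clang:               _Z1ac     MSVC: ?a@@YAXD@Z

You can inspect these yourself with nm (plus c++filt to demangle) on GCC/Clang object files, or with dumpbin /symbols and undname on MSVC ones.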
In your example you are lucky: you use libraries from the same publisher, which added logic to detect old libraries. Usually you will get something along the lines of <something> could not be resolved, or some other convoluted, irritating and/or unhelpful error message.
Some info and experience dump regarding Visual Studio and libraries in general:
In general the Visual C++ toolchain doesn't support cross-linking libraries between different versions, but you could be lucky and it works. Don't rely on it.
Since VC++ 2015 the ABI of the libraries is guaranteed by Microsoft to be compatible, as drescherjm commented: link to Microsoft documentation.
In general, when using libraries from different toolchains you should always be cautious about dependencies on other libraries and runtimes, as n. 1.8e9-where's-my-share m. commented here (here is your share, btw). Not having control over how libraries are built is, in general, a huge pain.
Edit, addressing memory layout incompatibilities in addition to Tzig's answer: different name mangling schemes seem to be partially intentional, to protect users from linking against incompatible libraries. This answer goes into detail about it. The relevant passage from the GCC docs:
G++ does not do name mangling in the same way as other C++ compilers. This means that object files compiled with one compiler cannot be used with another.
This effect is intentional [...].
Error C1047
This is caused by /GL (whole program optimization) or /LTCG (link-time code generation).
These switches store extra information in the .obj so the linker can perform whole-program optimizations. When it is present, VS checks which compiler generated the original .lib, and if it differs from the current one, it emits the error. These switches are meant for code built by a single compiler version and are not intended for cross-version use.
The other builds that work don't use these switches, so they are compatible.
Visual Studio has started to use a new #pragma detect_mismatch.
This lets an old build be identified as incompatible with a new build by detecting the version change.
Very old builds didn't have or support the pragma, so there was no checking.
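A minimal sketch of how the pragma is used (the key and value here are invented; the MSVC runtime headers use keys such as "RuntimeLibrary" and "_ITERATOR_DEBUG_LEVEL"):

// In a header shared by the library and all of its clients (MSVC only):
#pragma detect_mismatch("my_lib_abi_version", "2")
// Every .obj that includes this header records the key/value pair.
// If the linker later sees the same key with a different value (for example an
// old build that recorded "1"), it stops with error LNK2038 instead of
// silently mixing the incompatible objects.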
When you build a binary, its dependencies are loaded and satisfied by the linker, but that is not a guarantee it will work. The one-definition rule signs the developer up to a contract: within a compiled binary, all definitions of a function with the same name are the same. If they came from different compilers, that may not be true, and the linker is free to pick any one of them, causing latent bugs where mixtures of old and new code end up linked into the binary.
If the definition or implementation of std::string has changed, it may still link, but the resulting code is flawed.
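To make the latent-bug case concrete, here is a small, contrived sketch of an ODR violation of that kind (the macro is hypothetical; in real life it is often a compiler flag or library version that differs between builds):

// widget.h, shared by two libraries that were built with different settings
struct Widget {
#ifdef EXTRA_DEBUG_FIELDS   // defined in one build, not in the other
    int debug_counter;
#endif
    int value;
};

inline int value_of(const Widget& w) { return w.value; }
// The program now contains two different definitions of Widget and of
// value_of; the linker keeps whichever one it sees first, and the other
// library ends up reading `value` at the wrong offset at runtime.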
This new compiler check causes an early failure, which I thoroughly approve of.

Undefined references when statically linking C++ and C libraries

I have 3 libraries I am trying to link (more than that, really, but these are all that matter for this explanation). The "root" library is C++ and has the second library, also C++, as a dependency. The third library is C and is a dependency of the second library.
When linking this project I get undefined references to all of the third library's functions (the C one) that are called from the second library (the C++ one). The third library's headers are properly wrapped in extern "C", as they should be for this kind of use.
While troubleshooting this I found that the macro telling the library it is being built as static wasn't set properly. I fixed that, and found it changed another macro, placed in front of all C-style functions, from an export declaration to extern "C". To recap: after I fixed the first macro, C-style functions in a C++ library were being declared with extern "C" in front of them. Once I did this, all of the C-style functions the "root" library calls in the second library started producing undefined references.
I thought this was odd, as I've seen no other library do this, so I commented out the part of the line that defined the macro as extern "C", leaving it blank instead. When I did this, the undefined references to the C-style functions in the second library went away and the undefined references to the functions in the third library returned.
I have tried to research this myself, and pretty much every result says "put extern "C" in braces around it!", but that is already the case here. I also considered a link-order issue and verified that the library order in the linker command is correct. So I am at a loss as to what is causing this. It seems to be a name mangling thing, but I can't for the life of me work out how it is happening or how to fix it.
My question: What the hell is going on? What other avenues can I explore to try and resolve this?
I am on Windows XP 32-bit, compiling with MinGW. If you want to look at code...well that is a bit complicated since this is for a big project, but the root library is a game engine I am working on, the second library is cAudio, and the third library is OpenAL soft. Here is the root directory of the repo, here is the base directory for cAudio, and here is the base directory for OpenAL soft we are using.
I apologize for this being so long, thanks in advance to anyone that made it this far!
I managed to find the cause of the issue. The short version is that CMake was misconfigured.
The long version is that we add preprocessor definitions manually via CMake when building the libraries as static (which is the case here). There is one for cAudio and one for OpenAL. When OpenAL compiled, it correctly used the definitions provided in its portion of the CMake script, but those definitions were not shared with the other projects in the hierarchy. So when it came time for cAudio to compile, the export macro didn't resolve to blank; instead it resolved to dllimport. When cAudio went to link, it expected __imp_-prefixed symbols, which the compiled library didn't contain, causing the undefined references I was getting.
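The macro pattern involved usually looks something like the sketch below (the names are illustrative, not the exact cAudio/OpenAL macros); it shows how a missing "static" define drops the consumer into the dllimport branch:

// mylib_api.h -- typical export-macro pattern for a Windows library
#if defined(MYLIB_STATIC)
  #define MYLIB_API                          // static build: plain, undecorated symbols
#elif defined(MYLIB_BUILDING)
  #define MYLIB_API __declspec(dllexport)    // building the DLL itself
#else
  #define MYLIB_API __declspec(dllimport)    // consuming the DLL: calls go through __imp_ symbols
#endif

extern "C" MYLIB_API void mylib_do_something(void);

If MYLIB_STATIC isn't defined when the consumer is compiled, the header falls into the dllimport branch and the linker goes looking for __imp_-prefixed symbols that a static library simply doesn't contain, which is exactly the failure described above.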
I resolved this by adding the appropriate definition to the cAudio CMake script so that the import macros resolved correctly when cAudio was compiled against the OpenAL headers.

C++ Linker issues, is there a generalized way to troubleshoot these?

I know next to nothing about the linking process, and it almost always gets in the way when I am trying to start a new project or add a new library. Whenever I search for fixes to these types of errors, I find people with a similar problem but rarely any sort of fix.
Is there any generalized way of going about finding what the problem is, and fixing it?
I'm using Visual Studio 2010 and am statically linking my libraries into my program. My problems always seem to stem from conflicts with LIBCMT(D).lib, MSVCRT(D).lib, and a few other libraries doubly defining certain functions. If it matters at all, my intent is to avoid using "managed" C++.
If your error is related to LIBCMT(D).lib and the like, it usually stems from the fact that you are linking against a library that uses a different CRT version than yours. The only real fix is either to use the library compiled for the same CRT version you use (there are often separate "debug" and "release" builds partly for this reason), or (if you are desperate) to change the CRT version you use to match the library's.
What is happening behind the scenes is that both your program and your library need the CRT functions to work correctly, and each already links against one. If they link against the same version, nothing bad happens (the linker sees that it's the same and doesn't complain); otherwise there are multiple conflicting implementations of the same functions, so the linker doesn't know which ones are right for which object modules (and since the two CRTs are probably not binary compatible, their internal data structures will conflict as well).
The specific link errors you mentioned (with LIBCMT(D).lib, MSVCRT(D).lib libraries) are related to conflicts in code generation options between modules/libraries in your program.
When you compile a module, the compiler automatically inserts in the resulting .obj some references to the runtime libraries (LIBCMT&MSVCRT). Now, there is one version of these libraries for each code generation mode (I'm referring to the option at Configuration properties -> C/C++ -> Code Generation -> Runtime Library). So if you have two modules compiled with a different mode, each of them will reference a different version of the library, the linker will try to include both, and of course there'll be duplicated symbols, since essentially all the symbols are the same in these libraries, only their implementations differ.
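You can see this mechanism at work: the references the compiler embeds are "default library" directives stored in each .obj, and the same mechanism is exposed to user code. A rough sketch (the library names below are the usual ones for /MT and /MDd builds):

// Roughly what a /MT-compiled module records in its .obj:
#pragma comment(lib, "libcmt")     // static release CRT
// ...whereas a /MDd-compiled module records a request for msvcrtd instead.
// Mixing the two in one link is what produces the duplicate-symbol errors;
// dumpbin /directives some.obj shows the directives actually embedded.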
The solution comes in three parts. First, make sure all the modules in a project use the same mode. Second, if you have dependencies between projects, all of them have to use the same mode. Third, if you use third-party libraries, you have to either know which mode they use (and adopt it) or be able to recompile them with the desired mode.
The last one is the most difficult. Sometimes libraries come pre-compiled, and the provider doesn't always say which mode was used. Worse, if you're using more than one third-party library, they may have conflicting modes. In those cases, you have no better option than trial and error.
Also notice that each Visual Studio version has its own set of runtime libraries, so when using third-party libraries you have to use those compiled with the same version of Visual Studio you're using. If the provider doesn't offer it, your only choice is to recompile yourself.

Why can CImg achieve this kind of effect?

The compilation is done on the fly: only CImg functionalities really used by your program are compiled and appear in the compiled executable program. This leads to very compact code, without any unused stuffs.
Does anyone know the principle behind this?
CImg is a header-only library, and they use templates liberally, which is what they're referring to.
If they used a precompiled library of some kind (.dll/.lib/.a/.so) the library file would have to contain the entire CImg library, regardless of which bits of it you actually use.
In the case of a statically linked library (.lib or .a), the linker can then strip out unused symbols, but that may depend on optimization settings.
When the entire library is included in one or two headers, it is only actually compiled when you #include it, and so it is part of the same compilation process as the rest of your program, and the compiler can easily determine which parts of the library are used, and which ones aren't.
And because the CImg API consists of templates, no code is even generated for functions that are never instantiated, i.e. never called.
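A small sketch of why templates give this effect (the file and type names are invented):

// imaging.h -- hypothetical header-only library in the CImg style
template <typename Image>
Image blur(const Image& img)    { /* real work would go here */ return img; }

template <typename Image>
Image sharpen(const Image& img) { /* real work would go here */ return img; }

// main.cpp (includes imaging.h)
struct Bitmap { /* pixel data... */ };

int main()
{
    Bitmap b;
    blur(b);   // instantiates and compiles blur<Bitmap>;
               // sharpen is never named, so no code for it is ever generated
}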
They are overselling it a bit though, because as the other answers point out, unused symbols will usually be stripped out anyway.
Sounds like fairly standard behaviour to me - a C++ linker will usually throw away any unused library references rather than including uncallable code. Similarly, an optimized build will not include uncallable code.
This sounds like MSVC's /OPT:REF linker option (eliminate unreferenced functions and data); GCC has something similar in -ffunction-sections/-fdata-sections combined with the linker flag --gc-sections.

Building C++ source code as a library - where to start?

Over the months I've written some nice, generic-enough functionality that I want to build as a library and link against dynamically, rather than importing 50-odd header/source files into every project.
The project is maintained in Xcode and Dev-C++ (I understand that I might have to use the command line to do what I want) and has to link against OpenGL and SDL (dynamically in SDL's case). Target platforms are Windows and OS X.
What am I looking at here, overall?
What will be the entry point of my library, if it needs one?
What do I have to change in my code? (calling conventions?)
How do I release it? My understanding is that headers and the compiled library (.dll, .dylib(, .framework), whatever it'll be) need to be available for the project - especially as template functionality cannot be included in the library by nature.
What else do I need to be aware of?
I'd recommend building as a static library rather than a DLL. A lot of the issues of exporting C++ functions and classes go away if you do this, provided you only intend to link with code produced by the same compiler you built the library with.
Building a static library is very easy, as it is just a collection of .o/.obj files - a bit like a ZIP file but without compression. There is no need to export anything - just include the library in the list of files that your application links with. To access specific functions or classes, just include the relevant header file. Note that you can't get rid of header files - the C++ compilation model, particularly for templates, depends on them.
It can be problematic to export a C++ class library from a dynamic library, but it is possible.
You need to mark each function to be exported from the DLL (the syntax depends on the compiler). I'm poking around to see if I can find how to do this from Xcode. In VC it's __declspec(dllexport) and in CodeWarrior it's #pragma export on/#pragma export off.
This is perfectly reasonable if you are only using your binary in-house. However, one issue is that C++ methods are named differently by different compilers. This means that nobody who uses a different compiler will be able to use your DLL, unless you are only exporting C functions.
Also, you need to make sure the calling conventions match in the DLL and the DLL's client. This means either passing the same default calling convention flag to the compiler for both the DLL and the client, or better, explicitly setting the calling convention on each exported function in the DLL, so that it won't matter what the default is for the client.
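A minimal sketch of the second option (__declspec is the MSVC/MinGW spelling mentioned above; other compilers differ):

// Explicit calling convention on an exported function:
extern "C" __declspec(dllexport) int __cdecl mylib_init(void);
// With __cdecl spelled out per function, the export keeps the same convention
// regardless of whether the DLL or the client was built with /Gd (__cdecl) or
// /Gz (__stdcall) as the compiler's default.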
This article explains the naming issue:
http://en.wikipedia.org/wiki/Name_decoration
The C++ standard doesn't define a standard ABI, and that's bad news for people trying to build C++ libraries. This means that you get different behavior from your compiled code depending on which flags were used to compile it, and that can lead to mysterious bugs in code that compiles and links just fine.
This extends beyond just different calling conventions - C++ code can be compiled with or without support for RTTI and exception handling, and with various optimizations that can affect the memory layout of class instances, which C++ code relies on.
So, what can you do? I would build C++ libraries inside my source tree, and make sure that they're built as part of my project's build, and that all the libraries and the code that links to them use the same compiler flags.
Note that name mangling, which was supposed to at least prevent you from linking object files that were compiled with different compilers or compiler flags, only mostly works, and there are certain things you can do, especially with GCC, that will result in code that links just fine and fails at runtime.
You have to be extra careful with vendor-supplied dynamic C++ libraries (Qt on most Linux distributions, for example). I've seen instances of vendor-supplied libraries that were compiled in ways that prevented certain things from working properly. For example, some Red Hat Linux releases (maybe all of them) disabled exceptions in Qt, which made it impossible to catch exceptions in main() if the exceptions were thrown in a Qt callback. Fun.