Will linking a static cpp lib and a dynamic cpp lib, both containing different versions of boost, violate ODR?
I am working on an iPhone application. For the final executable, I need to link a static library, say libstatic1.a, and a dynamic framework, say libdynamic1.
libstatic1.a contains some version of boost, say boost 1.x, and libdynamic1 contains another version, say boost 1.y. Will the final executable, which links both of these, violate the ODR?
Symbol visibility in libdynamic1:
I inspected symbols present in libdynamic1 using nm -g -C libdynamic1 and observed that symbols of boost threadpool and boost filesystem are present in the list.
If I am violating ODR, what are my options to handle the situation?
(So far I have tested the executable on multiple devices and have not experienced any issue.)
The standard only talks about "programs", where a "program" is a set of translation units "linked together", each consisting of a sequence of declarations [basic.link]. Arguing about the ODR, which also concerns itself only with "programs", is not that straightforward when it comes to questions involving dynamic libraries. Since a "program" is required to contain a main function [basic.start.main]/1, a dynamic link library will generally not qualify as a "program" on its own.
Strictly speaking, I think a dynamic library would have to be viewed as just another set of translation units that are "linked" with the rest to form the final program. Thus, the program would really only be complete once all images have been loaded into memory and dynamic linking is finished (run-time dynamic linking would seem to further complicate the matter, but can be ignored here I guess). In this sense, the program described in your question (linking both the static and the dynamic library where each is using a different version of boost) will almost certainly be violating the ODR since you are going to have multiple translation units which are, e.g., using different definitions of the same entities [basic.def.odr]/12.
In practice, however, this issue is highly platform- and toolchain-dependent. At the ABI level, you typically find that the types of linkage a symbol can have are more differentiated than what you find at the language level within C++. On Windows, for example, you typically have to explicitly specify which symbols should be exported when building a dynamic library; all names are internal to the library by default. On ELF-based Linux, on the other hand, that is famously not the case. It would seem, however, that you can use the -fvisibility=hidden option to switch GCC to a more Windows-like default where your library will only export what you explicitly tell it to. Note that you must not have anything to do with boost in the interface your library exports, as that will obviously lead to undefined behavior in your case…
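To make the last point a bit more concrete, here is a minimal sketch of an exported interface that keeps Boost out of sight. The names DYN1_API, detail::extension_length and dyn1_extension_length are hypothetical, and the helper merely stands in for code that would use the library's private Boost copy. It assumes the library is built with -fvisibility=hidden on GCC/Clang (or as a DLL on Windows).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical export macro: with -fvisibility=hidden (GCC/Clang) or a DLL
// build (MSVC), only names marked DYN1_API are exported from the library.
#if defined(_WIN32)
#  define DYN1_API __declspec(dllexport)
#elif defined(__GNUC__)
#  define DYN1_API __attribute__((visibility("default")))
#else
#  define DYN1_API
#endif

namespace detail {
// Stand-in for implementation code that would call into the library's own
// Boost copy (e.g. boost::filesystem). It carries no export marker.
inline std::size_t extension_length(const char* path) {
    assert(path != nullptr);
    const char* dot = std::strrchr(path, '.');
    return dot ? std::strlen(dot + 1) : 0;
}
}  // namespace detail

// Public entry point: plain C types only, so no Boost type (or any other
// version-sensitive type) crosses the library boundary.
extern "C" DYN1_API std::size_t dyn1_extension_length(const char* path) {
    return path ? detail::extension_length(path) : 0;
}
```

A caller only ever sees `dyn1_extension_length("photo.jpeg")`; whichever Boost version sits behind it stays an implementation detail of the library.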
Let's say we have a program that is statically linked against MyLib1.0. Also there is a shared library which is linked against MyLib1.1.
Now what happens if the program loads this shared library? My assumption is that during runtime we will have multiple different definitions of the same symbols.
Do both of the following scenarios violate the ODR?
Shared lib exports all the symbols including the symbols of MyLib1.1
Shared lib hides every symbol of MyLib1.1
Does it matter at all if shared lib is dynamically linked or dynamically loaded?
There are some similar questions but I couldn't find a very clear answer so far.
From the point of view of the ISO C++ standard which defines the "one-definition-rule" there is no concept of static or dynamic linking. There are only translation units.
If your program contains translation units which define the same entity in two different ways (e.g. because the definition changed with version 1.1 of MyLib), then that is an ODR violation and the program has undefined behavior, regardless of how you are linking the program.
Everything beyond that is specific to how linking works on the given platform, although of course a lot of it is common behavior, e.g. shared libraries can override symbols and have symbols local to themselves.
Compilers have flags and attributes to specify this behavior, e.g. GCC has the -fvisibility flag and the __attribute__((visibility(/*...*/))) attribute. Specifying that a symbol is local to the shared library doesn't imply, though, that it will be compatible when another version is used in the program. For example, if the memory layout of a class changed between versions and an object is passed between two parts of the program using the different versions, they are likely to access memory in an incompatible manner, resulting in undefined behavior in the practical sense (rather than the standard's one).
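A small sketch of the layout problem, under loudly stated assumptions: Record and its members are invented for illustration, and the two versions live in separate namespaces only so that a single file stays well-formed. In the real scenario both would carry the same name, which is exactly the ODR violation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical class as shipped in version 1.0 of the library.
namespace mylib_1_0 {
struct Record {
    double value;
    int id;
};
}  // namespace mylib_1_0

// The "same" class in version 1.1, with one member added in front.
namespace mylib_1_1 {
struct Record {
    double scale;  // new in 1.1
    double value;
    int id;
};
}  // namespace mylib_1_1

static_assert(offsetof(mylib_1_0::Record, value) !=
                  offsetof(mylib_1_1::Record, value),
              "the two versions disagree on where `value` lives");

// Mimics code compiled against 1.1: it reads `value` at the 1.1 offset,
// whatever the bytes behind the pointer really are.
inline double read_value_as_1_1(const unsigned char* bytes, std::size_t size) {
    assert(bytes != nullptr && size >= sizeof(mylib_1_1::Record));
    double value;
    std::memcpy(&value, bytes + offsetof(mylib_1_1::Record, value), sizeof value);
    return value;
}
```

If the bytes of a 1.0 object with value 3.5 are handed to read_value_as_1_1, the function does not return 3.5; it reinterprets whatever 1.0 stored at that offset (here the id and padding). Hiding symbols does nothing to prevent this.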
Whether a certain combination of linking different library versions works depends on a lot of factors. So it should only be done if the library author states that this is supported. They need to explicitly take care that changes to the library are binary/ABI-compatible. This is already true when using headers of a different version, not only when actually linking it.
I have a shared library that I've created that references a lot of C++ template functions. These symbols get entered into the shared library's export table as weak references (e.g. they show as type W when I view the shared library's symbols using nm). This means that at runtime, these symbols can possibly be interposed by copies from a different shared library that got loaded first.
It's important for my application that my shared library use the copies of these functions that are contained within the library itself, not from any other library. Is there any way to ensure this? It sounds to me like it would be tantamount to statically linking all of the various template instantiations into the shared library.
This means that at runtime, these symbols can possibly be interposed
by copies from a different shared library that got loaded first.
Note that they can be interposed regardless of the weak attribute (see this GCC post, which says that the dynamic linker treats weak symbols like strong ones unless LD_DYNAMIC_WEAK is set, which it usually isn't).
It's important for my application that my shared library
use the copies of these functions that are contained
within the library itself, not from any other library.
Is there any way to ensure this?
There are several things you can do.
The usually recommended approach is to add -fvisibility=hidden to your CFLAGS to prevent exporting any symbols from your library and then mark the (hopefully very few) exported functions with __attribute__((visibility("default"))). This would also allow for better optimization at compile time and faster start-up, as rtld will need to process fewer symbols.
A poor man's solution would be to employ -fvisibility-inlines-hidden, which is a limited form of -fvisibility=hidden. It will only hide inline functions (e.g. resulting from STL templates).
In case you do not want to mess with source code, link with -Wl,-Bsymbolic - this would force references to be resolved within the library whenever possible.
-- EDIT --
Actually you'll need -Bsymbolic even if you enable -fvisibility=hidden to prevent other libraries (or executable itself) from dynamically interposing intra-library references to exported functions.
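For reference, a rough sketch of what the attribute-based marking can look like in source, assuming GCC or Clang. MYLIB_EXPORT, MYLIB_LOCAL, sum_of and mylib_sum are made-up names, and the explicit "hidden" attribute on the template is my own addition for illustration; with -fvisibility=hidden on the command line it would be redundant.

```cpp
#include <cassert>
#include <vector>

// Hypothetical macros wrapping the GCC/Clang visibility attribute.
#if defined(__GNUC__)
#  define MYLIB_EXPORT __attribute__((visibility("default")))
#  define MYLIB_LOCAL  __attribute__((visibility("hidden")))
#else
#  define MYLIB_EXPORT
#  define MYLIB_LOCAL
#endif

// Template helper marked hidden: its instantiations are not placed in the
// library's dynamic symbol table, so another library cannot interpose them.
template <typename T>
MYLIB_LOCAL T sum_of(const std::vector<T>& values) {
    T total = T();
    for (const T& v : values) total += v;
    return total;
}

// The one function that callers are meant to see.
extern "C" MYLIB_EXPORT int mylib_sum(const int* data, int count) {
    assert(count >= 0 && (data != nullptr || count == 0));
    return sum_of(std::vector<int>(data, data + count));
}
```

After building the shared library, nm -D -C on it should list mylib_sum but not sum_of<int>. Calls to mylib_sum from inside the library are still subject to interposition unless -Bsymbolic is used, as noted in the edit above.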
I commonly hear the term "to link against a library".
I'm new to compilers and thus linking, so I would like to understand this a bit more.
What does it mean to link against a library and when would not doing so cause a problem?
A library is an "archive" that contains already compiled code. Typically, you want to use a ready-made library to use some functionality that you don't want to implement on your own (e.g. decoding JPEGs, parsing XML, providing you GUI widgets, you name it).
Typically in C and C++ using a library goes like this: you #include some headers of the library that contain the function/class declarations - i.e. they tell the compiler that the symbols you need do exist somewhere, without actually providing their code. Whenever you use them, the compiler will place in the object file a placeholder, which says that that function call is to be resolved at link time, when the rest of the object modules will be available.
Then, at the moment of linking, you have to specify the actual library where the compiled code for the functions of the library is to be found; the linker then will link this compiled code with yours and produce the final executable (or, in the case of dynamic libraries, it will add the relevant information for the loader to perform the dynamic linking at runtime).
If you don't specify that the library is to be linked against, the linker will have unresolved references - i.e. it will see that some functions were declared, you used them in your code, but their implementation is nowhere to be found; this is the cause of the infamous "undefined reference errors".
Notice that all this process is identical to what normally happens when you compile a project that is made of multiple .cpp files: each .cpp is compiled independently (knowing of the functions defined in the others only via prototypes, typically written in .h files), and at the end everything is linked together to produce the final executable.
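The three roles described above, squeezed into one file as a sketch. The function names are hypothetical; in a real project the declaration would sit in the library's header, the caller in your own .cpp file, and the definition inside the compiled library.

```cpp
#include <cassert>

// 1. What the library header provides: a declaration only. It tells the
//    compiler that the function exists somewhere, without providing its code.
int library_decode(int input);

// 2. Your own code: it compiles with just the declaration. The object file
//    records the call as a reference to be resolved at link time.
int use_library(int x) {
    return library_decode(x) + 1;
}

// 3. What the library binary provides: the definition. Without it (i.e.
//    without linking against the library) the linker reports an
//    "undefined reference" to library_decode.
int library_decode(int input) {
    return input * 2;
}

// Tiny usage check: only works because the definition was linked in.
void check_linked_result() {
    assert(use_library(20) == 41);
}
```

Deleting part 3 leaves the file compilable as an object file, but producing an executable from it then fails at the link step, which is the situation the question asks about.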
I have a limited knowledge of dynamic libraries and I usually have problems related to libraries that I do not understand.
I recently learned of libraries from google search and especially from the following links:
Difference between shared objects (.so), static libraries (.a), and DLL's (.so)?.
http://www.ibm.com/developerworks/library/l-dynamic-libraries/. That article was very useful in understanding the dynamic libraries and their usage:
If I understood well (correct me if I am wrong), there are two possible usages of shared objects:
dynamic linking: the shared object is automatically loaded by the dynamic linker when the program starts.
dynamic loading: the shared object is loaded and used under the program's control at runtime through the dynamic loading API (dlopen, dlerror, dlsym and dlclose). That option is useful for plugins.
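A minimal sketch of the dynamic loading case, assuming a POSIX system. The helper name call_unary is made up, and the library and symbol names in the usage note ("libm.so.6", "cos") are Linux/glibc-specific assumptions; older glibc versions also need -ldl at link time.

```cpp
#include <dlfcn.h>  // POSIX dynamic loading API
#include <cassert>

// Loads `library` at runtime, looks up a function `symbol` with the
// signature double(double), calls it with `arg` and stores the result.
// Returns false if the library or the symbol cannot be found.
bool call_unary(const char* library, const char* symbol, double arg,
                double& result) {
    assert(library != nullptr && symbol != nullptr);

    void* handle = dlopen(library, RTLD_NOW | RTLD_LOCAL);
    if (handle == nullptr) {
        return false;  // dlerror() would describe the failure
    }

    bool found = false;
    void* address = dlsym(handle, symbol);
    if (address != nullptr) {
        // Converting the void* from dlsym to a function pointer is the
        // usual POSIX idiom.
        double (*fn)(double) = reinterpret_cast<double (*)(double)>(address);
        result = fn(arg);
        found = true;
    }

    dlclose(handle);
    return found;
}
```

For example, call_unary("libm.so.6", "cos", 0.0, r) would load the math library on a typical Linux system and leave cos(0.0) in r. Nothing about libm needs to be known when the program itself is linked, which is why this style suits plugins.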
If I got everything right, in the case of dynamic linking, all the symbols are verified at compilation time. This allows the compiler/linker to know exactly which shared object is effectively used by the program and which one is not used.
Now, it happens that the dynamic linker is always invoked at runtime, even if the shared object is not used. This can be verified by linking an empty program against libraries that are not in locations searchable at runtime: the execution will fail. Linking a program against a library that is not actually used in the program can happen when there are updates and the use of a library is no longer necessary. It also happens when one isolates a part of the program for debugging and links it against all the libraries of the main program.
My question is: is there an option to ask the compiler/linker to not include reference to shared objects that do not have symbols referred to in the program?
Is there any issue that prevents the compiler from doing that?
The following posts share some similarities with the present question, but none of them has an accepted answer, nor an answer that satisfies my curiosity:
https://stackoverflow.com/questions/22617744/how-to-disable-the-runtime-checking-of-shared-object-if-they-are-not-used
Delay-Load equivalent in unix based systems
If you happen to use g++/ld there are a few suggestions spelled out on How to remove unused C/C++ symbols with GCC and ld?
For example:
gcc -Os -fdata-sections -ffunction-sections test.cpp -o test.o -Wl,--gc-sections
With Apple's linker, the analogous options are -dead_strip and -dead_strip_dylibs.
However I'm actually not sure it's possible for the compiler to do this in the general case. Consider a dependent shared library that has a weak reference to the library that you want to remove from your link line: How would the compiler know that it's safe to remove the library and/or symbols at that point?
EDIT: I know about include guards, but include files are not the issue here. I'm talking about actual compiled and already linked code that gets baked into the static library.
I'm creating a general-purpose utility library for myself in C++.
One of the functions I'm creating, printFile, requires string, cout and other such members of the standard library.
I'm worried that when the library is compiled, and then linked to another project that also uses string and cout, the code for string and cout will be duplicated: it will both be prelinked in the library binary the program is being linked with, and it will be again linked with the project that uses them itself.
The library is structured like this:
There is one libname.hpp file the programmer who uses the library is supposed to #include in his projects.
For every function fname declared in libname.hpp, there is a file fname.cpp implementing it.
All fname.cpp files also #include "libname.hpp".
The library itself compiles into libname.a which is copied to /usr/lib/.
Will this even happen?
If yes, is it a problem at all?
If yes, then how can I avoid this?
I'm worried that when the library is compiled, and then linked to another project that also uses string and cout, the code for string and cout will be duplicated
Don't worry: no modern compilation system will do that. The code for template functions is emitted into object files, but the linker discards duplicate entries.
The library definitions of the standard C++ library won't show up in your own static library unless you explicitly include them there (i.e., you extract object files from the standard C++ library and include them into your library). Static libraries are not linked at all and will just have undefined references to other libraries. A static library is merely a collection of object files defining the symbols provided by the library. The definitions which come from the headers, e.g., inline functions and template instantiations, will be defined in such a way that multiple definitions in multiple translation units won't conflict. Where the code isn't actually inlined, it will define "weak" symbols which result in duplicates being ignored or removed at link time.
The only real concern is that the libraries linked into an executable need to use compatible library definitions. With a substantial amount of code residing in header files, there are relatively frequent changes to the C++ header files, including standard C++ library headers (relative to the C library headers, which contain a lot less code).
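A sketch of the kind of header-defined code this refers to. All names are invented, and a single file cannot show the link-time merge itself; the two "called_from" functions merely stand in for code that would live in different translation units, one inside the static library and one in the application.

```cpp
#include <cassert>

// As it would appear in a header included by many .cpp files. Every
// translation unit that uses these emits its own copy, typically as a weak
// symbol, and the linker keeps a single one in the final executable.
template <typename T>
int& instance_counter() {
    static int count = 0;  // one shared object per T in the whole program
    return count;
}

inline int next_id() {
    return ++instance_counter<int>();
}

// Stand-ins for code in two different translation units.
int called_from_library() { return next_id(); }
int called_from_application() { return next_id(); }

// Both callers observe the same counter, as they would after linking.
void check_shared_counter() {
    int first = called_from_library();
    int second = called_from_application();
    assert(second == first + 1);
}
```

On an ELF system, running nm -C on an object file compiled from such code typically shows these functions with type W, which is the "weak" marking mentioned above.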
Yes, the code for standard library things will be duplicated. It can be a problem if for example you return a std::string or take one as a parameter in one of your methods. It may have a different layout in your standard library implementation than in the user's.
This is rarely a problem in practice.
For static functions and inline templated functions defined in header files, there's nothing to worry about: every compilation unit gets its own copy (e.g. within the .a library there may already be many anonymous copies). This is okay because these definitions aren't exported, so the linker doesn't need to worry about them.
For functions that are declared with non-static linkage, whether you have an issue depends on how you link the .a library.
When you build the library, you typically will not link in the standard C++ library. The created library will contain undefined references to the standard C++ library. These must be resolved before building the final executable binary. This is normally done automatically when linking that final binary in the default way (depending on the compiler).
There are times when people do link in the standard C++ library into a static library. If you're linking against multiple static libraries that each embed another library (like the standard C++ library), then expect trouble if there are any differences in those embedded libraries. Fortunately, this is a rare problem, at least with the gcc toolchain. It's a more frequent problem with Microsoft's tools.
In some cases, a workaround is to make one or more conflicting static libraries into a dynamic library. This way each of these dynamic libraries can statically link its own copy of the problematic library. As long as the dynamic library doesn't export the symbols from the problematic library and there are no memory layout incompatibilities, there generally isn't any trouble.