I am trying to link a number of dynamic libraries into an application and running into problems with g++.
Consider:
libA.so
libB.so depends on libA.so
libC.so depends on libB.so
Application D depends directly on libC.so
If I try to link application D just to libC.so, I get unresolved symbols for the symbols in A and B. I feel as if the compiler should be able to figure it out, and when I use the intel compiler, it does. G++, however, can't figure out the linking. I would like my libraries and executables to only have to link to the things they directly need, not try to anticipate what the libraries they are using need.
I have also had problems when libA.so links to a static library, and when I try to compile the executable I get unresolved symbols from the static library that libA.so was supposed to be using.
I have seen a number of other people ask this and similar questions and get a variety of answers (Linking with dynamic library with dependencies), but the answers are all rather vague, often conflicting, and very much along the lines of "keep on trucking and RTFM".
I get the impression that link order matters. How so, and how do I know what order to link in?
Update
I believe that what is happening is something along the lines of libA.so contains two functions (AA and AB). libB.so needs AA and libC.so needs AB. When libB.so gets linked, g++ gets libA.so, sees that only AA is used, and drops AB. Then when libC.so is linked in, g++ sees that libA.so was already linked and doesn't revisit it, resulting in AB being undefined. I have seen documentation indicating that static libraries work this way, but would the compiler treat dynamic libraries the same way? If so, is there a way to work around it?
(You haven't shown the actual linker error, or provided nearly enough information about the problem, so what follows is partly guesswork...)
If I try to link application D just to libC.so, I get unresolved symbols for the symbols in A and B.
When linking an executable, the GNU linker checks that all symbols are available. You can turn that off with --allow-shlib-undefined (to tell GCC to pass that to the linker, use -Wl,--allow-shlib-undefined).
It is better not to use that option, but in that case the linker needs to know where to find libA.so and libB.so so it can check that the symbols needed by libC.so will be found. You can do that with the -rpath-link linker option, which the ld documentation describes as follows:
When using ELF or SunOS, one shared library may require another. This happens when an "ld -shared" link includes a shared library as one of the input files.
When the linker encounters such a dependency when doing a non-shared, non-relocatable link, it will automatically try to locate the required shared library and include it in the link, if it is not included explicitly.
So you should be able to fix the problem by using -Wl,-rpath-link,. to tell the linker to look in the current directory (.) for the libraries libC.so depends on.
I get the impression that link order matters. How so, and how do I know what order to link in?
Yes, link order matters. You should link in the obvious order ;-) If a file foo.cc depends on a library then put the library later in the linker line, so it will be found after processing foo.cc, and if that library depends on another library put that even later, so it will be processed after the earlier library that needs it. If you put a library at the start of the link line then the linker doesn't have any unresolved symbols to look up, so doesn't need to link to that library.
You need to explicitly specify all libraries that you directly use.
During static linking, the dependencies of the loaded .so are not used; when linking the main program, all symbols have to be found in either the main program itself, in a static library specified on the command line, or in a shared library specified on the command line.
This is where you get an error.
When the program is executed, the dependencies of dynamic libraries are loaded so that references from within other shared libraries can be resolved.
By the time the program runs, it might actually be linked (dynamically) against a different version of the shared library. This different version might have different dependencies, so the main program MUST NOT rely on the set of additional libraries that get loaded as dependencies.
This is why the static linker stops you early.
Related
I have libraries that link stdc++ dynamically. I want to create new shared library with new files, link them and link libstdc++ statically.
I tried to add -static-libstdc++ to the compilation but it doesn't work. I checked with ldd and my library is still dynamically linked.
How can I do it?
I have libraries that link stdc++ dynamically. I want to create new shared library with new files, link them and link libstdc++ statically.
That is a really bad idea(TM). When your binary executes on a system with a different version of libstdc++.so.6, you will have symbol collisions (unless you are extremely careful about hiding all relevant symbols inside your shared library), which will likely lead to very hard to debug crashes or other undefined behavior.
I tried to add -static-libstdc++ to compilation but it doesn't work. I checked with ldd and my library is still dynamically linked.
First, adding -static-libstdc++ to the compilation step does nothing. You need to add it to the link step.
Second, it's unclear what you ran ldd on, and whether your library depends on other shared libraries. If it does, ldd will show transitive dependency on libstdc++, which is entirely expected.
To see whether your library directly depends on libstdc++.so.6, do this:
readelf -d yourlib.so | grep 'NEEDED.*libstdc'
Consider the following code structure:
main.cpp -> depends on libone.a -> depends on libtwo.a
Assume that in main.cpp only functions from libone.a are used. So realistically the programmer writing main.cpp really only cares about libone.a. At this point they don't even know libone.a has a dependency on libtwo.a.
They attempt to compile their code as follows and get linker errors:
g++ -o main main.cpp -lone
-- Error! Undefined symbols!
This becomes an issue because since libone.a depends on libtwo.a, anyone who uses libone.a must know about this dependency... As you can imagine this problem can occur with FAR more dependencies than a single library and can quickly become a linking nightmare.
Attempt 1 at solving this issue:
A first thought to solve this issue was "It's simple, I'll just link libone.a with libtwo.a when I compile libone.a!"
It turns out it isn't as simple as I had hoped... When compiling libone.a there is no way to link libtwo.a. Static libraries don't link to anything when you compile them, instead all of the dependencies must be linked when the libraries are compiled into an executable.
For example, to compile main.cpp that depends on a static library that in turn depends on another static library, you must link both libraries. ALWAYS.
g++ -o main main.cpp -lone -ltwo
Attempt 2 at solving this issue:
Another thought was to try and compile libone as a dynamic library that links to libtwo.a.
Oddly enough this just worked! After compiling and linking libone.so the main program only needs to care about libone.so and doesn't need to know about libtwo.a anymore.
g++ -o main main.cpp -lone
Success!
After going through this exercise one piece is still missing. I just can't seem to figure out any reason why static libraries can't link in other libraries, but dynamic ones can. As a matter of fact, the dynamic library, libone.so would not compile at all until I linked libtwo.a. That's fine though, because as the author of libone.so I would know about its dependency on libtwo.a - The author of main.cpp, however would not know. And realistically they should not have to know.
So down to the real question... Why can dynamic libraries link to other libraries like this while static ones cannot? This seems to be an obvious advantage dynamic libraries have over static ones, but I've never seen it mentioned anywhere!
A static library is just an archive of object files; there is no concept of dependency because it was never linked.
Shared libraries are linked, which resolves their symbols, and so they can have dependencies.
Since your question refers to gcc and .so/.a files, I’ll assume you’re using some flavor of Unix that uses ELF files for object code.
After going through this exercise one piece is still missing. I just
can't seem to figure out any reason why static libraries can't link in
other libraries, but dynamic ones can.
Static libraries are not linked, as was mentioned in another answer. They are just an archive of compiled object files. Shared libraries are in fact linked, which means the linker actually resolves all the symbols reachable by any exported symbol. Think of exported symbols as the library’s API. A fully linked shared library contains either the definition of each symbol, or the dependency information necessary to tell the OS (specifically the dynamic loader) what other shared libraries are needed to have access to the symbol. The linker assembles all that into a special file format called an ELF shared object (dynamic library).
As a matter of fact, the dynamic library, libone.so would not compile
at all until I linked libtwo.a. That's fine though, because as the
author of libone.so I would know about its dependency on libtwo.a -
The author of main.cpp, however would not know. And realistically they
should not have to know.
libone.so probably compiles fine, but won’t link without libtwo due to unresolved symbols. Because the linker must resolve all reachable symbols when linking a shared library, it will fail if it can’t find any. Since libone.so uses symbols in libtwo, the linker needs to know about libtwo.a to find them. When you link a static library into a shared library, the symbols are resolved by copying the definitions directly into the output shared object file, so at this point, users of libone.so can be none the wiser about its usage of libtwo since its symbols are just in libone.so.
The other option is to link shared libraries into other shared libraries. If you are linking libtwo.so into libone.so (note the .so suffix), then the linker resolves the symbols needed by libone by adding a special section to the output shared object file that says it needs libtwo.so at runtime. Later, when the OS loads libone.so, it knows it also needs to load libtwo.so. And, if your application only uses libone directly, that’s all you need to tell the linker at build time, since it’ll link in libone, see that it needs libtwo, and recursively resolve until everything is good.
Now, all that loading at runtime the OS has to do incurs a performance cost, and there are some gotchas with global static variables that exist in multiple shared objects if you aren’t careful. There are some other potential performance benefits for linking statically that I won’t go into here, but suffice it to say that using dynamic libraries isn’t quite as performant on average, but that difference is also negligible for most real world situations.
I have a C++ file a.cpp with the library dependency in the path /home/name/lib and the name of the library abc.so.
I do the compilation as follows:
g++ a.cpp -L/home/name/lib -labc
This compiles the program with no errors.
However while running the program, I get the ERROR:
./a.out: error while loading shared libraries: libabc.so.0: cannot open shared object file: No such file or directory
However if before running the program, I add the library path as
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/name/lib;
and compile and run now, it works fine.
Why am I not able to link the library by giving it from the g++ command?
Because shared object libraries are linked at runtime - you either need to add the location to the library search path (as you've done), place it somewhere within an already existing path (like /usr/lib), or add a symbolic link to an existing library path that links to the location where it exists.
If you want to compile the library in so there is no runtime dependency, you'll need a static library (which would be abc.a) - note that this has numerous downsides, however (if the library is updated, you'll need to recompile your executable to incorporate that update).
Why am I not able to link the library by giving it from the g++ command?
You are able to link, and you did link the library successfully; otherwise you would not have been able to build the executable (a.out in your case). The problem is that you mixed up two different things: linking with shared libraries and loading them at runtime. Loading shared libraries is a fairly complex topic and is described well in the Program-Library-HOWTO; read from section 3.2.
You are linking dynamically, which is the default behavior with GCC. LD_LIBRARY_PATH specifies directories in which to look for libraries at run time (it is a way to force the use of a specific library); read the Program-Library-HOWTO for more info. There is also an ld option, -rpath, that specifies the library search path for the binary being built (this information is written into the binary and used only for that binary, whereas LD_LIBRARY_PATH also affects other applications using the same library, which may expect a newer or older version).
Linking statically is possible (but a little tricky), and then no runtime dependency is required. It is sometimes not recommended, though, because it prevents updating the dependent libraries (for security fixes, for example): with static linking you always use the library versions you had when you built the binary.
How does linker know which symbols should be resolved at runtime? Particularly I'm interested what information shared object files carry that instruct linker to resolve symbols at runtime. How does the dynamic symbol resolution work at runtime, i.e. what executable will do to find the symbol and in case multiple symbols with the same name were defined which would be found?
What happens if the file was linked only statically, but then it's linked dynamically at run-time as part of a shared library? Which symbol will be used by the executable? In other words, is that possible to override symbols in an executable by linking those symbols into a shared library?
The platform in question is SUN OS.
Try the below link. I hope it answers your question
http://www.linuxjournal.com/article/6463
Check out this article from Linux Journal. For more information -- perhaps specifically related to Windows, AIX, OSx, etc -- I would recommend the Wikipedia article on Linker (computing) and the references therein.
If a file is statically linked there is no run time resolution to speak of. If a shared object links to that same library, either dynamically or statically, the version linked into the shared object will only affect code executed in that shared object. This can cause problems if you link to two different, incompatible versions of the same library and pass data back and forth between them.
I have a C++ executable and I'm dynamically linking against several libraries (Boost, Xerces-c and custom libs).
I understand why I would require the .lib/.a files if I choose to statically link against these libraries (relevant SO question here). However, why do I need to provide the corresponding .lib/.so library files when linking my executable if I'm dynamically linking against these external libraries?
The compiler isn't aware of dynamic linking; it just knows that a function exists via its prototype. The linker needs the lib files to resolve the symbol. The lib for a DLL contains additional information, such as which DLL the functions live in and how they are exported (by name, by ordinal, etc.). The lib files for DLLs contain much less information than lib files that contain the full object code: libcmt.lib on my system is 19.2 MB, but msvcrt.lib is "only" 2.6 MB.
Note that this compile/link model is nearly 40 years old at this point, and predates dynamic linking on most platforms. If it were designed today, dynamic linking would be a first class citizen (for instance, in .NET, each assembly has rich metadata describing exactly what it exports, so you don't need separate headers and libs.)
Raymond Chen wrote a couple blog entries about this specific to Windows. Start with The classical model for linking and then follow-up with Why do we have import libraries anyway?.
To summarize, history has defined the compiler as the component that knows about detailed type information, whereas the linker only knows about symbol names. So the linker ends up creating the .DLL without type information, and therefore programs that want to link with it need some sort of metadata to tell it about how the functions are exported and what parameter types they take and return.
The reason .DLLs don't have all the information you need to link with them directly is historic, and not a technical limitation.
For one thing, the linker inserts the versions of the libraries that exist at link time so that you have some chance of your program working if library versions are updated. Multiple versions of shared libraries can exist on a system.
The linker has the job of validating that all your undefined symbols are accounted for, either with static content or dynamic content.
By default, then, it insists on all your symbols being present.
However, that's just the default. See -z, and --allow-shlib-undefined, and friends.
Perhaps this dynamic linking is done via import libraries (the function has __declspec(dllimport) before its declaration).
If so, the compiler expects an __imp_symbol entry to exist, and that entry is responsible for forwarding the call to the right dynamically loaded library.
Those entries are generated when symbols marked with __declspec(dllimport) are linked.
Here is a very SIMPLIFIED description that may help. Static linking puts all of the code needed to run your program into the executable so everything is found. Dynamic linking means some of the required code does not get put into the executable and will be found at runtime. Where do I find it? Is function x() there? How do I make a call to function x()? That is what the library tells the linker when you are dynamically linking.