symbol resolutions when creating (and linking) libraries


Suppose a.cc defines a function f_a() that uses a function f_b() defined in b.cc. From a.cc and b.cc I create a dynamic library libdynamic.so.
Suppose the file main.cc uses f_a, I'd compile it as follows:
g++ -o main main.cc -ldynamic
How does the dynamic linker bring the definition of f_a (and subsequently f_b) into the executable? Is the definition of f_a in libdynamic.so already resolved with f_b? Or will the dynamic linker also resolve this (internal) dependency at runtime?

Since you're using a shared library (*.so), the definition is not brought into the executable. It remains in the library itself and is resolved at run time, which is why if you remove the shared library the program will not function correctly.
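On the other hand, references between symbols inside the library (in your example, f_a's call to f_b) are dealt with when the library is built: both definitions end up in libdynamic.so, so no other library is needed to satisfy them. (By default the linker does not insist that every symbol in a shared library be defined, and a call to an exported function such as f_b may still be bound by the dynamic loader at run time, but the loader will normally find it in libdynamic.so itself.) The build steps show this: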
g++ -fPIC -c a.cc
g++ -fPIC -c b.cc
g++ -shared -Wl,-soname,libdynamic.so -o libdynamic.so a.o b.o
In the last stage, g++ calls the linker (ld) to link a.o and b.o. In fact, you could (probably) call the linker directly instead:
ld -shared -soname=libdynamic.so -o libdynamic.so a.o b.o
If you're still curious about the whole process and all its gory details, here is a useful reference article: Linkers and Loaders, by Sandeep Grover.

Basically, dynamic libraries are linked with the executable at run time (that is, when you run ./main). The dynamic linker, not the compiler, takes care of resolving the dependency at run time. To check whether a symbol is resolved or not, use the nm command. By default, nm prints the following for each symbol:
Virtual address of the symbol
A character which depicts the symbol type. If the character is in lower case then the symbol is local but if the character is in upper case then the symbol is external
Name of the symbol
For more information, see man nm.
After compiling your program, just run nm on the executable (in your case, nm main).
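To make the nm letters a bit more concrete, here is a minimal sketch of a source file that should produce several of them. It is a hypothetical stand-in for a.cc and b.cc merged into one file: only the names f_a and f_b come from the question, while calls, scale and twice are made up for illustration. The letters in the comments are what GCC or Clang on Linux typically emit for an unoptimised build, so treat them as an expectation rather than a guarantee.

```cpp
// symbols.cc: hypothetical stand-in for a.cc and b.cc from the question,
// merged into one file. Only the names f_a and f_b come from the question.
#include <cassert>
#include <cstdio>

extern "C" {  // C linkage so nm shows f_a and f_b unmangled

int calls;      // uninitialised global: typically 'B' (bss)
int scale = 2;  // initialised global: typically 'D' (data)

// Internal linkage, so a lowercase letter: typically 't'.
// An optimising build may inline it away entirely.
static int twice(int x) { return scale * x; }

// Defined here with external linkage: typically 'T'.
int f_b(int x) {
    ++calls;
    return twice(x);
}

// Also 'T'. printf is not defined in this file, so the object file
// lists it as 'U' (undefined) until something else provides it.
int f_a(int x) {
    const int r = f_b(x) + 1;
    assert(r == scale * x + 1);  // sanity check on the arithmetic
    std::printf("f_a(%d) = %d\n", x, r);
    return r;
}

}  // extern "C"
```

Compiling this with g++ -c symbols.cc and running nm symbols.o should list f_a and f_b with an uppercase T and printf with U. The same letters apply when you run nm on libdynamic.so or on main.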

Related

Is it possible to link Linux static C++ library at runtime?

I know that the question is strange because we all know that a static .a library can be linked only at compile time.
I have confidential code that I cannot share, but my question is: what could let code compile and link against a static library successfully, yet complain at runtime about a missing symbol that is present in the .a library it was linked with in the first place?
What I can share is a little:
add_library(${NAME} STATIC ${NAME_SOURCES})
then this library is added to a global variable called LIBS that has all libraries needed to link to final binary.
I found the static library and I did an objdump on it and found the missing symbol.
So it compiled the static lib, then it built the final binary using that library. Why does it complain about not finding the symbol at runtime?

There are many ways to cause this behaviour, here's one.
Suppose you have a shared library libA.so and a static library libB.a. Both export the same symbol foo.
// foo.c
void foo() {}
gcc -fPIC -shared -o libA.so foo.c
gcc -c foo.c && ar r libB.a foo.o
You link your program against both libraries:
// main.c
extern void foo();
int main() { foo(); }
gcc -o myexe main.c -L. -lA -lB -Wl,-rpath=. # dangerous, do not really do this
foo is resolved against libA.so.
Now suppose that at run time you have a different version of libA.so, one that does not export foo. Perhaps you have several versions of the library lying around, or maybe there is a software update.
// foo.c v2.0
void nofoo() {}
gcc -fPIC -shared -o libA.so foo.c
When you try to run myexe, the system will complain about
./myexe: symbol lookup error: ./myexe: undefined symbol: foo
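myexe is linked against libB.a, which defines foo, but as you can see, this doesn't help one little bit. At link time foo was already resolved against libA.so, so the linker never pulled foo.o out of libB.a into the executable.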

Same object file in different static libraries when linking

clang++ ... foo.cpp ... -o dir1/foo.o
clang++ ... foo.cpp ... -o dir2/foo.o
//The only difference between the above two clang++ command lines
//is the output directory
llvm-ar ... dir1/lib1.a ... dir1/foo.o ...
llvm-ar ... dir2/lib2.a ... dir2/foo.o ...
clang++ ... dir1/lib1.a dir2/lib2.a ... -o lib.so
What happens to the duplicated symbols from foo.cpp when generating lib.so? Is any flag required to avoid duplicate symbol errors?

Linking multiple static libraries, when the same object file occurs in more than one of the provided libraries, will not result in any duplicate symbol errors (by default).
This is because the linker does not "combine the static libraries" into a final executable. It only combines the provided object files into the executable. The linker processes the list of object files and archive libraries left-to-right. When a static library is encountered, the linker checks whether any of the object files in the library defines a currently undefined symbol. Then, and only then, will it pull in that object file.
In your example:
clang++ ... dir1/lib1.a dir2/lib2.a ... -o lib.so
consider two additional object files:
clang++ obj1.o dir1/lib1.a dir2/lib2.a obj2.o -o lib.so
If obj1.o references a symbol that exists in foo.cpp:
The linker will process and add obj1.o to the lib.so, noting that said symbol is undefined.
The linker will open dir1/lib1.a and check if any object files contained in the archive define said symbol. Because foo.o defines the symbol, foo.o will be added to lib.so and the symbol will be marked defined.
The linker will open dir2/lib2.a. But there are no currently undefined symbols so the duplicate object file will be ignored.
The linker will process and add obj2.o to the lib.so. The linker does not go back and re-process lib1.a or lib2.a.
Therefore no duplicate symbol error should be raised (by default, on Linux). To change this behaviour, you can use the linker option --whole-archive
clang++ ... -Wl,--whole-archive dir1/lib1.a dir2/lib2.a -Wl,--no-whole-archive ... -o lib.so
With --whole-archive all object files from the specified archive libraries will be added to the output. The above command then results in a "multiple definition" error for any symbols in foo.cpp.
This answer describes the behaviour on Linux. I believe AIX is different and always adds all encountered object files (from static libraries) to the output.
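The left-to-right rule described above can be sketched as a toy model. This is only an illustration of the bookkeeping, not how any real linker is implemented: the types Object, Input and LinkResult and the function simulate_link are invented for this sketch, and it ignores weak symbols, shared libraries and everything else a real linker handles. The whole_archive flag here applies to every archive on the command line, which is a simplification of --whole-archive.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy model only: none of these names come from a real linker.
struct Object {
    std::string name;
    std::set<std::string> defines;  // symbols this object defines
    std::set<std::string> needs;    // symbols this object references
};

// One command-line input: a lone object file or a static archive.
struct Input {
    bool is_archive;
    std::vector<Object> members;  // exactly one member when !is_archive
};

struct LinkResult {
    std::vector<std::string> linked;   // objects added to the output, in order
    std::set<std::string> undefined;   // still unresolved at the end
    std::set<std::string> duplicates;  // defined by more than one linked object
};

LinkResult simulate_link(const std::vector<Input>& inputs,
                         bool whole_archive = false) {
    LinkResult r;
    std::set<std::string> defined;

    // Add one object to the output and update the symbol bookkeeping.
    auto add = [&](const Object& o) {
        r.linked.push_back(o.name);
        for (const std::string& s : o.defines) {
            if (!defined.insert(s).second) r.duplicates.insert(s);
            r.undefined.erase(s);
        }
        for (const std::string& s : o.needs)
            if (defined.count(s) == 0) r.undefined.insert(s);
    };

    for (const Input& in : inputs) {
        if (!in.is_archive || whole_archive) {
            for (const Object& o : in.members) add(o);  // taken unconditionally
            continue;
        }
        // Archive: pull a member only if it defines a currently undefined
        // symbol; repeat because a pulled member may need its siblings.
        std::vector<bool> taken(in.members.size(), false);
        for (bool progress = true; progress;) {
            progress = false;
            for (std::size_t i = 0; i < in.members.size(); ++i) {
                if (taken[i]) continue;
                for (const std::string& s : in.members[i].defines) {
                    if (r.undefined.count(s) != 0) {
                        add(in.members[i]);
                        taken[i] = true;
                        progress = true;
                        break;
                    }
                }
            }
        }
    }
    return r;
}

// The command line from the answer: obj1.o lib1.a lib2.a obj2.o
void demo_duplicate_members() {
    const Object obj1{"obj1.o", {}, {"foo"}};
    const Object foo1{"dir1/foo.o", {"foo"}, {}};
    const Object foo2{"dir2/foo.o", {"foo"}, {}};
    const Object obj2{"obj2.o", {}, {}};
    const std::vector<Input> cmd{
        {false, {obj1}}, {true, {foo1}}, {true, {foo2}}, {false, {obj2}}};

    const LinkResult normal = simulate_link(cmd);
    assert(normal.linked.size() == 3);  // dir2/foo.o was never pulled in
    assert(normal.duplicates.empty());

    const LinkResult whole = simulate_link(cmd, true);
    assert(whole.linked.size() == 4);  // both copies of foo.o are linked
    assert(whole.duplicates.count("foo") == 1);
}
```

Calling demo_duplicate_members() from a main function walks through both cases: the default scan skips the second copy of foo.o, while the whole-archive variant links both and records foo as a duplicate.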

Using a shared library in another shared library

I am creating a shared library from a class from an example I got here C++ Dynamic Shared Library on Linux. I would like to call another shared library from the shared library created and then use it in the main program. So I have the myclass.so library and I want to call another library say anotherclass.so from the myclass.so library and then use this myclass.so library in the main program. Any idea on how I can do this please.

There is more than one way in which multiple shared libraries may be added to
the linkage of a program, if you are building all the libraries, and the program,
yourself.
The elementary way is simply to explicitly add all of the libraries to the
linkage of the program, and this is the usual way if you are building only the
program and linking libraries built by some other party.
If an object file foo.o in your linkage depends on a library libA.so, then
foo.o should precede libA.so in the linkage sequence. Likewise if libA.so
depends on libB.so then libA.so should precede libB.so. Here's an illustration.
We'll make a shared library libsquare.so from the files:
square.h
#ifndef SQUARE_H
#define SQUARE_H
double square(double d);
#endif
and
square.cpp
#include <square.h>
#include <cmath>
double square(double d)
{
return pow(d,2);
}
Notice that the function square calls pow, which is declared in the
Standard header <cmath> and defined in the math library, libm.
Compile the source file square.cpp to a position-independent object file
square.o:
$ g++ -Wall -fPIC -I. -c square.cpp
Then link square.o into a shared library libsquare.so:
$ g++ -shared -o libsquare.so square.o
Next we'll make another shared library libcube.so from these files:
cube.h
#ifndef CUBE_H
#define CUBE_H
double cube(double d);
#endif
and
cube.cpp
#include <cube.h>
#include <square.h>
double cube(double d)
{
return square(d) * d;
}
See that the function cube calls square, so libcube.so is going to
depend on libsquare.so. Build the library as before:
$ g++ -Wall -fPIC -I. -c cube.cpp
$ g++ -shared -o libcube.so cube.o
We haven't bothered to link libsquare with libcube, even though libcube
depends on libsquare, and even though we could have, since we're building libcube.
For that matter, we didn't bother to link libm with libsquare. By default the
linker will let us link a shared library containing undefined references, and it
is perfectly normal. It won't let us link a program with undefined references.
Finally let's make a program, using these libraries, from this file:
main.cpp
#include <cube.h>
#include <iostream>
int main()
{
std::cout << cube(3) << std::endl;
return 0;
}
First, compile that source file to main.o:
$ g++ -Wall -I. -c main.cpp
Then link main.o with all three required libraries, making sure to list
the linker inputs in dependency order: main.o, libcube.so, libsquare.so, libm.so:
$ g++ -o prog main.o -L. -lcube -lsquare -lm
libm is a system library so there's no need to tell the linker where to look for
it. But libcube and libsquare aren't, so we need to tell the linker to look for
them in the current directory (.), because that's where they are. -L. does that.
We've successfully linked ./prog, but:
$ ./prog
./prog: error while loading shared libraries: libcube.so: cannot open shared object file: No such file or directory
It doesn't run. That's because the runtime loader doesn't know where to find libcube.so (or libsquare.so, though it didn't get that far).
Normally, when we build shared libraries we then install them in one of the loader's default
search directories (the same ones as the linker's default search directories), where they're available to any program, so this wouldn't happen. But I'm not
going to install these toy libraries on my system, so as a workaround I'll prompt the loader where to look
for them by setting the LD_LIBRARY_PATH in my shell.
$ export LD_LIBRARY_PATH=.
$ ./prog
27
Good. 3 cubed = 27.
Another and better way to link a program with shared libraries that aren't located
in standard system library directories is to link the program using the linker's
-rpath=DIR option. This will write some information into the executable to tell
the loader that it should search for required shared libraries in DIR before it tries
the default places.
Let's relink ./prog that way (first deleting the LD_LIBRARY_PATH from the shell so that it's not effective any more):
$ unset LD_LIBRARY_PATH
$ g++ -o prog main.o -L. -lcube -lsquare -lm -Wl,-rpath=.
And rerun:
$ ./prog
27
To use -rpath with g++, prefix it with -Wl, because it's an option for linker, ld,
that the g++ frontend doesn't recognise: -Wl tells g++ just to pass the
option straight through to ld.

I would like to add some points to the response of @Mike.
As you do not link the libcube library with libsquare, you are creating a sort of "incomplete library". By incomplete, I mean that when you link your application you must link it with both libcube and libsquare, even though it does not use any symbol directly from libsquare.
It is better to link libcube directly with libsquare. This link will create the library with a NEEDED entry like:
readelf -d libcube.so
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [libsquare.so]
Then when you link your application you can do:
g++ -o prog main.o -L. -lcube
However, this will not link yet, because the linker tries to locate the NEEDED library libsquare. You must specify its path by adding -Wl,-rpath-link=. to the link command:
g++ -o prog main.o -L. -lcube -Wl,-rpath-link=.
Note: At runtime, you must still set LD_LIBRARY_PATH or link with rpath as mentioned by @Mike.

If your library uses another shared library, then users of your library also depend on that other library. When creating your library you can pass -l so that the linker records the dependency on the other shared library, and it will be loaded when required.
But when you deliver your library, because it depends on another library, you need to ship that library along with yours and provide an environment variable or linker flag so that it is loaded from the specified path (your exported package). Otherwise, if a name clashes with a standard library function, the user might pick up the definition from some other library on their system, with disastrous results.

Simply use the library like you'd use it in any other application. You don't have to link to anotherclass.so, just to myclass.so.
However, you will have to make both libraries (myclass.so and anotherclass.so) available for your later application's runtime. If one of them is missing you'll get runtime errors just like it is with any other application.

Creating a Minimal Shared Library

For background, I'm creating some C++ software that uses dynamically loaded shared library plugins for hardware output (the specifics of it aren't relevant here).
I'm building the executable by compiling everything into object files and then linking the ones needed, which is simple using an exclusion list. I can then build the shared library by specifying its primary object file (the one that's dynamically loaded and accessed at runtime) along with every other object file referenced by the primary one.
My question is this: Is there a way to provide the linker with the primary object file, and create a shared library containing only the objects it depends upon? All of the object files are in the same directory, I'm not using a Makefile (yet; if one could solve the problem, it's a valid answer), and compilation speed isn't an issue.
I've looked into the linker options --as-needed, --gc-sections, and --no-undefined, but I haven't been able to piece together a working build process.
Example: For source files main.cpp, a.cpp, b.cpp, a.h, and b.h, where main.cpp and a.cpp both include b.h:
gcc -fPIC -c *.cpp -I. builds object files main.o, a.o, and b.o.
gcc -o main.out *.o builds the final executable main.out from the object files... including a.o, which is unused. (--gc-sections should fix this.)
gcc -fPIC -shared -o a.so a.o -Wl,--as-needed !(a).o builds the final shared library a.so from all of the object files... including main.o, which is unused. How do I prevent main.o from being included in a.so?

Is there a way to provide the linker with the primary object file, and create a shared library containing only the objects it depends upon?
Yes: package all objects into an archive library liball.a, then link like this:
gcc -shared -o a.so a.o liball.a
The linker will then pull out from liball.a all objects that a.o depends on, and only these objects, as explained here.
Note: liball.a may contain a.o, there is no harm (as above link explains).
Update:
Is there a way to do it without needing to create an archive first?
I don't know of any portable way to do that. The Gold linker has --start-lib and --end-lib command line flags that achieve exactly that.

g++: In what order should static and dynamic libraries be linked?

Let's say we got a main executable called "my_app" and it uses several other libraries: 3 libraries are linked statically, and other 3 are linked dynamically.
In which order should they be linked against "my_app"?
But in which order should these be linked?
Let's say we got libSA (as in Static A) which depends on libSB, and libSC which depends on libSB:
libSA -> libSB -> libSC
and three dynamic libraries:libDA -> libDB -> libDC (libDA is the basic, libDC is the highest)
in which order should these be linked? the basic one first or last?
g++ ... -g libSA libSB libSC -lDA -lDB -lDC -o my_app
seems like the correct order, but is that so? What if there are dependencies between a dynamic library and a static one, or the other way around?

In the static case, it doesn't really matter, because you don't actually link static libraries - all you do is pack some object files together in one archive. All you have to do is compile your object files, and you can create static libraries right away.
The situation with dynamic libraries is more convoluted, there are two aspects:
1. A shared library can be built the same way as a static library (except for shared segments, if they are present), which means you can do the same: just link your shared library as soon as you have the object files. In that case, symbols from libDA will appear as undefined in libDB.
2. You can specify the libraries to link to on the command line when linking shared objects. This has the same effect as 1., but also marks libDB as needing libDA.
The difference is that if you use the former way, you have to specify all three libraries (-lDA, -lDB, -lDC) on the command line when linking the executable. If you use the latter, you just specify -lDC and it will pull in the others automatically at link time. Note that dynamic linking happens just before your program runs (which means you can get different versions of symbols, even from different libraries).
This all applies to UNIX; Windows DLL work quite differently.
Edit after clarification of the question:
Quote from the ld info manual.
The linker will search an archive only
once, at the location where it is
specified on the command line. If the
archive defines a symbol which was
undefined in some object which
appeared before the archive on the
command line, the linker will include
the appropriate file(s) from the
archive. However, an undefined symbol
in an object appearing later on the
command line will not cause the linker
to search the archive again.
See the `-(' option for a way to force
the linker to search archives multiple
times.
You may list the same archive multiple
times on the command line.
This type of archive searching is
standard for Unix linkers. However, if
you are using `ld' on AIX, note that
it is different from the behaviour of
the AIX linker.
That means:
Any static library or object that depends on another library should be placed before it on the command line. If static libraries depend on each other circularly, you can e.g. use the -( command line option, or place the libraries on the command line twice (-lDA -lDB -lDA). The order of dynamic libraries doesn't matter.

This is the sort of question that's best solved by a trivial example. Really! Take 2 minutes, code up a simple example, and try it out! You'll learn something, and it's faster than asking.
For example, given files:
a1.cc
#include <stdio.h>
void a1() { printf("a1\n"); }
a2.cc
#include <stdio.h>
extern void a1();
void a2() { printf("a2\n"); a1(); }
a3.cc
#include <stdio.h>
extern void a2();
void a3() { printf("a3\n"); a2(); }
aa.cc
extern void a3();
int main()
{
a3();
}
Running:
g++ -Wall -g -c a1.cc
g++ -Wall -g -c a2.cc
g++ -Wall -g -c a3.cc
ar -r liba1.a a1.o
ar -r liba2.a a2.o
ar -r liba3.a a3.o
g++ -Wall -g aa.cc -o aa -la1 -la2 -la3 -L.
Shows:
./liba3.a(a3.o)(.text+0x14): In function `a3()':
/tmp/z/a3.C:4: undefined reference to `a2()'
Whereas:
g++ -Wall -g -c a1.cc
g++ -Wall -g -c a2.cc
g++ -Wall -g -c a3.cc
ar -r liba1.a a1.o
ar -r liba2.a a2.o
ar -r liba3.a a3.o
g++ -Wall -g aa.cc -o aa -la3 -la2 -la1 -L.
Succeeds. (Just the -la3 -la2 -la1 parameter order is changed.)
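If you want to reason about the two orderings without running the toolchain, a rough sketch of the single left-to-right scan is enough. The Archive struct and unresolved_after_link function below are invented for this illustration and model each library as a single object. printf and __gxx_personality_v0 are left out, on the assumption that the C and C++ runtime libraries the g++ driver adds at the end of the link line satisfy them.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// A one-member static archive, described by what nm prints for it.
// Toy model: the struct and function names here are made up.
struct Archive {
    std::string name;
    std::set<std::string> defined;    // the 'T' entries
    std::set<std::string> undefined;  // the 'U' entries we care about
};

// Scan the archives once, left to right, as they appear on the command
// line, and return the symbols that are still unresolved at the end.
std::set<std::string> unresolved_after_link(
        std::set<std::string> wanted, const std::vector<Archive>& order) {
    std::set<std::string> resolved;
    for (const Archive& lib : order) {
        bool pulled = false;
        for (const std::string& sym : lib.defined)
            if (wanted.erase(sym) > 0) pulled = true;
        if (!pulled) continue;  // nothing wanted yet: archive is skipped for good
        resolved.insert(lib.defined.begin(), lib.defined.end());
        for (const std::string& sym : lib.undefined)
            if (resolved.count(sym) == 0) wanted.insert(sym);
    }
    return wanted;
}

void demo_link_order() {
    const Archive a1{"liba1.a", {"a1()"}, {}};
    const Archive a2{"liba2.a", {"a2()"}, {"a1()"}};
    const Archive a3{"liba3.a", {"a3()"}, {"a2()"}};
    const std::set<std::string> from_main{"a3()"};  // aa.cc calls a3()

    // -la1 -la2 -la3: a2() is discovered after liba2.a was already passed.
    const std::set<std::string> bad =
        unresolved_after_link(from_main, {a1, a2, a3});
    assert(bad == std::set<std::string>{"a2()"});

    // -la3 -la2 -la1: every archive is reached after its user.
    assert(unresolved_after_link(from_main, {a3, a2, a1}).empty());
}
```

In this model the first ordering leaves a2() unresolved, which matches the "undefined reference to a2()" error, and the second ordering resolves everything.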
PS:
nm --demangle liba*.a
liba1.a:
a1.o:
U __gxx_personality_v0
U printf
0000000000000000 T a1()
liba2.a:
a2.o:
U __gxx_personality_v0
U printf
U a1()
0000000000000000 T a2()
liba3.a:
a3.o:
U __gxx_personality_v0
U printf
U a2()
0000000000000000 T a3()
From man nm:
If lowercase, the symbol is local; if uppercase, the symbol is global (external).
"T" The symbol is in the text (code) section.
"U" The symbol is undefined.

I worked in a project with a bunch of internal libraries that unfortunately depended on each other (and it got worse over time). We ended up "solving" this by setting up SCons to specify all libs twice when linking:
g++ ... -la1 -la2 -la3 -la1 -la2 -la3 ...

The dependencies for linking a library or executable have to be present at link-time, so you cannot link libXC before libXB is present. It doesn't matter if statically or dynamically.
Start with the most basic one, which has no (or just outside of your project) dependencies.

It's good practice to keep libraries independent of each other to avoid link order issues.
