Same object file in different static libraries when linking - c++

clang++ ... foo.cpp ... -o dir1/foo.o
clang++ ... foo.cpp ... -o dir2/foo.o
//The only difference between the above two clang++ command lines
//is the output directory
llvm-ar ... dir1/lib1.a ... dir1/foo.o ...
llvm-ar ... dir2/lib2.a ... dir2/foo.o ...
clang++ ... dir1/lib1.a dir2/lib2.a ... -o lib.so
What happens to the duplicated symbols from foo.cpp when generating lib.so? Is any flag required to avoid duplicate symbol errors?

Linking multiple static libraries, when the same object file occurs in more than one of the provided libraries, will not result in any duplicate symbol errors (by default).
This is because the linker does not "combine the static libraries" into the final output file. It only combines object files into the output. The linker processes the list of object files and archive libraries left to right. When it encounters a static library, it checks whether any of the object files in that library defines a currently undefined symbol. Then, and only then, will it pull in that object file.
In your example:
clang++ ... dir1/lib1.a dir2/lib2.a ... -o lib.so
consider two additional object files:
clang++ obj1.o dir1/lib1.a dir2/lib2.a obj2.o -o lib.so
If obj1.o references a symbol that exists in foo.cpp:
The linker will process and add obj1.o to the lib.so, noting that said symbol is undefined.
The linker will open dir1/lib1.a and check if any object files contained in the archive define said symbol. Because foo.o defines the symbol, foo.o will be added to lib.so and the symbol will be marked defined.
The linker will open dir2/lib2.a. But there are no currently undefined symbols so the duplicate object file will be ignored.
The linker will process and add obj2.o to the lib.so. The linker does not go back and re-process lib1.a or lib2.a.
Therefore no duplicate symbol error should be raised (by default, on Linux). To change this behaviour, you can use the linker option --whole-archive:
clang++ ... -Wl,--whole-archive dir1/lib1.a dir2/lib2.a -Wl,--no-whole-archive ... -o lib.so
With --whole-archive all object files from the specified archive libraries will be added to the output. The above command then results in a "multiple definition" error for any symbols in foo.cpp.
This answer describes the behaviour on Linux. I believe AIX is different and will always add all encountered object files (from static libraries) to the output.
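The left-to-right processing described above can be sketched as a toy model in C++. This is only an illustration under stated assumptions, not how any real linker is implemented: the types Object, Input and LinkResult and the function simulate_link are hypothetical names invented for this sketch, and an object file is reduced to the sets of symbols it defines and references.

```cpp
#include <cstddef>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical toy types: an object file is reduced to the symbols it
// defines and the symbols it references.
struct Object {
    std::string name;
    std::set<std::string> defines;
    std::set<std::string> references;
};

// A linker input: either one object file or an archive of several members.
struct Input {
    std::string name;
    bool is_archive;
    std::vector<Object> members;  // exactly one member for a plain object file
};

struct LinkResult {
    std::vector<std::string> linked;  // objects that ended up in the output
    std::set<std::string> undefined;  // symbols still unresolved at the end
};

// Processes the inputs left to right. Archive members are pulled in only if
// they define a currently undefined symbol, unless whole_archive is set.
LinkResult simulate_link(const std::vector<Input>& inputs,
                         bool whole_archive = false) {
    LinkResult result;
    std::set<std::string> defined;

    // Adds one object to the output and updates the symbol bookkeeping.
    auto add = [&](const Object& obj, const std::string& label) {
        for (const std::string& sym : obj.defines) {
            if (!defined.insert(sym).second)
                throw std::runtime_error("multiple definition of " + sym);
            result.undefined.erase(sym);
        }
        for (const std::string& sym : obj.references)
            if (defined.count(sym) == 0) result.undefined.insert(sym);
        result.linked.push_back(label);
    };

    for (const Input& in : inputs) {
        if (!in.is_archive) {
            add(in.members.front(), in.name);  // plain objects are always linked
            continue;
        }
        std::vector<bool> taken(in.members.size(), false);
        bool progress = true;
        while (progress) {  // rescan this archive until no further member is needed
            progress = false;
            for (std::size_t i = 0; i < in.members.size(); ++i) {
                if (taken[i]) continue;
                const Object& member = in.members[i];
                bool needed = whole_archive;
                for (const std::string& sym : member.defines)
                    if (result.undefined.count(sym) != 0) needed = true;
                if (!needed) continue;
                add(member, in.name + "(" + member.name + ")");
                taken[i] = true;
                progress = true;
            }
        }
    }
    return result;
}
```

Feeding this model the inputs obj1.o, lib1.a, lib2.a, obj2.o, where both archives hold the same foo.o, links foo.o from lib1.a only and skips the copy in lib2.a. Calling it with whole_archive set to true makes the second copy raise the "multiple definition" error instead.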

Related

linking with two symbols. one defined in an archive file

I noticed that gtest provides a way to link against gtest_main so that the end user doesn't need to write their own main function. This works in the following way. (A small example file named hello.cpp)
#include <gtest/gtest.h>
TEST(Hello, Basic) {}
One can compile this with:
g++ hello.cpp -lgtest -lgtest_main
and everything works out fine. The reason this works is that there is a main function defined in gtest_main.cc from which the libgtest_main.a is generated.
Now here is the thing. If I change my hello.cpp to
#include <gtest/gtest.h>
TEST(Hello, Basic) {}
int main(int argc, char** argv) {
testing::InitGoogleTest(&argc, argv);
return RUN_ALL_TESTS();
}
everything still works with the same command line! There are two main symbols now, and the linker has conveniently chosen the one main function which I defined in my hello.cpp.
What is the magic going on here?
No magic is going on. What you have observed is the normal default behaviour of
the linker.
A static library libxy.a is an ar archive of
object files x.o, y.o,...
If an object file x.o appears in the linker inputs of a program, the linker links it
into the program unconditionally.
If a static library libxy.a appears in the linker inputs, the linker examines the
archive to find any object files that provide definitions for symbols that have
already been referenced, but not already defined, in files already linked into
the program. It extracts just those object files, if any, from the archive and links
them into the program exactly as if they were individually named linker inputs
and the static library was not mentioned at all.
The usual reason that we offer a set of object files to the linker in a static library,
rather than as individual inputs, is so that the linker will select just the ones
it needs to obtain definitions for unresolved symbol references, rather than simply
linking all of them into the program whether they are needed or not.
Here is an elementary illustration in C [1]:
main.c
extern void x(void);
int main(void)
{
x();
return 0;
}
lib_main.c
extern void y(void);
int main(void)
{
y();
return 0;
}
x.c
#include <stdio.h>
void x(void)
{
puts(__func__);
}
y.c
#include <stdio.h>
void y(void)
{
puts(__func__);
}
Compile all those to object files:
$ gcc -Wall -c main.c lib_main.c x.c y.c
Make a static library containing lib_main.o, x.o and y.o:
$ ar rcs libmxy.a lib_main.o x.o y.o
Link a program prog like this:
$ gcc -o prog main.o libmxy.a
It runs like:
$ ./prog
x
So the definition of main provided by main.o was linked and the other
definition of main in libmxy.a(lib_main.o) was ignored. Repeating the linkage
with some diagnostics sheds more light.
$ gcc -o prog main.o libmxy.a -Wl,-trace,-trace-symbol=main,-trace-symbol=x
/usr/bin/ld: mode elf_x86_64
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/crti.o
/usr/lib/gcc/x86_64-linux-gnu/7/crtbeginS.o
main.o
(libmxy.a)x.o
libgcc_s.so.1 (/usr/lib/gcc/x86_64-linux-gnu/7/libgcc_s.so.1)
/lib/x86_64-linux-gnu/libc.so.6
(/usr/lib/x86_64-linux-gnu/libc_nonshared.a)elf-init.oS
/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
libgcc_s.so.1 (/usr/lib/gcc/x86_64-linux-gnu/7/libgcc_s.so.1)
/usr/lib/gcc/x86_64-linux-gnu/7/crtendS.o
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/crtn.o
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o: reference to main
main.o: definition of main
main.o: reference to x
libmxy.a(x.o): definition of x
The -trace option asks the linker to show us what files were actually used in
the linkage. -trace-symbol=name asks the linker to show us the files in which
symbol name was defined or referenced. Most of the files linked are boilerplate
that gcc adds to the linker commandline by default. The ones that we built are:
main.o
(libmxy.a)x.o
The linker found the symbol main first referenced in the boilerplate object
file /usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o. Then
it found a definition of main in the object file main.o, which was linked
unconditionally. That resolved main. The linker didn't search libmxy.a for
another definition of main because it didn't need one.
In main.o it found an undefined reference to x and the next linker input
was libmxy.a. So it searched the object files in that archive for one that
defines x. It found libmxy.a(x.o) and extracted and linked it. Then it was
done.
The other object files that we offered to the linker in libmxy.a:
libmxy.a(lib_main.o)
libmxy.a(y.o)
were not needed. They might as well not have existed. The linkage is exactly
the same as:
$ gcc -o prog main.o x.o
$ ./prog
x
What is more interesting about libgtest_main.a...
... is the fact that here you have a static library that contains a member (libgtest_main.a(gtest_main.cc.o)) that will be linked
into your program even if your linkage does not input any object files before
libgtest_main.a:
$ g++ -o prog -lgtest_main -pthread
links successfully, and prog will run just to say that it has nothing to do.
If -lgtest_main is the very first linker input, then when the linker considers
it, it cannot have discovered any undefined references in files already linked,
since there are none, and therefore has no need to link any object file within
libgtest_main.a. But it does, and that behaviour might be described as a bit of
magic.
But we've already seen the explanation in the diagnostic output of:
$ gcc -o prog main.o libmxy.a -Wl,-trace,-trace-symbol=main,-trace-symbol=x
which informed us that main is first referenced in /usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o.
That boilerplate object file is the GCC C runtime startup code, which performs standard initializations for program
execution and finishes by calling main. This is an object file, so it will be linked
unconditionally, and GCC places it before all other inputs in the generated linker commandline. Link in verbose
mode (gcc -v ...) to see that. So in fact there is always an object file, first in the program's linkage,
that makes reference to main, no matter what object files you explicitly link. And if you
do not yourself input an object file that defines main before you input libraries, then
the linker will search libraries for a definition of main. libgtest_main exploits that fact.
Of course, it is only practical to exploit this fact for googletest because for all normal
programs that link googletest, the definition of main is identical.
[1] The choice of C rather than C++ makes no difference, except that in C we
don't have to bother about name-mangling.

Using a shared library in another shared library

I am creating a shared library from a class from an example I got here C++ Dynamic Shared Library on Linux. I would like to call another shared library from the shared library created and then use it in the main program. So I have the myclass.so library and I want to call another library say anotherclass.so from the myclass.so library and then use this myclass.so library in the main program. Any idea on how I can do this please.
There is more than one way in which multiple shared libraries may be added to
the linkage of a program, if you are building all the libraries, and the program,
yourself.
The elementary way is simply to explicitly add all of the libraries to the
linkage of the program, and this is the usual way if you are building only the
program and linking libraries built by some other party.
If an object file foo.o in your linkage depends on a library libA.so, then
foo.o should precede libA.so in the linkage sequence. Likewise if libA.so
depends on libB.so then libA.so should precede libB.so. Here's an illustration.
We'll make a shared library libsquare.so from the files:
square.h
#ifndef SQUARE_H
#define SQUARE_H
double square(double d);
#endif
and
square.cpp
#include <square.h>
#include <cmath>
double square(double d)
{
return pow(d,2);
}
Notice that the function square calls pow, which is declared in the
Standard header <cmath> and defined in the math library, libm.
Compile the source file square.cpp to a position-independent object file
square.o:
$ g++ -Wall -fPIC -I. -c square.cpp
Then link square.o into a shared library libsquare.so:
$ g++ -shared -o libsquare.so square.o
Next we'll make another shared library libcube.so from these files:
cube.h
#ifndef CUBE_H
#define CUBE_H
double cube(double d);
#endif
and
cube.cpp
#include <cube.h>
#include <square.h>
double cube(double d)
{
return square(d) * d;
}
See that the function cube calls square, so libcube.so is going to
depend on libsquare.so. Build the library as before:
$ g++ -Wall -fPIC -I. -c cube.cpp
$ g++ -shared -o libcube.so cube.o
We haven't bothered to link libsquare with libcube, even though libcube
depends on libsquare, and even though we could have, since we're building libcube.
For that matter, we didn't bother to link libm with libsquare. By default the
linker will let us link a shared library containing undefined references, and it
is perfectly normal. It won't let us link a program with undefined references.
Finally let's make a program, using these libraries, from this file:
main.cpp
#include <cube.h>
#include <iostream>
int main()
{
std::cout << cube(3) << std::endl;
return 0;
}
First, compile that source file to main.o:
$ g++ -Wall -I. -c main.cpp
Then link main.o with all three required libraries, making sure to list
the linker inputs in dependency order: main.o, libcube.so, libsquare.so, libm.so:
$ g++ -o prog main.o -L. -lcube -lsquare -lm
libm is a system library so there's no need to tell the linker where to look for
it. But libcube and libsquare aren't, so we need to tell the linker to look for
them in the current directory (.), because that's where they are. -L. does that.
We've successfully linked ./prog, but:
$ ./prog
./prog: error while loading shared libraries: libcube.so: cannot open shared object file: No such file or directory
It doesn't run. That's because the runtime loader doesn't know where to find libcube.so (or libsquare.so, though it didn't get that far).
Normally, when we build shared libraries we then install them in one of the loader's default
search directories (the same ones as the linker's default search directories), where they're available to any program, so this wouldn't happen. But I'm not
going to install these toy libraries on my system, so as a workaround I'll prompt the loader where to look
for them by setting the LD_LIBRARY_PATH in my shell.
$ export LD_LIBRARY_PATH=.
$ ./prog
27
Good. 3 cubed = 27.
Another and better way to link a program with shared libraries that aren't located
in standard system library directories is to link the program using the linker's
-rpath=DIR option. This will write some information into the executable to tell
the loader that it should search for required shared libraries in DIR before it tries
the default places.
Let's relink ./prog that way (first deleting the LD_LIBRARY_PATH from the shell so that it's not effective any more):
$ unset LD_LIBRARY_PATH
$ g++ -o prog main.o -L. -lcube -lsquare -lm -Wl,-rpath=.
And rerun:
$ ./prog
27
To use -rpath with g++, prefix it with -Wl, because it's an option for the linker, ld,
that the g++ frontend doesn't recognise: -Wl tells g++ just to pass the
option straight through to ld.
I would like to add some points to the response of @Mike.
As you do not link the libcube library with libsquare, you are creating a sort of "incomplete library". By incomplete, I mean that when you link your application you must link it with both libcube and libsquare, even though it does not use any symbol directly from libsquare.
It is better to link libcube directly with libsquare. This link will create the library with a NEEDED entry like:
readelf -d libcube.so
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [libsquare.so]
Then when you link your application you can do:
g++ -o prog main.o -L. -lcube
However, this will not link, because the linker tries to locate the NEEDED library libsquare and does not know where to look. You must specify its path by adding -Wl,-rpath-link=. to the link command:
g++ -o prog main.o -L. -lcube -Wl,-rpath-link=.
Note: For runtime, you must still set LD_LIBRARY_PATH or link with rpath as mentioned by @Mike.
If your library uses another shared library, then the users of your library also depend on that library. When creating your library you can pass -l for the dependency, so that the linker knows about the shared library and it is linked when required.
But when you deliver your library, because it depends on another library, you need to ship that library along with yours and provide an environment variable or linker flag so that it is loaded from the specified path (your exported package). That avoids discrepancies. Otherwise, if the dependency provides a function that also exists in some standard library, the user might get the definition from another library on their system, which can lead to a disastrous situation.
Simply use the library like you'd use it in any other application. You don't have to link to anotherclass.so, just to myclass.so.
However, you will have to make both libraries (myclass.so and anotherclass.so) available for your later application's runtime. If one of them is missing you'll get runtime errors just like it is with any other application.

c++ & OpenMP : undefined reference to GOMP_loop_dynamic_start

I'm stuck in the following problem : at first I compile the following file cancme.cpp :
void funct()
{
int i,j,k,N;
double s;
#pragma omp parallel for default(none) schedule(dynamic,10) private(i,k,s) shared(j,N)
for(i=j+1;i<N;i++) {}
}
by:
mingw32-g++.exe -O3 -std=c++11 -mavx -fopenmp -c C:\pathtofile\cancme.cpp -o C:\pathtofile\cancme.o
Next I build a second file, test.cpp to simply link cancme.o with :
int main()
{
return(0);
}
by:
mingw32-g++.exe -O3 -std=c++11 -mavx -fopenmp -c C:\pathtofile\test.cpp -o C:\pathtofile\test.o
When linking it with cancme.o, by :
mingw32-g++.exe -o C:\pathtofile\test.exe C:\pathtofile\test.o -lgomp C:\pathtofile\cancme.o
I get the following error messages :
C:\pathtofile\cancme.o:cancme.cpp:(.text+0x39): undefined reference to `GOMP_loop_dynamic_start'
C:\pathtofile\cancme.o:cancme.cpp:(.text+0x49): undefined reference to `GOMP_loop_dynamic_next'
C:\pathtofile\cancme.o:cancme.cpp:(.text+0x52): undefined reference to `GOMP_loop_end_nowait'
C:\pathtofile\cancme.o:cancme.cpp:(.text+0x92): undefined reference to `GOMP_parallel_start'
C:\pathtofile\cancme.o:cancme.cpp:(.text+0x9f): undefined reference to `GOMP_parallel_end'
Does anyone have an idea about what's going wrong there??? The OpenMP library is correctly linked by the -lgomp flag but it is like it was not recognized.
note : I use MingW 4.8.1 c++ compiler under windows 7 64 bit:
thank you
renato
The GNU linker is a single-pass linker. This means that it resolves only the references it has already seen by the time it reaches the object file or library that defines the corresponding symbol. So if object file foo.o refers to symbols from library libbar.a, then ... foo.o -lbar ... will link successfully, since the undefined references seen in foo.o are satisfied during the processing of libbar.a (as long as no other object listed after -lbar refers to symbols from the library). The opposite order, ... -lbar foo.o ..., won't work, since once the linker has processed libbar.a it will not search it again while trying to resolve the references in foo.o.
On Unix systems that support dynamic link libraries (e.g. Linux, FreeBSD, Solaris, etc.), this is often not the case since -lbar will first look for the dynamic version of the library, e.g. libbar.so, and only if not found would try to link against the static libbar.a. When linking with dynamic libraries, the order doesn't matter since the unresolved references are handled later by the runtime link editor.
On Windows, linking against dynamic link libraries (DLLs) requires the so-called import libraries to be statically linked into the executable. Therefore, even if the external dependencies are handled by the runtime linker, one still needs to properly order the static import libraries. libgomp.a is one such library.
Note that this behaviour is specific to the GNU linker that is part of MinGW. The Microsoft linker behaves differently: when resolving a reference, it first searches the libraries listed after the object file and then the libraries listed before it.
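As a rough illustration of why the position of -lgomp matters, here is a minimal C++ sketch of a single left-to-right pass. It rests on loud assumptions: LinkItem and unresolved_symbols are hypothetical names made up for this example, a library is treated as one indivisible unit, and symbols are plain strings. It models the ordering rule only, not the real GNU linker.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical toy model of one command-line input. A library is treated as
// a single unit here, which is a simplification of how archives really work.
struct LinkItem {
    std::string name;
    bool is_library;
    std::set<std::string> defines;
    std::set<std::string> references;
};

// Walks the command line once, left to right, and returns the symbols that
// are still undefined at the end.
std::set<std::string> unresolved_symbols(const std::vector<LinkItem>& command_line) {
    std::set<std::string> defined;
    std::set<std::string> undefined;
    for (const LinkItem& item : command_line) {
        if (item.is_library) {
            // A library is consulted only for symbols already known to be
            // missing. If nothing is missing yet, it is skipped for good.
            bool needed = false;
            for (const std::string& sym : item.defines)
                if (undefined.count(sym) != 0) needed = true;
            if (!needed) continue;
        }
        for (const std::string& sym : item.defines) {
            defined.insert(sym);
            undefined.erase(sym);
        }
        for (const std::string& sym : item.references)
            if (defined.count(sym) == 0) undefined.insert(sym);
    }
    return undefined;
}
```

With the inputs ordered as test.o, -lgomp, cancme.o, the model skips the library because nothing is undefined yet, and the GOMP reference from cancme.o stays unresolved. With test.o, cancme.o, -lgomp, the reference is already pending when the library is reached, so it is resolved.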
You made a mistake in the link command:
mingw32-g++.exe -o C:\pathtofile\test.exe C:\pathtofile\test.o -lgomp C:\pathtofile\cancme.o
If you used a build-system generator such as CMake, you would likely not get this linking error.
Don't mix the linker inputs like this: object_file_1 linker_flag_1 object_file_2.
The correct link command would be:
mingw32-g++.exe -o C:\pathtofile\test.exe C:\pathtofile\test.o C:\pathtofile\cancme.o -lgomp

symbol resolutions when creating (and linking) libraries

Suppose a.cc defines a function f_a() that uses a function f_b() defined in b.cc. From a.cc and b.cc I create a dynamic library libdynamic.so.
Suppose the file main.cc uses f_a, I'd compile it as follows:
g++ -o main main.cc -ldynamic
How does the dynamic linker bring the definition of f_a (and subsequently f_b) into the executable? Is the definition of f_a in libdynamic.so already resolved with f_b? Or the dynamic linker will also resolve this (internal) dependency at runtime?
Since you're using a shared library (*.so), the definition is not brought into the executable. It remains in the library itself and is resolved at run time, which is why if you remove the shared library the program will not function correctly.
On the other hand, all the internal symbols in the library (in your example, f_a and f_b) must be resolved when the library is built. This is evident from the compilation process:
g++ -fPIC -c a.cc
g++ -fPIC -c b.cc
g++ -shared -Wl,-soname,libdynamic.so -o libdynamic.so a.o b.o
In the last stage, g++ calls the linker (ld) to link a.o and b.o. In fact, you could (probably) call the linker directly instead:
ld -shared -soname=libdynamic.so -o libdynamic.so a.o b.o
If you're still curious about the whole process and all its gory details, here is a useful reference article: Linkers and Loaders, by Sandeep Grover.
Basically, dynamic libraries are linked with the executable file at run time (that is, when you run ./main). The dynamic loader, not the compiler, takes care of resolving the dependency at run time. You can check whether a symbol is resolved with the nm command. By default, nm prints the following for each symbol:
Virtual address of the symbol
A character which depicts the symbol type. If the character is in lower case then the symbol is local but if the character is in upper case then the symbol is external
Name of the symbol
For more information, see the nm manual page.
After compiling your program, just run nm on the executable (in your case, I think, nm main).
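To make the type letter easier to read, here is a small C++ sketch that turns a few common nm type letters into a description. It assumes the GNU binutils conventions and covers only a handful of letters. The function name describe_nm_type is hypothetical, and the weak-symbol letters are handled separately because their case does not indicate local versus external.

```cpp
#include <cctype>
#include <string>

// Rough, partial reading of the type letter printed by GNU nm.
// Only a few common letters are covered; anything else is reported as "other".
std::string describe_nm_type(char type) {
    if (type == 'U')
        return "undefined: must be supplied by another object or library";
    if (type == 'W' || type == 'w' || type == 'V' || type == 'v')
        return "weak";

    // For the remaining common letters, upper case means external (global)
    // and lower case means local.
    const bool external = std::isupper(static_cast<unsigned char>(type)) != 0;
    std::string kind;
    switch (std::toupper(static_cast<unsigned char>(type))) {
        case 'T': kind = "code"; break;
        case 'D': kind = "initialized data"; break;
        case 'B': kind = "uninitialized data"; break;
        case 'R': kind = "read-only data"; break;
        default:  return "other (see the nm manual)";
    }
    return kind + (external ? " (external)" : " (local)");
}
```

For instance, a line such as "U f_b" would be read as an undefined symbol, while "T f_a" would be read as externally visible code defined in the file.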

Undefined symbol when trying to load a library with dlopen

I'm trying to load a shared library (plugin) I was provided (closed source) with dlopen under a Linux ARM platform. I'm trying to load this way:
void* handle = dlopen(<library_path>/<library_name>, RTLD_NOW);
The result is a failure with this message:
Failed to load <library_path>/<library_name>: undefined symbol: <symbol_name>.
I tried to look inside the library with nm, but it seems the lib was stripped, no symbol could be found. I also tried using readelf -s, and, in fact, I got this result:
12663: 00000000 0 NOTYPE GLOBAL DEFAULT UND <symbol_name>
By reading around, I get that readelf -s returns all the symbols, including those symbols defined in libraries referenced by it.
The answers to this question are not completely clear to me: is this a symbol which is supposed to be in the library and which is not there because it was compiled the wrong way or is this a symbol I'm supposed find somewhere else? The output of readelf -d seems to suggest I'm providing all the needed shared libraries. May this error be related to a mistake in the way I'm compiling my executable or is this something not related to the loader?
Also, I read about the meaning of each column, but those values are quite strange. How do you interpret that symbol description? Why is address 0? Why is type NOTYPE?
undefined symbol: X always means that X should be exported from one of the loaded libraries, but is not. You should find out which library the requested symbol is in and link to it.
You should know that this message is always the result of a problem with the library, not a fault on your side. A library should know how to get all its symbols. If it doesn't, you can link your executable to the required library so that when you load your plugin, the requested symbol is already known.
This error might have a more complex cause. When both the plugin and the main app link to the same library, loading the plugin might still end with undefined symbols. This can happen if the main app and the plugin use different versions of the library (namely, the plugin uses a newer one). At the point of loading the plugin, the older version is already loaded, so the loader assumes everything is OK, but the newer version might contain new symbols. If the plugin uses them, you will get undefined symbol errors.
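When diagnosing this kind of failure, it helps to print exactly what the loader reports. Below is a minimal C++ sketch, assuming a POSIX system that provides <dlfcn.h>. The helper name try_load is hypothetical, and on some systems the program may need to be linked with -ldl.

```cpp
#include <dlfcn.h>
#include <string>

// Tries to load a shared library with immediate symbol binding (RTLD_NOW)
// and returns the loader's error message, or an empty string on success.
std::string try_load(const std::string& path) {
    dlerror();  // clear any stale error state
    void* handle = dlopen(path.c_str(), RTLD_NOW);
    if (handle == nullptr) {
        const char* message = dlerror();
        return message != nullptr ? message : "unknown dlopen failure";
    }
    dlclose(handle);  // this sketch only checks that loading works
    return "";
}
```

With RTLD_NOW, all undefined symbols of the library are resolved before dlopen returns, so a missing symbol is reported at load time rather than on first use.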
This problem also appears if the order of the static libraries in the link command is wrong for the app. The Unix ld linker requires that the library which implements a function is specified after the library which refers to the function.
I got this trouble when I was trying to build libtesseract shared library taking libz library from a custom location (not a standard libz from the host, but manually built from source as well). I have put an example below:
Wrong linking order (-lz before -llept):
$ g++ -fPIC -DPIC -shared -nostdlib /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crti.o /usr/lib/gcc/x86_64-linux-gnu/5/crtbeginS.o -Wl,--whole-archive ....(some libs) -Wl,--no-whole-archive -L/home/build/jenkins/workspace/tesseract/zlib/bin/lib -L/home/build/jenkins/workspace/tesseract/leptonica/bin/lib -L/usr/lib/gcc/x86_64-linux-gnu/5 -L/usr/lib/x86_64-linux-gnu -lz -llept -lstdc++ -lm -lc -lgcc_s /usr/lib/gcc/x86_64-linux-gnu/5/crtendS.o /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crtn.o -g -O2 -Wl,-soname -Wl,libtesseract.so.4 -o .libs/libtesseract.so.4.0.1
Check with "nm -D":
$ nm -D .libs/libtesseract.so.4.0.1 | grep deflateInit
U deflateInit_
Check with "dlopen":
Cannot load ./tesseract/src/api/.libs/libtesseract.so.4.0.1 (./tesseract/src/api/.libs/libtesseract.so.4.0.1: undefined symbol: deflateInit_)
This happens because the linker processes the static libraries in the order they appear on the command line and skips any library that is not used by the inputs preceding it. At the moment it checks libz.a, none of the inputs processed so far uses any function from libz.a, so the linker just "forgets" libz.a.
Proper linking order (-lz after -llept):
$ g++ -fPIC -DPIC -shared -nostdlib /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crti.o /usr/lib/gcc/x86_64-linux-gnu/5/crtbeginS.o -Wl,--whole-archive ....(some libs) -Wl,--no-whole-archive -L/home/build/jenkins/workspace/tesseract/zlib/bin/lib -L/home/build/jenkins/workspace/tesseract/leptonica/bin/lib -L/usr/lib/gcc/x86_64-linux-gnu/5 -L/usr/lib/x86_64-linux-gnu -llept -lz -lstdc++ -lm -lc -lgcc_s /usr/lib/gcc/x86_64-linux-gnu/5/crtendS.o /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crtn.o -g -O2 -Wl,-soname -Wl,libtesseract.so.4 -o .libs/libtesseract.so.4.0.1
Check with "nm -D":
$ nm -D .libs/libtesseract.so.4.0.1 | grep deflateInit
000000000041fb5b T deflateInit_
000000000041fba3 T deflateInit2_
"dlopen" did not show this error this time.