How are external symbols resolved? - c++

I have two files 37064544_p1.cpp & 37064544_p2.cpp with the same content as shown below :
int add(int x,int y)
{
return x+y;
}
I compiled them using
g++ -c 37064544_p1.cpp -o 37064544_p1.o
g++ -c 37064544_p2.cpp -o 37064544_p2.o
and added them to an archive using
ar -rsc lib37064544pf.a 37064544_p1.o 37064544_p2.o
And
$ nm -s lib37064544pf.a
gives me :
Archive index:
_Z3addii in 37064544_p1.o
_Z3addii in 37064544_p2.o
37064544_p1.o:
0000000000000000 T _Z3addii
37064544_p2.o:
0000000000000000 T _Z3addii
and
$ ar -t lib37064544pf.a
gives me
37064544_p1.o
37064544_p2.o
I have a driver which calls the _Z3addii function which is compiled with
g++ -static 37064544driver.cpp -o 37064544driver.elf -L. -l37064544pf
Result is
Sum : 11
Questions
How is the symbol _Z3addii resolved ?
Is it according to archive index?
Is it according to the order in which we populate the archive using ar?
How can I change this order?
How can I prevent ar from having duplicate symbols?
Compiler : g++ 4.6.3

How is the symbol _Z3addii resolved ?
The implementation is free to do whatever it likes; you are violating the one definition rule (ODR).
Realistically, the linker stops looking for a given symbol after the first match, which presumably follows the order in which the files were inserted into the archive.
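The "first match wins" behaviour described above can be sketched as a toy model. This is not how ld is implemented; `Member`, `first_definer` and `demo_first_definer` are hypothetical names introduced only for illustration, and the model assumes lookup follows member order.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of one archive member: its file name and the symbols it defines.
struct Member {
    std::string name;
    std::vector<std::string> defines;
};

// Return the name of the first member (in archive order) that defines `symbol`,
// or an empty string if no member defines it.
std::string first_definer(const std::vector<Member>& archive, const std::string& symbol) {
    assert(!symbol.empty());  // an empty symbol name makes no sense in this model
    for (const Member& m : archive) {
        for (const std::string& s : m.defines) {
            if (s == symbol) return m.name;
        }
    }
    return "";
}

// Usage sketch with the two members from the question.
void demo_first_definer() {
    std::vector<Member> lib = {Member{"37064544_p1.o", {"_Z3addii"}},
                               Member{"37064544_p2.o", {"_Z3addii"}}};
    assert(first_definer(lib, "_Z3addii") == "37064544_p1.o");
    // Swapping the insertion order changes which definition is found first.
    std::vector<Member> swapped = {lib[1], lib[0]};
    assert(first_definer(swapped, "_Z3addii") == "37064544_p2.o");
}
```

Calling `demo_first_definer()` from a `main` runs the checks. Under this model the second definition is never looked at, which is why no duplicate is reported.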
How can I change this order?
With ar you can use the a (after) and b (before) modifiers to position object files in the archive when inserting them. You are still violating the ODR, though.
How can I prevent ar from having duplicate symbols?
You can't, as far as I know. ar is relatively dumb, and for good reason: some languages do allow identical symbols. That is also why you don't get any errors when linking with the archive (no diagnostic is required for ODR violations).
You can either force ld to read the entire archive, so that both definitions are linked and the duplicate is reported:
g++ -static 37064544driver.cpp -o 37064544driver.elf -L. \
-Wl,--whole-archive -l37064544pf -Wl,--no-whole-archive
Or you can do a partial link instead of building a traditional archive, which will give you an error if there are any duplicates:
ld -r -o lib37064544pf.a 37064544_p1.o 37064544_p2.o

Related

I can have two main function in two static library of C++ when I link both of them?

I have three files:
test.cpp (it is empty) :
main1.cpp
#include <stdio.h>
int main()
{
printf("main_1\n");
return 0;
}
main2.cpp
#include <stdio.h>
int main()
{
printf("main_2\n");
return 0;
}
Then I create two static libraries, main1.a and main2.a.
g++ -c main1.cpp
ar r main1.a main1.o
g++ -c main2.cpp
ar r main2.a main2.o
I found that the output differs depending on the order of main1.a and main2.a:
main1.a is in front of main2.a
$ g++ -o out test.cpp main1.a main2.a
$ ./out
the output is "main_1"
main2.a is in front of main1.a
$ g++ -o out test.cpp main2.a main1.a
$ ./out
the output is "main_2"
Why is there no "multiple definition of `main'" error message, as there is with this command?
g++ -o out test.cpp main1.cpp main2.cpp
The linker handles these two cases very differently (by default). When compiling code into object files and linking them the linker will not allow two strong definitions to exist. Strong here is for example a function or a variable with a set value. It will allow one strong and multiple weak ones (like a variable with a set value in one object and another with the same variable but no set value).
With static libraries the linker goes through them in the order they’re given. It looks up the exported symbols; if any of them is needed by the objects linked so far, it pulls in the member that defines it and rescans that library (to find anything the newly added code needs in turn). So when it gets to the first library it checks its exports, sees main, determines that it’s needed and takes it in. Then it goes to the second library, sees main, doesn’t find it in the list of undefined symbols and just skips it.
More information can be found on Eli Bendersky’s website.
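The left-to-right archive scan described in this answer can be sketched as a small simulation. It is a simplified model, not ld's actual code: `Object`, `Archive`, `LinkResult`, `simulate_link` and `demo_main_order` are hypothetical names, the startup code is assumed to leave `main` undefined before any archive is read, and a stand-in `libc.a` is assumed to provide `printf`.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: an object file defines some symbols and references others.
struct Object {
    std::string name;
    std::set<std::string> defines;
    std::set<std::string> needs;
};

// An archive is an ordered list of member objects.
struct Archive {
    std::string name;
    std::vector<Object> members;
};

struct LinkResult {
    std::map<std::string, std::string> provider;  // symbol -> archive(member) that supplied it
    std::set<std::string> undefined;              // symbols still unresolved at the end
};

// Visit each archive once, in order. A member is pulled in only if it defines
// a symbol that is undefined at that moment; the archive is rescanned after
// every pulled member, because that member may need further symbols.
LinkResult simulate_link(std::set<std::string> undefined,
                         const std::vector<Archive>& archives) {
    LinkResult result;
    for (const Archive& archive : archives) {
        std::set<std::string> taken;
        bool pulled = true;
        while (pulled) {
            pulled = false;
            for (const Object& member : archive.members) {
                if (taken.count(member.name)) continue;
                bool wanted = false;
                for (const std::string& sym : member.defines)
                    if (undefined.count(sym)) wanted = true;
                if (!wanted) continue;  // nothing needs this member: skip it
                taken.insert(member.name);
                for (const std::string& sym : member.defines) {
                    undefined.erase(sym);
                    // emplace keeps the first provider; a real linker would
                    // report a multiple definition here instead.
                    result.provider.emplace(sym, archive.name + "(" + member.name + ")");
                }
                for (const std::string& sym : member.needs)
                    if (!result.provider.count(sym)) undefined.insert(sym);
                pulled = true;
            }
        }
    }
    result.undefined = undefined;
    return result;
}

// Usage sketch mirroring the question: both archives define main.
void demo_main_order() {
    Archive main1{"main1.a", {Object{"main1.o", {"main"}, {"printf"}}}};
    Archive main2{"main2.a", {Object{"main2.o", {"main"}, {"printf"}}}};
    Archive libc{"libc.a", {Object{"printf.o", {"printf"}, {}}}};
    LinkResult first = simulate_link({"main"}, {main1, main2, libc});
    assert(first.provider.at("main") == "main1.a(main1.o)");
    LinkResult second = simulate_link({"main"}, {main2, main1, libc});
    assert(second.provider.at("main") == "main2.a(main2.o)");
    assert(first.undefined.empty() && second.undefined.empty());
}
```

Calling `demo_main_order()` from a `main` runs the checks. In the model the second archive is never a candidate, because `main` is no longer undefined by the time it is scanned.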

Force unused function to export in shared library

printme() and getme() defined in top.cpp and top.h
I used printme() function in test.cpp (test cpp file) in main function
g++34 -c top.cpp -fPIC
ar rcs libtop.a top.o
g++34 -c test.cpp -fPIC
g++34 -shared -o ltop.so -ltop -L. -fPIC
getme is not exported in ltop.so.
How can I force the getme function to be exported in ltop.so?
When I run nm ltop.so
it does not show the getme symbol.
I want to force this.
Note: the file can have multiple unused functions like getme().
I want to force all of them to be exported to the shared library.
Normally, when linking with a static library only the modules in the static library that contain an unresolved symbol end up getting linked.
Here, since there is no unresolved reference to getme(), this module does not get linked from the static library. The solution is to explicitly make it unresolved.
A minor complicating factor is the C++ symbol name mangling. It is necessary to figure out what is the mangled symbol name for the getme() function. The easiest way is to look at the library with the nm command:
$ nm libtop.a
top.o:
0000000000000000 T _Z5getmei
Ok, so the mangled symbol name is _Z5getmei. The -u linker flag forces an unresolved reference to the indicated symbol to be used when linking:
g++ -shared -o ltop.so -L. -ltop -Wl,-u -Wl,_Z5getmei
The documentation for the -u option is found in the ld man page. This includes the module in the shared library:
$ nm ltop.so | grep getme
0000000000000680 T _Z5getmei
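To go the other way and check what a mangled name from nm stands for, a program can ask the C++ runtime to demangle it. The sketch below assumes a GCC- or Clang-style toolchain that ships `<cxxabi.h>` (Itanium C++ ABI); `demangle` and `demo_demangle` are hypothetical helper names, not part of any library.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Demangle an Itanium C++ ABI symbol name (as printed by nm) into source form.
// Returns the input unchanged if the runtime cannot demangle it.
std::string demangle(const char* mangled) {
    int status = 0;
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) return mangled;
    std::string result(text);
    std::free(text);  // the buffer is malloc-allocated by __cxa_demangle
    return result;
}

// Usage sketch with the two symbols that appear in these threads.
void demo_demangle() {
    assert(demangle("_Z5getmei") == "getme(int)");
    assert(demangle("_Z3addii") == "add(int, int)");
}
```

On the command line, `nm --demangle` or `c++filt` gives the same information without writing any code.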
Don't use your static library to create your dynamic one. Instead, use the component object file directly:
g++34 -shared -o ltop.so -fPIC top.o
The reason is that when you specify a library with -l when linking a binary, only symbols that are still unresolved from earlier on the command line are picked up. In your case there are none, so precisely nothing is picked up when creating ltop.so from libtop.a.
UPDATE: As an alternative, if the original object files are no longer available, you can use the --whole-archive linker option to force it to include everything from a static library, rather than just unresolved externals:
g++34 -shared -o ltop.so -fPIC -Wl,--whole-archive ./libtop.a -Wl,--no-whole-archive
Or:
g++34 -shared -o ltop.so -fPIC -Wl,--whole-archive -L. -ltop -Wl,--no-whole-archive

How to expose entrypoints from a library (.a) which is linked into a shared library (.so) [duplicate]

When compiling our project, we create several archives (static libraries), say liby.a and libz.a, each containing an object file that defines a function, y_function() and z_function() respectively. These archives are then joined into a shared object, say libyz.so, which is one of our main distributable targets.
g++ -fPIC -c -o y.o y.cpp
ar cr liby.a y.o
g++ -fPIC -c -o z.o z.cpp
ar cr libz.a z.o
g++ -shared -L. -ly -lz -o libyz.so
When using this shared object in the example program, say x.c, the link fails because of undefined references to the functions y_function() and z_function().
g++ x.o -L. -lyz -o xyz
It works however when I link the final executable directly with the archives (static libraries).
g++ x.o -L. -ly -lz -o xyz
My guess is that the object files contained in the archives are not linked into the shared library because they are not used in it. How to force inclusion?
Edit:
Inclusion can be forced using the --whole-archive ld option, but it results in link errors:
g++ -shared '-Wl,--whole-archive' -L. -ly -lz -o libyz.so
/usr/lib/libc_nonshared.a(elf-init.oS): In function `__libc_csu_init':
(.text+0x1d): undefined reference to `__init_array_end'
/usr/bin/ld: /usr/lib/libc_nonshared.a(elf-init.oS): relocation R_X86_64_PC32 against undefined hidden symbol `__init_array_end' can not be used when making a shared object
/usr/bin/ld: final link failed: Bad value
Any idea where this comes from?
You could try (ld(1)):
--whole-archive
For each archive mentioned on the command line after the --whole-archive option, include every object file in the
archive in the link, rather than searching the archive for the required object files. This is normally used to turn
an archive file into a shared library, forcing every object to be included in the resulting shared library. This
option may be used more than once.
(gcc -Wl,--whole-archive)
Plus, you should put -Wl,--no-whole-archive at the end of the library list. (as said by Dmitry Yudakov in the comment below)
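Why the closing -Wl,--no-whole-archive matters can be sketched with a toy scan of the linker arguments. The model below is hypothetical (`whole_included` and `demo_whole_archive` are illustrative names). It assumes, based on the error message in the question, that the compiler driver places system archives such as libc_nonshared.a after the user's libraries, so they fall inside an unterminated --whole-archive region.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Given linker arguments in command-line order, return the archives that fall
// inside a --whole-archive ... --no-whole-archive region.
std::vector<std::string> whole_included(const std::vector<std::string>& args) {
    std::vector<std::string> result;
    bool whole = false;
    for (const std::string& arg : args) {
        if (arg == "--whole-archive") whole = true;
        else if (arg == "--no-whole-archive") whole = false;
        else if (whole) result.push_back(arg);
    }
    return result;
}

// Usage sketch: the last archive stands for what the driver adds on its own (assumption).
void demo_whole_archive() {
    std::vector<std::string> open_ended =
        {"--whole-archive", "liby.a", "libz.a", "libc_nonshared.a"};
    std::vector<std::string> closed =
        {"--whole-archive", "liby.a", "libz.a", "--no-whole-archive", "libc_nonshared.a"};
    std::vector<std::string> all_three = {"liby.a", "libz.a", "libc_nonshared.a"};
    std::vector<std::string> only_ours = {"liby.a", "libz.a"};
    assert(whole_included(open_ended) == all_three);  // system archive swept in too
    assert(whole_included(closed) == only_ours);      // region ends before it
}
```

Calling `demo_whole_archive()` from a `main` runs the checks. Without the closing option, the model forces in every member of the system archive as well, which is consistent with the undefined `__init_array_end` reference reported against libc_nonshared.a.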

Linking archives (.a) into shared object (.so) [duplicate]

I'm compiling some object files into an archive, archive.a:
$ g++ -c -Iinclude/ -fPIC -O0 -o object1.o source1.cpp
$ g++ -c -Iinclude/ -fPIC -O0 -o object2.o source2.cpp
$ ar rvs archive.a object1.o object2.o
r - object1.o
r - object2.o
So far so good. The resulting archive.a has a reasonable size of some KB. A dump with nm shows that the corresponding object files are contained in the archive.
Now I want to link several of these archives into a shared object file.
g++ -g -O0 -Iinclude/ -I/usr/include/somelibrary -shared -o libLibrary.so archive1.a archive2.a
The result is that my resulting library file is nearly empty:
$ nm -D libLibrary.so
w _Jv_RegisterClasses
0000000000201010 A __bss_start
w __cxa_finalize
w __gmon_start__
0000000000201010 A _edata
0000000000201020 A _end
0000000000000578 T _fini
0000000000000430 T _init
Any idea what I'm doing wrong?
Edit:
When I try the switch -Wl,--whole-archive, following happens:
/usr/lib/x86_64-linux-gnu/libc_nonshared.a(elf-init.oS): In function `__libc_csu_init':
(.text+0xd): undefined reference to `__init_array_end'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libc_nonshared.a(elf-init.oS): relocation R_X86_64_PC32 against undefined hidden symbol `__init_array_end' can not be used when making a shared object
/usr/bin/ld: final link failed: Bad value
collect2: ld returned 1 exit status
make: *** [libKeynect.so] Error 1
Symbols and object files in .a files that are not used will be discarded by the linker.
Use -Wl,--whole-archive when linking to include the entire .a file.
Edit: you'll need to add -Wl,--no-whole-archive after you specify your libraries as well, so the whole thing will be -Wl,--whole-archive archive1.a archive2.a -Wl,--no-whole-archive
Regarding your edit: Put "-Wl,--no-whole-archive" at the end of the link command you're running. That fixed it for me.

g++: In what order should static and dynamic libraries be linked?

Let's say we have a main executable called "my_app" that uses several other libraries: three are linked statically and the other three are linked dynamically.
In which order should they be linked against "my_app"?
Let's say we got libSA (as in Static A) which depends on libSB, and libSC which depends on libSB:
libSA -> libSB -> libSC
and three dynamic libraries: libDA -> libDB -> libDC (libDA is the basic one, libDC is the highest).
In which order should these be linked? The basic one first or last?
g++ ... -g libSA libSB libSC -lDA -lDB -lDC -o my_app
seems like the correct order, but is that so? What if a dynamic library depends on a static one, or the other way around?
In the static case, it doesn't really matter, because you don't actually link static libraries; all you do is pack some object files together in one archive. All you have to do is compile your object files, and you can create static libraries right away.
The situation with dynamic libraries is more convoluted; there are two aspects:
1. A shared library works exactly the same way as a static library (except for shared segments, if they are present), which means you can just do the same: link your shared library as soon as you have the object files. This means, for example, that symbols from libDA will appear as undefined in libDB.
2. You can specify the libraries to link to on the command line when linking shared objects. This has the same effect as 1., but marks libDB as needing libDA.
The difference is that if you use the former way, you have to specify all three libraries (-lDA, -lDB, -lDC) on the command line when linking the executable. If you use the latter, you just specify -lDC and the others are pulled in automatically at dynamic link time. Note that dynamic link time is just before your program runs (which means you can get different versions of symbols, even from different libraries).
This all applies to UNIX; Windows DLL work quite differently.
Edit after clarification of the question:
Quote from the ld info manual.
The linker will search an archive only once, at the location where it is specified on the command line. If the archive defines a symbol which was undefined in some object which appeared before the archive on the command line, the linker will include the appropriate file(s) from the archive. However, an undefined symbol in an object appearing later on the command line will not cause the linker to search the archive again.
See the `-(' option for a way to force the linker to search archives multiple times.
You may list the same archive multiple times on the command line.
This type of archive searching is standard for Unix linkers. However, if you are using `ld' on AIX, note that it is different from the behaviour of the AIX linker.
That means:
Any static library or object that depends on another library should be placed before it on the command line. If static libraries depend on each other circularly, you can, e.g., use the -( command line option, or place the libraries on the command line twice (-lDA -lDB -lDA). The order of dynamic libraries doesn't matter.
This is the sort of question that's best solved by a trivial example. Really! Take 2 minutes, code up a simple example, and try it out! You'll learn something, and it's faster than asking.
For example, given files:
a1.cc
#include <stdio.h>
void a1() { printf("a1\n"); }
a2.cc
#include <stdio.h>
extern void a1();
void a2() { printf("a2\n"); a1(); }
a3.cc
#include <stdio.h>
extern void a2();
void a3() { printf("a3\n"); a2(); }
aa.cc
extern void a3();
int main()
{
a3();
}
Running:
g++ -Wall -g -c a1.cc
g++ -Wall -g -c a2.cc
g++ -Wall -g -c a3.cc
ar -r liba1.a a1.o
ar -r liba2.a a2.o
ar -r liba3.a a3.o
g++ -Wall -g aa.cc -o aa -la1 -la2 -la3 -L.
Shows:
./liba3.a(a3.o)(.text+0x14): In function `a3()':
/tmp/z/a3.C:4: undefined reference to `a2()'
Whereas:
g++ -Wall -g -c a1.cc
g++ -Wall -g -c a2.cc
g++ -Wall -g -c a3.cc
ar -r liba1.a a1.o
ar -r liba2.a a2.o
ar -r liba3.a a3.o
g++ -Wall -g aa.cc -o aa -la3 -la2 -la1 -L.
Succeeds. (Just the -la3 -la2 -la1 parameter order is changed.)
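The working order can also be derived mechanically from the dependencies: list each library before the ones it depends on, which is a reverse post-order of a depth-first walk. The sketch below is one way to compute that; `DepGraph`, `visit_lib`, `link_order` and `demo_link_order` are hypothetical names, and the dependency map is written out by hand rather than read from nm output.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// library name -> libraries it depends on (hand-written for this sketch)
using DepGraph = std::map<std::string, std::vector<std::string>>;

// Depth-first visit that appends `lib` only after everything it depends on.
static void visit_lib(const std::string& lib, const DepGraph& deps,
                      std::set<std::string>& done, std::set<std::string>& in_progress,
                      std::vector<std::string>& post_order) {
    if (done.count(lib)) return;
    if (!in_progress.insert(lib).second)
        throw std::runtime_error("circular dependency involving " + lib);
    auto it = deps.find(lib);
    if (it != deps.end())
        for (const std::string& dep : it->second)
            visit_lib(dep, deps, done, in_progress, post_order);
    in_progress.erase(lib);
    done.insert(lib);
    post_order.push_back(lib);
}

// Return the libraries ordered so that each one precedes its dependencies,
// which is the order a single-pass linker wants on the command line.
std::vector<std::string> link_order(const DepGraph& deps) {
    std::set<std::string> done, in_progress;
    std::vector<std::string> post_order;
    for (const auto& entry : deps)
        visit_lib(entry.first, deps, done, in_progress, post_order);
    return std::vector<std::string>(post_order.rbegin(), post_order.rend());
}

// Usage sketch with the example above: a3 calls a2, a2 calls a1.
void demo_link_order() {
    DepGraph deps = {{"a1", {}}, {"a2", {"a1"}}, {"a3", {"a2"}}};
    std::vector<std::string> expected = {"a3", "a2", "a1"};
    assert(link_order(deps) == expected);
}
```

Calling `demo_link_order()` from a `main` runs the check. A cycle makes `link_order` throw, which corresponds to the case where libraries have to be listed twice or wrapped in the -( option.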
PS:
nm --demangle liba*.a
liba1.a:
a1.o:
U __gxx_personality_v0
U printf
0000000000000000 T a1()
liba2.a:
a2.o:
U __gxx_personality_v0
U printf
U a1()
0000000000000000 T a2()
liba3.a:
a3.o:
U __gxx_personality_v0
U printf
U a2()
0000000000000000 T a3()
From man nm:
If lowercase, the symbol is local; if uppercase, the symbol is global (external).
"T" The symbol is in the text (code) section.
"U" The symbol is undefined.
I worked on a project with a bunch of internal libraries that unfortunately depended on each other (and it got worse over time). We ended up "solving" this by setting up SCons to specify all libs twice when linking:
g++ ... -la1 -la2 -la3 -la1 -la2 -la3 ...
The dependencies for linking a library or executable have to be present at link-time, so you cannot link libXC before libXB is present. It doesn't matter if statically or dynamically.
Start with the most basic one, which has no (or just outside of your project) dependencies.
It's good practice to keep libraries independent of each other to avoid link order issues.