I have an executable which links to a big .a archive that contains lots of functions. The executable only uses a small fraction of the functions in this archive, but for some reason it pulls everything from it and ends up being very big.
My suspicion is that some of the functionality that the executable is using somehow references something it shouldn't and that causes everything else to be pulled.
Is it possible to make gcc tell me what reference causes a specific symbol to be added in the executable? Why else can this happen?
I've tried using --gc-sections with no effect.
I've tried using --version-script to make all the symbols in the executable local, with no effect.
I'm not interested in -ffunction-sections and -fdata-sections, since it is whole object files I want to discard, not functions.
Other answers mention -why_live, but that seems to be implemented only for Darwin and I am on Linux x86_64.
Use -Wl,-M to pass -M to the linker, causing it to print a link map. Its first section lists every object file that gets pulled in from an archive, together with the reason (or at least the first-found reason): the file and symbol whose reference required it.
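To make a large map easier to scan, here is a minimal C++ sketch that extracts the "Archive member included ..." section from the text of a GNU ld map. The names PulledMember and parse_archive_members are made up for illustration. The parsing assumes the usual layout: the header line, a blank line, then one entry per archive member, with the referencing file and symbol either on the same line or on an indented continuation line. Other linkers or binutils versions may format this differently.

```cpp
#include <sstream>
#include <string>
#include <vector>

struct PulledMember {
    std::string member;  // e.g. "libbig.a(foo.o)"
    std::string reason;  // e.g. "main.o (foo)": referencing file and symbol
};

// Strip leading and trailing blanks, tabs and carriage returns.
static std::string trim(const std::string& s) {
    const char* ws = " \t\r";
    const auto first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Hypothetical helper: returns the archive members listed in the map text.
std::vector<PulledMember> parse_archive_members(const std::string& map_text) {
    std::vector<PulledMember> out;
    std::istringstream in(map_text);
    std::string line;
    bool in_section = false;
    while (std::getline(in, line)) {
        if (!in_section) {
            // The section starts at a line beginning with this header text.
            in_section = line.rfind("Archive member included", 0) == 0;
            continue;
        }
        const std::string text = trim(line);
        if (text.empty()) {
            if (!out.empty()) break;  // blank line after the entries ends the section
            continue;                 // blank line directly under the header
        }
        if (line[0] == ' ' || line[0] == '\t') {
            // Indented continuation line: the reason for the previous member.
            if (!out.empty()) out.back().reason = text;
            continue;
        }
        const auto close = text.find(')');
        if (close == std::string::npos) break;  // not an "archive(member)" line
        out.push_back({text.substr(0, close + 1), trim(text.substr(close + 1))});
    }
    return out;
}
```

Feeding it the contents of a map file and printing each member with its reason gives a compact list to grep for the reference that drags in the rest of the archive.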
I'm using C++ dlopen() to link a shared library named as lib*.so (in directory A) in my main program (in directory B).
I experimented with loading some simple functions, and everything worked very well. However, it gave me a headache when I tried to load a class through factory functions that return a pointer to the class object. (I'm using the terms from the tutorial below.)
The methodology I used was based on the examples in chapter 3.3 of this tutorial https://www.tldp.org/HOWTO/C++-dlopen/thesolution.html#externC.
There is a bit of polymorphism here ... lib*.so contains a child class that inherits a parent abstract class from the main program directory (directory B). When dlopen() tries to load lib*.so in the main program, it failed due to "undefined symbol".
I used nm command to examine the symbol tables in lib*.so and main program binary. The symbols in these binaries are:
lib*.so : U _ZTI7ParentBox
main program binary: V _ZTI7ParentBox
ParentBox is the name of the parent class inherited by ChildBox in lib*.so. Note that parent class header file is in another project in directory B.
Although there is name mangling the symbol names are exactly the same.
I'm just wondering why the dynamic linker cannot link them, and why dlopen() gives me an undefined symbol error.
Am I missing the understanding of some key concepts here?
P.S. More strangely, it was able to resolve the symbols for member functions between the child class in lib*.so (U type symbol) and the parent class (T type symbol). Why is it able to do this but not able to resolve the undefined symbol for the parent class name?
(I've been searching around for a long time and tried -rdynamic and -ldl, though I don't fully understand what they do, but nothing worked.)
Update 04 April 2019:
This is the g++ command line I used to make the main program binary.
g++ -fvisibility=hidden -pthread -static-libgcc -static-libstdc++ \
-m64 -fpic -ggdb3 -fno-var-tracking-assignments -std=c++14 \
-rdynamic \
-o ./build/main-prog \
/some_absolute_path/ParentBox.o \
/some_other_pathen/Triangle.o \
/some_other_pathen/Circle.o \
/some_other_pathen/<lots_of_depending_obj> \
/some_absolute_path/librandom.a \
-lz -ldl -lrt -lbz2
I searched every argument of this command line in https://gcc.gnu.org/onlinedocs/gcc/Option-Index.html (This seems to be a good reference site for all fellow programmers working with large projects with complicated g++ line :) )
Thanks to @Employed Russian. With his instructions, the problem narrowed down to exporting the symbols in the main program binary.
However, the main program binary has lots of dependencies as you can see from the above command, Circle, Triangle and lots of other object files.
We also need to add "-rdynamic" to the compilation of Circle, Triangle and other dependency object files. Otherwise it does not work.
In my case, I added "-rdynamic" to all files in my project to export all symbols. Not sure about "-fvisibility=hidden" doing anything good. I removed all of them in my Makefile anyway... I know this is not the best way but I will worry about speed later when everything is functionally correct. :)
More Updates:
The correct solution is in @Employed Russian's update in the answer.
My previous solution happened to work because I also removed "-fvisibility=hidden". It is not necessary (and probably wrong) to add -rdynamic to all objects used in the final link.
Please refer to @Employed Russian's explanation which addresses the core issue.
Final Update:
For fellow programmers who are interested in how C/C++ program is executed and how library can be linked, here is a good reference web course (Life of Binary) by Xeno Kovah: http://opensecuritytraining.info/LifeOfBinaries.html
You can also find a playlist on youtube. Just search "Life of Binary"
Although there is name mangling the symbol names are exactly the same. I'm just wondering why the dynamic linker cannot link them?
Most likely explanation: the symbol is not exported from the main binary.
Repeat your command with nm -D:
nm -AD lib*.so main-prog | grep ' _ZTI7ParentBox$'
Chances are, you'll see lib*.so: U _ZTI7ParentBox and nothing from main-prog.
This happens because normally the linker will not export any symbol from main-prog that is not referenced by some shared library participating in the link (and your lib*.so isn't linked with main-prog, or else you wouldn't need to dlopen it).
To change that behavior, you could add the -Wl,--export-dynamic linker flag when linking main-prog. That instructs the linker to export everything that is linked into main-prog.
tried -rdynamic
That is equivalent to -Wl,--export-dynamic, and should have worked (assuming you added it to the main-prog link line, and not somewhere else).
Update:
Everything works now! Since main-prog also depends on some other objects, it appears that simply add -rdynamic to the final main-prog linking does not resolve the problem. We need to add "-rdynamic" to the compilation of those depending objects.
That is the wrong solution. Your problem is that -fvisibility=hidden tells the compiler to mark all symbols that go into main-prog as not exported, and -rdynamic doesn't export any hidden symbols.
The correct solution is to remove -fvisibility=hidden from any objects that define symbols you do want to export, and add -rdynamic to the final link.
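If you would rather keep -fvisibility=hidden for most of the code, a common alternative (not part of the answer above) is to mark only the classes and functions the plugin needs with GCC's visibility attribute. The following is a minimal single-file sketch of that idea. The macro BOX_EXPORT, the area() method, and the create_box/destroy_box factory functions are all invented for illustration. In a real project the three parts would live in the shared header, in main-prog, and in lib*.so respectively, and -rdynamic would still be needed on the final link of main-prog.

```cpp
// Shared header (hypothetical ParentBox.h), seen by main-prog and the plugin.
// The attribute gives the class default visibility, so its typeinfo and vtable
// stay exportable even if the file is compiled with -fvisibility=hidden.
// This is a GCC/Clang extension.
#define BOX_EXPORT __attribute__((visibility("default")))

class BOX_EXPORT ParentBox {
public:
    virtual ~ParentBox();
    virtual double area() const = 0;  // illustrative method, not from the question
};

// main-prog side (hypothetical ParentBox.cpp): defining the destructor
// out of line anchors the vtable and typeinfo in this object file.
ParentBox::~ParentBox() = default;

// Plugin side (would be compiled into lib*.so).
class ChildBox : public ParentBox {
public:
    explicit ChildBox(double side) : side_(side) {}
    double area() const override { return side_ * side_; }
private:
    double side_;
};

// extern "C" factory functions, so dlsym() can find them by unmangled name.
extern "C" BOX_EXPORT ParentBox* create_box(double side) {
    return new ChildBox(side);
}
extern "C" BOX_EXPORT void destroy_box(ParentBox* box) {
    delete box;
}
```

After rebuilding, nm -D main-prog should then list the ParentBox typeinfo symbol, which is what the plugin's undefined reference needs at dlopen() time.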
I want to identify unused object files in a large C application with many libraries. The project has grown a lot over time and now I want to search for libraries that are not used anymore, so that I can remove them from the dependency file. Is it possible with the gcc linker to identify any object that is not used?
For example, if I compile an application with gcc and let's say none of the symbols/functions of library2 are used. Is there any way to get the info about which objects are not linked in?
gcc library1.o library2.o main.o -o main.elf
I know that gcc has the compiler and linker flags to remove unused symbols:
-fdata-sections -ffunction-sections -Wl,--gc-sections
However this way I don't know which of the objects were removed by gcc. It would be perfect if gcc has an option to get a list of objects which were not linked into the application.
Just to mention: I need it on object file basis not on function/symbol basis!
Does anyone know such an option for gcc?
For example, if I compile an application with gcc and let's say none of the symbols/functions of library2 are used. Is there any way to get the info about which objects are not linked in?
gcc library1.o library2.o main.o -o main.elf
With above command, library2.o will be linked in even if none of the code from it is ever used. To understand why, read this or this.
It is true that if you compile code in library2.o with -ffunction-sections -fdata-sections and link with -Wl,--gc-sections, then all of the code and data from library2.o will be GC'd out, but that is not the command you gave.
Presumably, you are more interested in what happens if you use libraries as libraries:
gcc main.o -o main.elf -lrary1 -lrary2
In that case, if none of the code from library2 is referenced, the linker will not pull it into the link.
There is no way to ask the linker for a list of things it didn't use, but (if you are using GNU ld) there is a way to ask it for a list of objects it did use: the -M or -Map option. Once you know which objects are used, it's a simple matter of subtracting the used objects from all objects available to the link to get the list that is not used.
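The subtraction step can be sketched in a few lines of C++. This assumes you have already collected two lists of names yourself, for example all archive members reported by ar t and the members that appear in the link map. The function name unused_objects and the sample names in the usage note are hypothetical.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Returns every name in `all` that does not appear in `used`.
// Both lists are taken by value so they can be sorted locally.
std::vector<std::string> unused_objects(std::vector<std::string> all,
                                        std::vector<std::string> used) {
    std::sort(all.begin(), all.end());
    std::sort(used.begin(), used.end());
    std::vector<std::string> unused;
    // set_difference requires sorted input ranges.
    std::set_difference(all.begin(), all.end(),
                        used.begin(), used.end(),
                        std::back_inserter(unused));
    return unused;
}
```

For instance, with all = {"library1.a(a.o)", "library2.a(b.o)"} and used = {"library1.a(a.o)"}, the result contains only "library2.a(b.o)". The names have to be spelled identically in both lists, so some normalization of paths may be needed first.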
Update:
Gold linker supports --print-symbol-counts FILENAME option, which can also be helpful here. It prints defined and used symbol counts. For library2.a, it should print $num_defined 0, the 0 indicating that none of the objects from library2.a were actually used.
Take a look at callcatcher
This compiles your program into assembly and extracts obvious references from the assembly output. I guess that is exactly what you are searching for. (Note due to the fact it analyzes assembler output it will only work on x86 platforms)
Note callcatcher ignores virtual functions (for some good reasons), so it will not directly allow you to analyse those.
I have read several posts on stack overflow and read about dynamic linking online. And this is what I have taken away from all those readings -
Dynamic linking is an optimization technique that was employed to take full advantage of the virtual memory of the system. One process can share its pages with other processes. For example the libc++ needs to be linked with all C++ programs but instead of copying over the executable to every process, it can be linked dynamically with many processes via shared virtual pages.
However this leads me to the following questions
When a C++ program is compiled, it needs to have references to the C++ library functions and code (say for example the code of the thread library). How does the compiler make the executable have these references? Does this not result in a circular dependency between the compiler and the operating system, since the compiler has to make a reference to the dynamic library in the executable?
How and when would you use a dynamic library? How do you make one? What is the specific compiling command that is used to produce such a file from a standard *.cpp file?
Usually when I install a library, there is a lib/ directory with *.a files and *.dylib (on Mac OS X) files. How do I know which ones to link to statically, as I would with a regular *.o file, and which ones are supposed to be linked dynamically? I am assuming the *.dylib files are dynamic libraries. Which compiler flag would one use to link to these?
What are the -L and -l flags for? What does it mean to specify for example a -lusb flag on the command line?
If you feel like this question is asking too many things at once, please let me know. I would be completely ok with splitting this question up into multiple ones. I just ask them together because I feel like the answer to one question leads to another.
When a C++ program is compiled, it needs to have references to the C++
library functions and code (say for example the code of the thread library).
Assume we have a hypothetical shared library called libdyno.so. You'll eventually be able to peek inside it using objdump or nm.
objdump --syms libdyno.so
You can do this today on your system with any shared library. objdump on a Mac is called gobjdump and comes with brew in the binutils package. Try this on a Mac...
gobjdump --syms /usr/lib/libz.dylib
You can now see that the symbols are contained in the shared object. When you link with the shared object you typically use something like
g++ -Wall -g -pedantic DynoLib_main.cpp -ldyno -o dyno_main
Note the -ldyno in that command. This is telling the compiler (really the linker ld) to look for a shared object file called libdyno.so wherever it normally looks for them. Once it finds that object it can then find the symbols it needs. There's no circular dependency because you the developer asked for the dynamic library to be loaded by specifying the -l flag.
How and when would you use a dynamic library? How do you make one? As in what
is the specific compiling command that is used to produce such a file from a
standard .cpp file
Create a file called DynoLib.cpp
#include "DynoLib.h"
DynamicLib::DynamicLib() {}
int DynamicLib::square(int a) {
return a * a;
}
Create a file called DynoLib.h
#ifndef DYNOLIB_H
#define DYNOLIB_H
class DynamicLib {
public:
DynamicLib();
int square(int a);
};
#endif
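Compile them into a shared library as follows. This is Linux-specific, and -fPIC is included because code that goes into a shared object generally has to be position independent...
g++ -Wall -g -pedantic -fPIC -shared -std=c++11 DynoLib.cpp -o libdyno.so
You can now inspect this object using the command I gave earlier, i.e.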
objdump --syms libdyno.so
Now create a file called DynoLib_main.cpp that will be linked with libdyno.so and use the function we just defined in it.
#include "DynoLib.h"
#include <iostream>
int main() {
    DynamicLib lib;  // a stack object, so there is nothing to leak
    std::cout << "Square " << lib.square(1729) << std::endl;
    return 0;  // 0 reports success to the shell
}
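Compile it as follows, listing the library after the source file that uses it. On Linux the dynamic loader does not search the current directory by default, so point LD_LIBRARY_PATH at it when running:
g++ -Wall -g -pedantic DynoLib_main.cpp -L. -ldyno -o dyno_main
LD_LIBRARY_PATH=. ./dyno_main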
Square 2989441
You can also have a look at the main binary using nm. In the following I'm checking whether anything contains the string square, i.e. whether the symbol I need from libdyno.so is referenced in my binary.
nm dyno_main | grep square
U _ZN10DynamicLib6squareEi
The answer is yes. The uppercase U means undefined, but this is the symbol name for our square method in the DynamicLib class that we created earlier. The odd-looking name is due to name mangling, which is its own topic.
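If you want to see what a mangled name stands for from inside a program, GCC and Clang ship a demangler in <cxxabi.h>. The small sketch below wraps abi::__cxa_demangle. The wrapper name demangle is my own, and falling back to the input string on failure is just one possible design choice.

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Returns the human-readable form of an Itanium-ABI mangled name,
// or the input unchanged if it cannot be demangled.
std::string demangle(const char* mangled) {
    int status = 0;
    // With null buffer arguments the function allocates the result with malloc.
    char* readable = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) {
        return mangled;
    }
    std::string result(readable);
    std::free(readable);  // the caller owns the malloc'ed buffer
    return result;
}
```

Calling demangle("_ZN10DynamicLib6squareEi") gives DynamicLib::square(int). On the command line, piping nm output through c++filt or running nm -C does the same job.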
How do I know which ones to link to statically as I would with a regular
.o file and which ones are supposed to be dynamically linked with?
You don't need to know. You specify what you want to link with and let the compiler (and linker etc.) do the work. Note that the -l flag names the library and -L tells the linker where to look. There's a decent write-up on how the compiler finds things here:
gcc Linkage option -L: Alternative ways how to specify the path to the dynamic library
Or have a look at man ld.
What are the -L and -l flags for? What does it mean to specify
for example a -lusb flag on the command line?
See the above link. This is from man ld..
-L searchdir
Add path searchdir to the list of paths that ld will search for
archive libraries and ld control scripts. You may use this option any
number of times. The directories are searched in the order in which
they are specified on the command line. Directories specified on the
command line are searched before the default directories. All -L
options apply to all -l options, regardless of the order in which the
options appear. -L options do not affect how ld searches for a linker
script unless -T option is specified.
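As a rough mental model of how -L and -l combine, the sketch below lists the file names GNU ld would look for on an ELF system for a given -lNAME. It is deliberately simplified: it ignores the default system directories, -static, the -l:filename form, and linker scripts. The function name candidate_paths is hypothetical. On macOS the shared library suffix would be .dylib rather than .so.

```cpp
#include <string>
#include <vector>

// For "-lNAME", each search directory is tried in command-line order;
// within a directory the shared library is preferred over the archive.
std::vector<std::string> candidate_paths(const std::vector<std::string>& search_dirs,
                                         const std::string& name) {
    std::vector<std::string> candidates;
    for (const auto& dir : search_dirs) {
        candidates.push_back(dir + "/lib" + name + ".so");
        candidates.push_back(dir + "/lib" + name + ".a");
    }
    return candidates;
}
```

So -L. -L/opt/lib -lusb would be modelled as candidate_paths({".", "/opt/lib"}, "usb"), giving ./libusb.so, ./libusb.a, /opt/lib/libusb.so and /opt/lib/libusb.a in that order. The linker uses the first one that exists.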
If you managed to get here, it pays dividends to learn about the linker, i.e. ld. It does an important job and is the source of a ton of confusion, because most people start out dealing with a compiler and think that compiler == linker, and this is not true.
The main difference is that you include static linked libraries with your app. They are linked when you build your app. Dynamic libraries are linked at run time, so you do not need to include them with your app. These days dynamic libraries are used to reduce the size of apps by having many dynamic libraries on everyone's computer.
Dynamic libraries also allow users to update libraries without re-building the client apps. If a bug is found in a library that you use in your app and it is statically linked, you will have to rebuild your app and re-issue it to all your users. If a bug is found in a dynamically linked library, all your users just need to update their libraries and your app does not need an update.
This problem is not specific to Fubi, but a general linker issue. These past few days (read as 5) have been full of linking errors, but I've managed to narrow it down to just a handful.
I'm trying to compile Fubi (Full Body Interaction framework) under the Linux environment. It has only been tested on Windows 7, and the web is lacking resources for compiling on a *nix platform.
Now, like I mentioned above, I had a plethora of linking problems that dealt mostly with incorrect g++ flags. Fubi requires OpenNI and NITE (as well as OpenCV, if you want) in order to provide its basic functionality. I've been able to successfully compile both samples from the OpenNI and NITE frameworks.
As far as I understand, Fubi is a framework, thus I would need to compile a shared library and not a binary file.
When I try to compile it as a binary file using the following command
g++ *.cpp -lglut -lGL -lGLU -lOpenNI -lXnVNite_1_5_2 -I/usr/include/nite -I/usr/include/ni -I/usr/include/GL -I./GestureRecognizer/ -o FubiBin
I get the output located here. (It's kind of long and I did not want to ruin the format.)
If I instead compile into object files (-c flag), no errors appear and it builds the object files successfully. Note, I'm using the following command:
g++ -c *.cpp -lglut -lGL -lGLU -lOpenNI -lXnVNite_1_5_2 -I/usr/include/nite -I/usr/include/ni -I/usr/include/GL -I./GestureRecognizer/
I then am able to use the ar command to generate a statically linked library. No error [probably] occurs (this is only a guess on my end) because it has not run through the linker yet, so those errors won't appear.
Thanks for being patient and reading all of that. Finally, question time:
1) Is the first error regarding the undefined reference to main normal when trying to compile to a binary file? I searched all of the files within that folder and not a single main function exists.
2) The rest of the undefined reference errors complain that they cannot find the functions mentioned. All of these functions are located in .cpp and .h files in the subdirectory GestureRecognizer/ which is a subdirectory of the path I'm compiling in. So wouldn't the parameter -I./GestureRecognizer/ take care of this issue?
I want to be sure that when I do create the shared library that I won't have any linking issues during run-time. Would all of these errors disappear when trying to compile to a binary file if they were initially linked properly?
You are telling the compiler to create an executable in the first invocation and an executable needs a main() function, which it can't find. So no, the error is not normal. In order to create a shared library, use GCC's "-shared" option for that. Trying some test code here, on my system it also wants "-fPIC" when compiling, but that might differ. Best idea is to dissect the compiler and linker command lines of a few other libraries that build correctly on your system.
In order to add the missing symbols from the subdirs, you have to compile those, too: g++ *.cpp ./GestureRecognizer/*.cpp .... The "-I..." only tells the compiler where to search when it finds an #include .... I wouldn't be surprised if this wasn't even necessary, many projects use #include "GestureRecognizer/Foo.h" to achieve that directly.
BTW:
Consider activating warnings when running the compiler ("-W...").
You can split between compiling ("-c") and linking. In both cases, use "g++" though. This should decrease your turnaround time when testing different linker settings.
I'm adding two classes and libraries to a system, parent.so and child.so deriving from it.
The problem is when the program is loading child.so it cannot find parent's virtual function's definition from parent.so.
What happens:
nm -D child.so gives something like (I just changed the names)
U _ZN12PARENT15virtualFunctionEv
The program crashes when running
_handle = dlopen(filename, RTLD_NOW|RTLD_GLOBAL); //filename is child.so
With LD_DEBUG=libs it gives the error
symbol lookup error: undefined symbol: _ZN12PARENT15virtualFunctionEv (fatal)
The thing I cannot explain is this: I tried LD_DEBUG=symbols under GDB, and when dlopen runs, the log shows it looked up the symbol in basically all libraries in the system except parent.so, where the symbol is defined. But according to the libs log, parent.so is already loaded and its code has run, and it is in the same path as all the other libraries.
......
27510: symbol=_ZN12PARENT15virtualFunctionEv; lookup in file=/lib/tls/libm.so.6
27510: symbol=_ZN12PARENT15virtualFunctionEv; lookup in file=/lib/tls/libc.so.6
27510: symbol=_ZN12PARENT15virtualFunctionEv; lookup in file=/lib/ld-linux.so.2
27510: child.so: error: symbol lookup error: undefined symbol: _ZN12PARENT15virtualFunctionEv(fatal)
How does the program or system decide which libraries to search for a symbol's definition?
I'm new to Linux, can anybody point me some directions to work on?
Thanks.
EDIT
The command used to generate parent.so file is
c++ -shared -o parent.so parent.o
Similar for child.so. Is any information missing for linking here? Looks like child is only including parent's header file.
EDIT2
After another test, calling
_handle = dlopen("parent.so", RTLD_NOW|RTLD_GLOBAL);
before the crashing line will solve the problem, which I think means originally parent.so was not loaded. But I'm still not very clear about the cause.
You need to tell the linker that your library libchild.so uses functionality in libparent.so. You do this when you are creating the child library:
g++ -shared -o libchild.so child_file1.o child_file2.o -Lparent_directory -lparent
Note that order is important. Specify the -lparent after all of your object files. You might also need to pass additional options to the linker via the -Wl option to g++.
That still might not be good enough. You might need to add the directory that contains libparent.so to the LD_LIBRARY_PATH environment variable.
A couple of gotchas: If you aren't naming those libraries with a lib prefix you will confuse the linker big time. If you aren't compiling your source files with either -fPIC or -fpic you will not have relocatable objects.
Addendum
There's a big potential problem with libraries that depend on other libraries. Suppose you use version 1.5 of the parent package when you compile your child library source files. You manage to get past all of the library dependency problems. You've specified that your libchild.so depends on libparent.so. Your stuff just works. That is, until version 2.0 of the parent package comes out. Now your stuff breaks everywhere it's used, and you haven't changed one line of code.
The way to overcome this problem is to specify at the time you build your child library that the resultant shared library depends specifically on version 1.5 of libparent.so.
To do this you will need to pass options from g++/gcc to the linker via the -Wl option. Use -Wl,<linker_option>,<linker_option>,... If those linker options need spaces you'll need to backslash-escape them in the command to g++. A couple of key options are -rpath and -soname. For example, -rpath=/path/to/lib,-soname=libparent.so.1.5.
Note very well: You need to use the -soname=libparent.so.1.5 option when you are building libparent.so. This is what lets the system denote that your libchild.so (version 1.0) depends on libparent.so (version 1.5). And you don't build libparent.so. You build libparent.so.1.5. What about libparent.so? That needs to exist too, but it should be a symbolic link to some numbered version (preferably the most recent version) of libparent.so.
Now suppose non-backward compatible parent version 2.0 is compiled and built into a shiny new libparent.so.2.0 and libparent.so is symbolically linked to this shiny new version. An application that uses your clunky old libchild.so (version 1.0) will happily use the clunky old version of libparent.so instead of the shiny new one that breaks everything.
It looks like you're not telling the linker that child.so needs parent.so, use something like the following:
g++ -shared -o libparent.so parent.o
g++ -shared -o libchild.so child.o -L. -lparent
When you build your main program, you have to tell the compiler that it links with those libraries; that way, when it starts, Linux will load them for it.
Change their names to libparent.so and libchild.so.
Then compile with something like this:
g++ <your files and flags> -L<folder where the .so's are> -lparent -lchild
EDIT:
Maybe it would be a smaller change to try loading parent.so before child.so. Did you try that already?
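If you go the route of loading the libraries by hand, the order matters: the parent has to be opened first, with RTLD_GLOBAL, so that its symbols are visible when the child is resolved. Here is a minimal sketch of that idea. The function name load_in_order and its error-reporting style are my own, and on older glibc versions the program has to be linked with -ldl.

```cpp
#include <dlfcn.h>
#include <string>
#include <vector>

// Opens each library in the given order. RTLD_GLOBAL makes the symbols of
// earlier libraries available for resolving the later ones; RTLD_NOW forces
// all undefined symbols to be resolved immediately, so failures show up here.
// Returns false and fills `error` on the first failure.
bool load_in_order(const std::vector<std::string>& paths,
                   std::vector<void*>& handles,
                   std::string& error) {
    for (const auto& path : paths) {
        void* handle = dlopen(path.c_str(), RTLD_NOW | RTLD_GLOBAL);
        if (handle == nullptr) {
            const char* message = dlerror();
            error = message ? message : "unknown dlopen error";
            return false;
        }
        handles.push_back(handle);
    }
    return true;
}
```

A call such as load_in_order({"./parent.so", "./child.so"}, handles, error) mirrors the workaround from EDIT2 of the question. Recording the dependency at link time, as described in the answers above, makes the manual ordering unnecessary because the loader then pulls in the parent library on its own.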