Linker errors when using function from included library - c++

I am only asking this as a last resort, it's been days that I have been trying to fix this linker problem and I have tried so many solutions found at stackoverflow that I have lost count.
Basically, I am trying to use the constuctor from a class included in a shared library. Let's call it LibraryClass(string) and let's call this library liblibrary.so.
Ok, so I am trying to use this constructor in a class called DummyClass and then compiling this one and lots of other DummyClasses and then compiling the objects into a single executable
I am using SCons btw, so it first compiles all the DummyClasses without including any library
g++ -o DummyClass.o -I.... -I.... .... DummyClass.cpp
g++ -o DummyClass2.o -I.... -I.... .... DummyClass2.cpp
Then it compiles into a final executable linking the necessary libraries
g++ -o executable DummyClass.o DummyClass2.o -L/path/to/libs -llibrary -ldependency
Also, the library depends on functions from another library, let's call it libdependency.so
After compiling the executable, it gives me an undefined reference to LibraryClass(string) in DummyClass.o : DummyClass.cpp line ...
It's a big project and there are lots of other libraries involved and lots of other classes being compiled into the 'executable'
First, I've tried to verify that the function is indeed included in the library so i tried to use nm -C liblibrary.so and I can indeed see the function on the output. However if I try to use
nm -CD liblibrary.so I cannot find the function in there (I don't know why but some answers at stack overflow used -CD others used -C)
It doesn't make sense, it should work, first it compiles all the classes, then it compiles the objects with undefined references into a single executable linking with all the required libraries.
What else can I check for?
About the possible duplicate of another question. I believe my question is unique since it is been shown that it might be a problem with the library itself, the question that has been marked as a possible duplicate doesn't mention my probable solution. I do believe my question might help others in the future since I have been looking for answers on stackoverflow for a whole week.

Related

Loading classes - dynamical link library returned undefined symbol error

I'm using C++ dlopen() to link a shared library named as lib*.so (in directory A) in my main program (in directory B).
I experimented on some simple function loading. Every thing works very well. However, it gave me a headache when I was trying to load class and factory functions that return a pointer to the class object. (I'm using the terms from the tutorial below)
The methodology I used was based on the examples in chapter 3.3 of this tutorial https://www.tldp.org/HOWTO/C++-dlopen/thesolution.html#externC.
There is a bit of polymorphism here ... lib*.so contains a child class that inherits a parent abstract class from the main program directory (directory B). When dlopen() tries to load lib*.so in the main program, it failed due to "undefined symbol".
I used nm command to examine the symbol tables in lib*.so and main program binary. The symbols in these binaries are:
lib*.so : U _ZTI7ParentBox
main program binary: V _ZTI7ParentBox
ParentBox is the name of the parent class inherited by ChildBox in lib*.so. Note that parent class header file is in another project in directory B.
Although there is name mangling the symbol names are exactly the same.
I'm just wondering why the dynamic linker cannot link them? and giving me undefeind symbol error for dlopen()?
Am I missing the understanding of some key concepts here?
P.S. more strangely, it was able to resolve the symbols for member functions between the child class (U type symbol) in lib*.so (T type symbol) and parent class. Why is it able to do this but not able to resolve the undefined symbol for parent class name?
(I've been searching around for a long time and tried -rdynamic, -ldl stuff though I'm not fully understood what they are, but nothing worked)
Update 04 April 2019:
This is the g++ command line I used to make the main program binary.
g++ -fvisibility=hidden -pthread -static-libgcc -static-libstdc++ \
-m64 -fpic -ggdb3 -fno-var-tracking-assignments -std=c++14 \
-rdynamic \
-o ./build/main-prog \
/some_absolute_path/ParentBox.o \
/some_other_pathen/Triangle.o \
/some_other_pathen/Circle.o \
/some_other_pathen/<lots_of_depending_obj> \
/some_absolute_path/librandom.a \
-lz -ldl -lrt -lbz2
I searched every argument of this command line in https://gcc.gnu.org/onlinedocs/gcc/Option-Index.html (This seems to be a good reference site for all fellow programmers working with large projects with complicated g++ line :) )
Thanks to #Employed Russian. With his instructions, the problem narrows down to export the symbols in main program binary.
However, the main program binary has lots of dependencies as you can see from the above command, Circle, Triangle and lots of other object files.
We also need to add "-rdynamic" to the compilation of Circle, Triangle and other dependency object files. Otherwise it does not work.
In my case, I added "-rdynamic" to all files in my project to export all symbols. Not sure about "-fvisibility=hidden" doing anything good. I removed all of them in my Makefile anyway... I know this is not the best way but I will worry about speed later when everything is functionally correct. :)
More Updates:
The correct solution is in #Employed Russian's update in the answer.
My previous solution happened to work because I also removed "-fvisibility=hidden". It is not necessary (and probably wrong) to add -rdynamic to all objects used in the final link.
Please refer to #Employed Russian's explanation which addresses the core issue.
Final Update:
For fellow programmers who are interested in how C/C++ program is executed and how library can be linked, here is a good reference web course (Life of Binary) by Xeno Kovah: http://opensecuritytraining.info/LifeOfBinaries.html
You can also find a playlist on youtube. Just search "Life of Binary"
Although there is name mangling the symbol names are exactly the same. I'm just wondering why the dynamic linker cannot link them?
Most likely explanation: the symbol is not exported from the main binary.
Repeat your command with nm -D:
nm -AD lib*.so main-prog | grep ' _ZTI7ParentBox$'
Chances are, you'll see lib*.so: U _ZTI7ParentBox and nothing from main-prog.
This happens because normally the linker will not export any symbol from main-prog, that is not referenced by some shared library participating in the link (and your lib*.so isn't linked with main-prog, or else you wouldn't need to dlopen it).
To change that behavior, you could add -Wl,--export-dynamic linker flag when linking main-prog. That instructs the linker to export everything that is linked into main-prog.
tried -rdynamic
That is equivalent to -Wl,--export-dynamic, and should have worked (assuming you added it to the main-prog link line, and not somewhere else).
Update:
Everything works now! Since main-prog also depends on some other objects, it appears that simply add -rdynamic to the final main-prog linking does not resolve the problem. We need to add "-rdynamic" to the compilation of those depending objects.
That is the wrong solution. Your problem is that -fvisibility=hidden tells the compiler to mark all symbols that go into main-prog as not exported, and -rdynamic doesn't export any hidden symbols.
The correct solution is to remove -fvisibility=hidden from any objects that define symbols you do want to export, and add -rdynamic to the final link.

How dynamic linking works, its usage and how and why you would make a dylib

I have read several posts on stack overflow and read about dynamic linking online. And this is what I have taken away from all those readings -
Dynamic linking is an optimization technique that was employed to take full advantage of the virtual memory of the system. One process can share its pages with other processes. For example the libc++ needs to be linked with all C++ programs but instead of copying over the executable to every process, it can be linked dynamically with many processes via shared virtual pages.
However this leads me to the following questions
When a C++ program is compiled. It needs to have references to the C++ library functions and code (say for example the code of the thread library). How does the compiler make the executable have these references? Does this not result in a circular dependency between the compiler and the operating system? Since the compiler has to make a reference to the dynamic library in the executable.
How and when would you use a dynamic library? How do you make one? What is the specific compiling command that is used to produce such a file from a standard *.cpp file?
Usually when I install a library, there is a lib/ directory with *.a files and *.dylib (on mac-OSX) files. How do I know which ones to link to statically as I would with a regular *.o file and which ones are supposed to be dynamically linked with? I am assuming the *.dylib files are dynamic libraries. Which compiler flag would one use to link to these?
What are the -L and -l flags for? What does it mean to specify for example a -lusb flag on the command line?
If you feel like this question is asking too many things at once, please let me know. I would be completely ok with splitting this question up into multiple ones. I just ask them together because I feel like the answer to one question leads to another.
When a C++ program is compiled. It needs to have references to the C++
library functions and code (say for example the code for the library).
Assume we have a hypothetical shared library called libdyno.so. You'll eventually be able to peek inside it using using objdump or nm.
objdump --syms libdyno.so
You can do this today on your system with any shared library. objdump on a MAC is called gobjdump and comes with brew in the binutils package. Try this on a mac...
gobjdump --syms /usr/lib/libz.dylib
You can now see that the symbols are contained in the shared object. When you link with the shared object you typically use something like
g++ -Wall -g -pedantic -ldyno DynoLib_main.cpp -o dyno_main
Note the -ldyno in that command. This is telling the compiler (really the linker ld) to look for a shared object file called libdyno.so wherever it normally looks for them. Once it finds that object it can then find the symbols it needs. There's no circular dependency because you the developer asked for the dynamic library to be loaded by specifying the -l flag.
How and when would you use a dynamic library? How do you make one? As in what
is the specific compiling command that is used to produce such a file from a
standard .cpp file
Create a file called DynoLib.cpp
#include "DynoLib.h"
DynamicLib::DynamicLib() {}
int DynamicLib::square(int a) {
return a * a;
}
Create a file called DynoLib.h
#ifndef DYNOLIB_H
#define DYNOLIB_H
class DynamicLib {
public:
DynamicLib();
int square(int a);
};
#endif
Compile them to be a shared library as follows. This is linux specific...
g++ -Wall -g -pedantic -shared -std=c++11 DynoLib.cpp -o libdyno.so
You can now inspect this object using the command I gave earlier ie
objdump --syms libdyno.so
Now create a file called DynoLib_main.cpp that will be linked with libdyno.so and use the function we just defined in it.
#include "DynoLib.h"
#include <iostream>
using namespace std;
int main(void) {
DynamicLib *lib = new DynamicLib();
std::cout << "Square " << lib->square(1729) << std::endl;
return 1;
}
Compile it as follows
g++ -Wall -g -pedantic -L. -ldyno DynoLib_main.cpp -o dyno_main
./dyno_main
Square 2989441
You can also have a look at the main binary using nm. In the following I'm seeing if there is anything with the string square in it ie is the symbol I need from libdyno.so in any way referenced in my binary.
nm dyno_runner |grep square
U _ZN10DynamicLib6squareEi
The answer is yes. The uppercase U means undefined but this is the symbol name for our square method in the DynamicLib Class that we created earlier. The odd looking name is due to name mangling which is it's own topic.
How do I know which ones to link to statically as I would with a regular
.o file and which ones are supposed to be dynamically linked with?
You don't need to know. You specify what you want to link with and let the compiler (and linker etc) do the work. Note the -l flag names the library and the -L tells it where to look. There's a decent write up on how the compiler finds thing here
gcc Linkage option -L: Alternative ways how to specify the path to the dynamic library
Or have a look at man ld.
What are the -L and -l flags for? What does it mean to specify
for example a -lusb flag on the command line?
See the above link. This is from man ld..
-L searchdir
Add path searchdir to the list of paths that ld will search for
archive libraries and ld control scripts. You may use this option any
number of times. The directories are searched in the order in which
they are specified on the command line. Directories specified on the
command line are searched before the default directories. All -L
options apply to all -l options, regardless of the order in which the
options appear. -L options do not affect how ld searches for a linker
script unless -T option is specified.`
If you managed to get here it pays dividends to learn about the linker ie ld. It plays an important job and is the source of a ton of confusion because most people start out dealing with a compiler and think that compiler == linker and this is not true.
The main difference is that you include static linked libraries with your app. They are linked when you build your app. Dynamic libraries are linked at run time, so you do not need to include them with your app. These days dynamic libraries are used to reduce the size of apps by having many dynamic libraries on everyone's computer.
Dynamic libraries also allow users to update libraries without re-building the client apps. If a bug is found in a library that you use in your app and it is statically linked, you will have to rebuild your app and re-issue it to all your users. If a bug is found in a dynamically linked library, all your users just need to update their libraries and your app does not need an update.

When look up symbol, the program doesn't search from the correct library

I'm adding two classes and libraries to a system, parent.so and child.so deriving from it.
The problem is when the program is loading child.so it cannot find parent's virtual function's definition from parent.so.
What happens,
nm -D child.so will gives something like (I just changed the names)
U _ZN12PARENT15virtualFunctionEv
The program will crash running
_handle = dlopen(filename, RTLD_NOW|RTLD_GLOBAL); //filename is child.so
it'll give an error with LD_DEBUG = libs
symbol lookup error: undefined symbol: _ZN12PARENT15virtualFunctionEv (fatal)
The thing I cannot explain is, I tried LD_DEBUG = symbols using GDB, when running dlopen, the log shows it tried to look up basically in all libaries in the system except parent.so, where the symbol is defined. But from libs log parent.so is already loaded and code is run, and it is at the same path of all other libraries.
......
27510: symbol=_ZN12PARENT15virtualFunctionEv; lookup in file=/lib/tls/libm.so.6
27510: symbol=_ZN12PARENT15virtualFunctionEv; lookup in file=/lib/tls/libc.so.6
27510: symbol=_ZN12PARENT15virtualFunctionEv; lookup in file=/lib/ld-linux.so.2
27510: child.so: error: symbol lookup error: undefined symbol: _ZN12PARENT15virtualFunctionEv(fatal)
How the program or system is managing which library to look for a symbol's definition?
I'm new to Linux, can anybody point me some directions to work on?
Thanks.
EDIT
The command used to generate parent.so file is
c++ -shared -o parent.so parent.o
Similar for child.so. Is any information missing for linking here? Looks like child is only including parent's header file.
EDIT2
After another test, calling
_handle = dlopen("parent.so", RTLD_NOW|RTLD_GLOBAL);
before the crashing line will solve the problem, which I think means originally parent.so was not loaded. But I'm still not very clear about the cause.
You need to tell the linker that your library libchild.so uses functionality in libparent.so. You do this when you are creating the child library:
g++ -shared -o libchild.so child_file1.o child_file2.o -Lparent_directory -lparent
Note that order is important. Specify the -lparent after all of your object files. You might also need to pass additional options to the linker via the -Wl option to g++.
That still might not be good enough. You might need to add the library that contains libparent.so to the LD_LIBRARY_PATH environment variable.
A couple of gotchas: If you aren't naming those libraries with a lib prefix you will confuse the linker big time. If you aren't compiling your source files with either -fPIC or -fpic you will not have relocatable objects.
Addendum
There's a big potential problem with libraries that depend on other libraries. Suppose you use version 1.5 of the parent package when your compile your child library source files. You manage to get past all of the library dependencies problems. You've specified that your libchild.so depends on libparent.so. Your stuff just works. That is until version 2.0 of the parent package comes out. Now your stuff breaks everywhere it's used, and you haven't changed one line of code.
The way to overcome this problem is to specify at the time you build your child library that the resultant shared library depends specifically on version 1.5 of libparent.so`.
To do this you will need to pass options from g++/gcc to the linker via the -Wl option. Use -Wl,<linker_option>,<linker_option>,... If those linker options need spaces you'll need to backslash-escape them in the command to g++. A couple of key options are -rpath and -soname. For example, -rpath=/path/to/lib,-soname=libparent.so.1.5.
Note very well: You need to use the -soname=libparent.so.1.5 option when you are building libparent.so. This is what lets the system denote that your libchild.so (version 1.0) depends on libparent.so (version 1.5). And you don't build libparent.so. You build libparent.so.1.5. What about libparent.so? That needs to exist to, but it should be a symbolic link to some numbered numbered version (preferably the most recent version) of libparent.so.
Now suppose non-backward compatible parent version 2.0 is compiled and built into a shiny new libparent.so.2.0 and libparent.so is symbolically linked to this shiny new version. An application that uses your clunky old libchild.so (version 1.0) will happily use the clunky old version of libparent.so instead of the shiny new one that breaks everything.
It looks like you're not telling the linker that child.so needs parent.so, use something like the following:
g++ -shared -o libparent.so parent.o
g++ -shared -o libchild.so -lparent child.o
When you build your main program, you have to tell the compiler that it links with those libraries; that way, when it starts, linux will load them for it.
Change their names to libparent.so and libchild.so.
Then compile with something like this:
g++ <your files and flags> -L<folder where the .so's are> -lparent -lchild
EDIT:
Maybe it would be a smaller change to try loading parent.so before child.so. Did you try that already?

g++ relative paths

I'm attempting to design a shared library of shared libraries using g++ with hopes of simplifying my compile scripts and easing my update process in the future, but I'm still novice at best with GNU tools and writing libraries, at that. Can anyone provide advice on whether the following idea is possible with g++?
For convenience, consider the following file system layout:
main.cpp
libraryX/
libraryX/libX.so
libraryX/libraryY/
libraryX/libraryY/libY.so
libraryX/libraryZ/
libraryX/libraryZ/libZ.so
My goal is to be able to link indirectly using cascading relative paths. For instance, main.cpp links to libraryX/libX.so, which links to libraryY/libY.so and libraryZ/libZ.so. Is it possible to only link main.cpp to libX.so and use functions defined in libY.so and libZ.so?
If so, could you provide an example of the flags one would need to do so? I've been trying variations of the following command using various sources from Google to no avail:
g++ -shared -fPIC -Wl-rpath=libraryX -LlibraryX -lX.so main.o -o executable
Any guidance or references are greatly appreciated.
Don't do this (even if you can figure out how).
When you link against -lX, the static linker must know all other shared libraries that are "part of this link". Since -lY is not on the link line, the static linker will either give you an error, or it must somehow figure out where libY.so is coming from. For the latter, it has to replicate the RPATH search that the runtime loader will perform. This replication is error prone (the static linker may not use the exact same algorithm) and best avoided.
Finally, your command line is totally wrong: -shared means you ask the linker for a shared library, but you are clearly trying to link an executable. You generally should not use -fPIC when linking an executable. Also, -Wl-rpath=... should be -Wl,-rpath=... (the comma is important).

g++ linking order dependency when linking c code to c++ code

Prior to today I had always believed that the order that objects and libraries were passed to g++ during the linking stage was unimportant. Then, today, I tried to link from c++ code to c code. I wrapped all the C headers in an extern "C" block but the linker still had difficulties finding symbols which I knew were in the C object archives.
Perplexed, I created a relatively simple example to isolate the linking error but much to my surprise, the simpler example linked without any problems.
After a little trial and error, I found that by emulating the linking pattern used in the simple example, I could get the main code to link OK. The pattern was object code first, object archives second eg:
g++ -o serverCpp serverCpp.o algoC.o libcrypto.a
Can anyone shed some light on why this might be so?. I've never seen this problem when linking ordinary c++ code.
The order you specify object files and libraries is VERY important in GCC - if you haven't been bitten by this before you have lead a charmed life. The linker searches symbols in the order that they appear, so if you have a source file that contains a call to a library function, you need to put it before the library, or the linker won't know that it has to resolve it. Complex use of libraries can mean that you have to specify the library more than once, which is a royal pain to get right.
The library order pass to gcc/g++ does actually matter. If A depends on B, A must be listed first. The reason is that it optimizes out symbols that aren't referenced, so if it sees library B first, and no one has referenced it at that point then it won't link in anything from it at all.
A static library is a collection of object files grouped into an archive. When linking against it, the linker only chooses the objects it needs to resolve any currently undefined symbols. Since the objects are linked in order given on the command line, objects from the library will only be included if the library comes after all the objects that depend on it.
So the link order is very important; if you're going to use static libraries, then you need to be careful to keep track of dependencies, and don't introduce cyclic dependencies between libraries.
You can use --start-group archives --end-group
and write the 2 dependent libraries instead of archives:
gcc main.o -L. -Wl,--start-group -lobj_A -lobj_b -Wl,--end-group