Dynamic loading of shared library with RTLD_DEEPBIND - c++

Background:
I have application which part is used as library for other independent application. They link to that library (lets say lib.so) in linking time. Problem with such approach is that we have to use same external libraries like boost, ace etc or we will have duplicated symbols which in the end will cause crash. We want to resolve this problem.
I know two techniques - one is hiding all symbols (not sure about orders of scopes global/local for shared library) and other is to use dynamic linking. We chose 2nd options (dynamic linking) as it gave client opportunity to do easy testing with stubbed lib.so. and we have very simple api.
I wrote below small example of application which load example shared library and it crash (I want to understand why it crashed and how it should be written).
Crash is in dlopen, exactly in initialization of global variable on assignment to std::string (constructor of Aclass type). From our testing it looks that any access to std library while ongoing initialization of library will result in crash.
We managed to remove crash by adding -fPIC flag to EXECUTABLE (why this resolved our problem, I thought it should be set for shared library, may anyone explain me that more precisely)? Unnecessary to my understanding this flag is problematic as it slow down application and in my case (low latency applications) it is quite problematic.
To summary:
1. Why this crash occur?
2. Why -fPIC flag is enough to resolve this crash?
3. Why it is enough to set -fPIC flag to executable?
4. Is it possible to resolve my problem in other way so shared library and client application could use different versions of libraries (like boost, ace etc, compiler, linux version and std libraries are guarantee to be same)?
5. Removing flag RTLD_DEEPBIND will fix crash too but from gcc man it looks that I should use this flag as it will change symbol scope order for shared library - first it will search symbols in local scope then in global - looks as must have for me as shared library will use different external libraries than executable (and dynamic loading will protect executable with polluting its symbol scope). Why removing this flag fix crash in this simple case?
Shared library dynLib.cpp:
#include <string>
class Aclass
{
std::string s;
s = "123";
}
Aclass a;
Exacutable main.cpp:
#include <stdlib.h>
#include <dlfcn.h>
#include <string>
#include <unistd.h>
#include <iostream>
int main()
{
std::string dummyCrasher;
dlerror();
void* handle = dlopen("./libdynLib.so", RTLD_LAZY | RTLD_LOCAL | RTLD_DEEPBIND);
if(!handle)
{
std::cout << "handle is null" << dlerror();
}
usleep(1000 * 1000 * 10);
}
Makefile: makefile
CXXFLAGS=-m32 -march=x86-64 -Wl,v -g -O3 -Wformat -Werror=format -c
CLINKFLAGS=-Wl,-Bstatic -Wl,Bdynamic -ldl -m32 -march=x86-64
all: dynLib.so dynamiclinking
dynLib.so: dynLib.o
g++44 $(CLINKFLAGS) -shared -o libdynLib.so dynLib.o
dynLib.o: dynLib.cpp
g++44 $(CXXFLAGS) dynLib.cpp
dynamiclinking: main.o
g++44 $(CLINKFLAGS) -o dynamiclinking main.o -ldl
main.o: main.cpp
g++44 (CXXFLAGS) main.cpp
.PHONY: clean
clean:
rm dynLib.o main.o dynamiclinking libdynLib.so
PS. I write that code by hand (could did some spell errors)
PS 2. with -fPIC flag it will work:
main.o: main.cpp
g++44 (CXXFLAGS) main.cpp -fPIC
UPDATE
It is possible to resolve this problem by static linkage of libstdc++. But still my questions are not answered :( Maybe someone have some time to look at it?
UPDATE2 Same problem occur on GCC 4.4.6 and 4.8.1.

I think you are facing the same problem as in When we are supposed to use RTLD_DEEPBIND?, where the executable gets a copy of global variables:
well that's a wonderful feature of you building the main application without the -fPIC option.
[...]
This means that when the symbol is found in libdep.so, it gets copied into the initial data segment of the main executable at that address. Then the reference to duplicate in libdep.so is looked up and it points to the copy of the symbol that's in the main executable.
Due to RTLD_DEEPBIND, dynLib.so is seeing the wrong set of global variables from original libstdc++ when initializing std::string and thus crashes.
As for why the linker has such behavior, this article has a detailed explanation (emphasis mine):
Recall that the program/executable is not relocatable, and thus its data addresses have to bound at link time. Therefore, the linker has to create a copy of the variable in the program's address space, and the dynamic loader will use that as the relocation address. This is similar to the discussion in the previous section - in a sense, myglob in the main program overrides the one in the shared library, and according to the global symbol lookup rules, it's being used instead.
One final note: this behavior is platform-specific, at least on PowerPC there is no such additional copy of global variables in main executable.

Related

NVCC attempting to link unnecessary objects

I have a project that I'm working on making run with CUDA. For various reasons, it needs to compile an executable either with or without GTK support, without recompiling all of the associated files. Under C, I accomplished this by compiling a base version of the objects to *.o and a GTK version of the objects to *.gtk.o. Thus, I can link to that library and if it needs to use GTK it will pull in those functions (and their requirements); if it doesn't it won't touch those objects.
Converting to nvcc has caused some issues: it works in either always or never GTK mode; but if I compile the libraries with the additional GTK objects, it refuses to ignore them and link a GTKless executable. (It fails with errors about being unable to find the cairo functions I call.)
I'm guessing that nvcc is linking to (at least one of) its helper functions embedded in the object, which is causing the linker to resolve the entire object.
Running ar d <lib> <objects.gtk.o> to manually strip them from the library will "fix" the problem, so there isn't a real dependency there.
I'm compiling/linking with
/usr/local/cuda/bin/nvcc --compiler-options -Wall --compiler-options -pipe
-rdc=true -O0 -g -G -I inc -I inc/ext -arch compute_20 -o program
program.cu obs/external.o libs/base.a libs/extra.a libs/core.a -lm
How can I get nvcc to ignore the unneeded objects?
How can I get nvcc to ignore the unneeded objects?
Before you can achieve that, you need to understand which symbol is causing the *.gtk.o objects to be pulled in from the library when they shouldn't be.
The way to do that is to run link with -Wl,--print-map, and look for linker messages such as:
Archive member included because of file (symbol)
libfoo.a(foo.o) main.o (foo)
Above, main.o referenced foo, which is defined in libfoo.a(foo.o), which caused foo.o to be pulled in into the main binary.
Once you know which symbols cause xxxx.gtk.o to be pulled into the link, searching the web and/or NVidia documentation may reveal a way to get rid of them.

Memory still reachable bug fixed, but why?

I'm experimenting with shared libraries to build a modularized program.
There are two cpp files to compile:
Shared library, compile with
g++ -fPIC -shared module.cpp -o module.so
//module.cpp
#include <iostream>
File using the shared library, compile with
g++ src/main.cpp -ldl -o binary
or
g++ -DFIX src/main.cpp -ldl -o binary
//main.cpp
#include <dlfcn.h>
#ifdef FIX
# include <iostream>
#endif
int main()
{
void* h = dlopen("./module.so", RTLD_LAZY);
if ( h )
{
dlclose(h);
}
}
With FIX undefined, valgrind reports lot's of still reachable memory (5,373bytes), with FIX defined, no memory is leaked.
What's the problem with using iostream in shared libraries?
This problem occurs with g++-4.6, g++-4.7 and g++-4.8. g++-4.4 does not show this behaviour. I don't have other compilers to test with, sadly (and I don't want to switch to g++-4.4 because of this).
Update:
Compiling the shared library file with the additional flags -static-libstdc++ -static-libgcc reduces the number of leaked blocks, but not completely. -static-libgcc alone has no effect, -static-libstdc++ has some effect, but not as much as both.
1. Why or how does this code snippet "fix" the issue?
I'm not sure why without digging into the libstdc++ code, but I assume that the memory allocated by the iostreams library, which is kept allocated for the duration of the whole program, is reported as a problem by valgrind when it is allocated in a shared library, but not when allocated in the main program.
2. What equivalent, independent (from the standard library) code snippet provides the same bug fix?
Firstly, I don't know why you want something "independent from the standard library" when the still reachable memory is probably being allocated by the standard library. The fix is either don't use the standard library at all, anywhere, or use it differently.
Secondly, that "fix" is undefined behaviour, because you violate the One Definition Rule by redefining std::ios_base differently to the proper definition in the std lib.
The correct way to get the same behaviour is to #include <iostream> in your main.cpp file, including <iostream> defines a static std::ios_base::Init object. Alternatively, just #include <ios> and then define the static variable (but don't redefine the std::ios_base type) but that's basically what <iostream> does, so you might as well use that.

Link dynamic shared library in Linux - Undefined reference to function

I know there are many questions related to shared libraries on Linux but maybe because I'm tired of having a hard day trying to create a simple dynamic library on Linux (on Windows it would have taken less than 10 minutes) I can't find what happens in this case.
So, I am trying to create a library to be linked at build-time and used at run-time (not a static library, not a library to be embedded into the executable, in other words). For now it contains a simple function. These are my files:
1.
// gugulibrary.cpp
// This is where my function is doing its job
#include "gugulibrary.h"
namespace GuGu {
void SayHello() {
puts("Hello!");
}
}
2.
// gugulibrary.h
// This is where I declare my shared functions
#include <stdio.h>
namespace Gugu {
void SayHello();
}
3.
// guguapp.cpp
// This is the executable using the library
#include "gugulibrary.h"
int main() {
GuGu::SayHello();
return 0;
}
This is how I try to build my project (and I think this is what is wrong):
gcc -Wall -s -O2 -fPIC -c gugulibrary.cpp -o gugulibrary.o
ld -shared -o bin/libGugu.so gugulibrary.o
gcc -Wall -s -O2 guguapp.cpp -o bin/GuGu -ldl
export LD_LIBRARY_PATH=bin
This is saved as a .sh file which I click and execute in a terminal. The error I get when trying to link the library is this:
/tmp/ccG05CQD.o: In function `main':
guguapp.cpp:(.text.startup+0x7): undefined reference to `SayHello'
collect2: ld returned 1 exit status
And this is where I am lost. I want the library to sit in the same folder as the executable for now and maybe I need some symbols/definitions file or something, which I don't know how to create.
Thanks for your help!
In your C++ file, GuGu::SayHello is declared as a C++ symbol. In your header, you are wrapping it in an extern "C" block. This is actually undefined, as you aren't allowed to use C++ syntax (namespace) in that context. But my guess is that what the compiler is doing is ignoring the namespace and generating a C symbol name of "SayHello". Obviously such a function was never defined by your library. Take out the extern "C" bits, because your API as defined cannot be used from C anyway.
You are inconsistent with your GuGu, there are also Gugu's running around, this needs to be made consistent, then it works (At least on my computer are some Gugu's now)

How to make 64 shared 64-bit linux compatible library (*.so), for C++ code

My requirement is to work on some interface .h files. Right now I have .h and .cpp/.cc files in my project.
I need to compile it into shared 64-bit linux compatible library (*.so), using NetBeans/ Eclipse on Linux Fedora.
Since the GCC C++ ABI conventions did slightly change (in particular because of C++ standard libraries evolution, or name mangling convention) from one GCC version to the next (e.g. from g++-4.4 to g++-4.6) your shared library may be dependent upon the version of g++ used to build it
(In practice, the changes are often small inside g++, so you might be non affected)
If you want a symbol to be publicly accessible with dlsym you should preferably declare it extern "C" in your header files (otherwise you should mangle its name).
Regarding how to make a shared library, read documentation like Program Library Howto.
See also this question
And I suggest building your shared libraries with ordinary command-line tools (eg Makefile-s). Don't depend upon a complex IDE like NetBeans/ Eclipse to build them (they are invoking command-line utilities anyway).
If you are compiling a library from the 3 C++ source files called a.cc, b.cc, and c.cc respectively;
g++ -fpic -Wall -c a.cc
g++ -fpic -Wall -c b.cc
g++ -fpic -Wall -c c.cc
g++ -shared -Wl,-soname,libmylib.so.0 -o libmylib.so.0.0.0 a.o b.o c.o
Then you install the library using ldconfig, see man 8 ldconfig
you can then compile the program that uses the libary as follows (but be sure to prefix extern "C" before the class declarations in the header files included in the source code using the library.)
g++ -o myprog main.cc -lmylib
I have tried these compile options with my own sample code, and have been successful.
Basically What is covered in Shared Libraries applies to C++, just replace gcc with g++.
The theory behind all of this is;
Libraries are loaded dynamically when the program is first loaded, as can be confirmed by doing a system call trace on a running program, e.g. strace -o trace.txt ls which will dump a list of the system calls that the program made during execution into a file called trace.txt. At the top of the file you will see that the program (in this case ls) had indeed mmapped all the library's into memory.
Since libraries are loaded dynamically, it is unknown at link time where the library code will exist in the program's virtual address space during run time. Therefore library code must be compiled using position independent code - Hence the -fpic option which tells the translation stage to generate assembly code that has been coded with position independent code in mind. If you tell gcc/g++ to stop after the translation stage, with the -S (upper case S) option, and then look at resulting '.s' file, once with the -fpic option, and once without, you will see the difference (i.e. the dynamic code has #GOTPCREL and #PLT, at least on x86_64).
The linker, of course must be told to link all the ELF relocatatable object types into executable code suitable for use as a Linux shared library.

What does static linking against a library actually do?

Say I had a library called libfoo which contained a class, a few static variables, possibly something with 'C' linkage, and a few other functions.
Now I have a main program which looks like this:
int main() {
return 5+5;
}
When I compile and link this, I link against libfoo.
Will this have any effect? Will my executable increase in size? If so, why? Do the static variables or their addresses get copied into my executable?
Apologies if there is a similar question to this or if I'm being particularly stupid in any way.
It won't do anything in a modern linker, because it knows the executable doesn't actually use libfoo's symbols. With gcc 4.4.1 and ld 2.20 on my system:
g++ linker_test.cpp -static -liberty -lm -lz -lXp -lXpm -o linker_test_unnecessary
g++ linker_test.cpp -static -o linker_test_none
ls -l linker_test_unnecessary linker_test_none
They are both 626094 bytes. Note this also applies to dynamic linking, though the size they both are is much lower.
A library contains previously compiled object code - basically a static library is an archive of .o or .obj files.
The linker looks at your object code and sees if there are any unresolved names and if so looks for these in the library, if it finds them it includes the object file that contains them and repeats this.
Thus only the parts of the static library that are needed are included in your executable.
Thus in your case nothing from libfoo will be added to you executable