Dynamic loaded libraries and shared global symbols - c++

Since I observed some strange behavior of global variables in my dynamically loaded libraries, I wrote the following test.
First we need a statically linked library. The header base.hpp:
#ifndef __BASE_HPP
#define __BASE_HPP
#include <iostream>
class test {
private:
int value;
public:
test(int value) : value(value) {
std::cout << "test::test(int) : value = " << value << std::endl;
}
~test() {
std::cout << "test::~test() : value = " << value << std::endl;
}
int get_value() const { return value; }
void set_value(int new_value) { value = new_value; }
};
extern test global_test;
#endif // __BASE_HPP
and the source base.cpp:
#include "base.hpp"
test global_test = test(1);
Then I wrote a dynamically loaded library: library.cpp
#include "base.hpp"
extern "C" {
test* get_global_test() { return &global_test; }
}
and a client program loading this library: client.cpp
#include <iostream>
#include <dlfcn.h>
#include "base.hpp"
typedef test* get_global_test_t();
int main() {
global_test.set_value(2); // global_test from libbase.a
std::cout << "client: " << global_test.get_value() << std::endl;
void* handle = dlopen("./liblibrary.so", RTLD_LAZY);
if (handle == NULL) {
std::cout << dlerror() << std::endl;
return 1;
}
get_global_test_t* get_global_test = NULL;
void* func = dlsym(handle, "get_global_test");
if (func == NULL) {
std::cout << dlerror() << std::endl;
return 1;
} else get_global_test = reinterpret_cast<get_global_test_t*>(func);
test* t = get_global_test(); // global_test from liblibrary.so
std::cout << "liblibrary.so: " << t->get_value() << std::endl;
std::cout << "client: " << global_test.get_value() << std::endl;
dlclose(handle);
return 0;
}
Now I compile the statically linked library with
g++ -Wall -g -c base.cpp
ar rcs libbase.a base.o
the dynamically loaded library
g++ -Wall -g -fPIC -shared library.cpp libbase.a -o liblibrary.so
and the client
g++ -Wall -g -ldl client.cpp libbase.a -o client
Now I observe that the client and the dynamically loaded library each have their own copy of the variable global_test. But in my project I'm using CMake. The build script looks like this:
CMAKE_MINIMUM_REQUIRED(VERSION 2.6)
PROJECT(globaltest)
ADD_LIBRARY(base STATIC base.cpp)
ADD_LIBRARY(library MODULE library.cpp)
TARGET_LINK_LIBRARIES(library base)
ADD_EXECUTABLE(client client.cpp)
TARGET_LINK_LIBRARIES(client base dl)
Analyzing the generated makefiles, I found that CMake builds the client with
g++ -Wall -g -ldl -rdynamic client.cpp libbase.a -o client
This results in slightly different but fatal behavior: the client and the dynamically loaded library now share the same global_test, but it is destroyed twice at the end of the program.
Am I using CMake incorrectly? Is it possible for the client and the dynamically loaded library to use the same global_test without this double destruction?

g++ -Wall -g -ldl -rdynamic client.cpp libbase.a -o client
CMake adds the -rdynamic option, which lets a loaded library resolve symbols against the loading executable. So you can see that this is what you don't want. Without this option, the library simply does not see the executable's symbol and ends up using its own copy.
But you should not be doing this in the first place. Your libraries and executable should not share symbols unless they really are meant to be shared.
Always think of dynamic linking as static linking.

If you use shared libraries, you must mark the things you want to export with a macro like the one here. See the DLL_PUBLIC macro definition there.
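A minimal sketch of what such an export macro can look like. The names DLL_PUBLIC and DLL_LOCAL follow the answer's wording; the two functions are hypothetical, and the snippet assumes a GCC or Clang toolchain (the _WIN32 branch is a simplified stand-in for MSVC builds).

```cpp
// Export macros: mark what a shared library exposes, hide everything else.
// Assumption: GCC/Clang visibility attributes; _WIN32 branch is simplified.
#if defined(_WIN32)
  #define DLL_PUBLIC __declspec(dllexport)
  #define DLL_LOCAL
#elif defined(__GNUC__) && __GNUC__ >= 4
  #define DLL_PUBLIC __attribute__((visibility("default")))
  #define DLL_LOCAL  __attribute__((visibility("hidden")))
#else
  #define DLL_PUBLIC
  #define DLL_LOCAL
#endif

// Hypothetical helper: stays internal to the library.
DLL_LOCAL int scale(int x) { return 2 * x; }

// Hypothetical entry point: the only symbol meant to be exported.
extern "C" DLL_PUBLIC int library_answer() { return scale(21); }
```

Combined with -fvisibility=hidden on the compiler command line, only the symbols tagged DLL_PUBLIC would end up in the library's dynamic symbol table.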

By default, the linker won't combine a global variable (a 'D') in the base executable with one in a shared library. The base executable is special. There might be an obscure way to do this with one of those obscure control files that ld reads, but I sort of doubt it.
--export-dynamic will cause a.out 'D' symbols to be available to shared libs.
However, consider the process. Step 1: you create a DSO from a .o with a 'U' and a .a with a 'D'. So, the linker incorporates the symbol in the DSO. Step 2, you create the executable with a 'U' in one of the .o files, and 'D' in both a .a and the DSO. It will try to resolve using the left-to-right rule.
Variables, as opposed to functions, pose certain difficulties for the linker across modules in any case. A better practice is to avoid global var references across module boundaries, and use function calls. However, that would still fail for you if you put the same function in both the base executable and a shared lib.
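The advice to go through a function call instead of a global variable could look roughly like the sketch below. The class is a trimmed copy of the question's test class; the accessor name global_test() and the choice of which module defines it are assumptions. As noted above, the accessor must be defined in exactly one module, otherwise the duplication problem comes back.

```cpp
// Trimmed version of the question's class (printing omitted).
class test {
    int value;
public:
    explicit test(int v) : value(v) {}
    int get_value() const { return value; }
    void set_value(int v) { value = v; }
};

// Declaration that would live in the shared header.
test& global_test();

// Definition that would live in exactly ONE module. It is deliberately
// not inline, so there is a single function-local static behind it.
test& global_test() {
    static test instance(1);   // constructed on first call, destroyed once
    return instance;
}
```

Callers then write global_test().set_value(2) instead of touching a variable directly, so every module ends up at the same object as long as they all resolve to the same function.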

My first question: is there any particular reason why you link the same code both statically and dynamically (via dlopen)?
As for your problem: -rdynamic exports the symbols from your program, and what is probably happening is that the dynamic linker resolves all references to your global variable to the first matching symbol it encounters in the symbol tables. Which one that is, I don't know.
EDIT: given your purpose I would link your program that way:
g++ -Wall -g -ldl client.cpp -llibrary -L. -o client
You may need to fix the order.

I would advise using dlopen(..., RTLD_LAZY | RTLD_GLOBAL); to merge the global symbol tables.
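A small sketch of how that call might be wrapped, assuming a POSIX system with <dlfcn.h>. The helper names open_global and find_symbol, and the DlCloser type, are hypothetical. Whether RTLD_GLOBAL actually cures the double destruction depends on how the binaries were linked, so treat this only as an illustration of the flag.

```cpp
#include <dlfcn.h>
#include <memory>
#include <stdexcept>

// Closes the handle when the owner goes out of scope.
struct DlCloser {
    void operator()(void* handle) const { if (handle) dlclose(handle); }
};
using LibraryHandle = std::unique_ptr<void, DlCloser>;

// Hypothetical helper: open a library and add its symbols to the global
// scope. Passing nullptr yields a handle for the main program itself.
LibraryHandle open_global(const char* path) {
    void* handle = dlopen(path, RTLD_LAZY | RTLD_GLOBAL);
    if (!handle) {
        const char* msg = dlerror();
        throw std::runtime_error(msg ? msg : "dlopen failed");
    }
    return LibraryHandle(handle);
}

// Hypothetical helper: look up a symbol, following the dlerror() protocol.
void* find_symbol(const LibraryHandle& lib, const char* name) {
    dlerror();                            // clear any stale error
    void* sym = dlsym(lib.get(), name);
    const char* msg = dlerror();          // read the error exactly once
    if (msg) throw std::runtime_error(msg);
    return sym;
}
```

With the question's files this would be used as open_global("./liblibrary.so") followed by find_symbol(lib, "get_global_test"), with the result cast to the function pointer type.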

I would propose compiling any .a static library that you plan to link into a dynamic library with the -fvisibility=hidden parameter, so:
g++ -Wall -fvisibility=hidden -g -c base.cpp

Related

Xcode does not share static variable with shared library

I've been working on a couple established C++ projects that use static variables from a shared library to store parameters. When compiled with g++ or clang++, the static variable is shared (has the same memory location) throughout the entire program. However, when compiled with Xcode, the main function static variable has a different memory location than the shared library static variable. Is there a way to get Xcode to compile/run the code the same as g++ or clang++, while still being able to debug with Xcode?
Please see example below:
main.cpp:
#include <iostream>
#include "Params.hpp"
int main(int argc, const char * argv[]) {
Params param = Params();
param.addParams();
std::vector<int> vi = Params::ParamsObj();
vi.push_back(10);
for(std::vector<int>::iterator it = vi.begin(); it != vi.end(); ++it) {
std::cout << "i = " << *it << std::endl;
}
return 0;
}
Params.hpp:
#ifndef Params_hpp
#define Params_hpp
#include <vector>
class Params{
typedef std::vector<int> ParamVector;
public:
static ParamVector& ParamsObj() {
static ParamVector m;
return m;
}
void addParams();
};
#endif /* Params_hpp */
Params.cpp:
#include "Params.hpp"
void Params::addParams(){
Params::ParamsObj().push_back(5);
}
Makefile:
clang:
	clang++ -dynamiclib Params.cpp -o libshared_clang.dylib
	clang++ main.cpp -o main_clang ./libshared_clang.dylib
gpp:
	g++-mp-4.9 -Wall -shared -fPIC -o libshared_gpp.so Params.cpp
	g++-mp-4.9 -Wall -o main_gpp main.cpp ./libshared_gpp.so
Output from both g++ and clang++ is:
i = 5
i = 10
While Xcode only outputs i = 10.
If I don't use a shared library and compile everything into one binary, Xcode will properly output both print statements.
My current solution is to add the project's main function into its own shared library and then create an Xcode specific file which merely calls the main function in the newly created shared library. However, I was hoping for a solution that didn't require changing the underlying project's code.
I'm pretty sure that if you turn on optimization for gcc/clang (which you did not in your example), they will produce the same behavior as your build with Xcode (which is an IDE, not a compiler).
Your problem is that the ParamsObj() function is inline (defining it in the class body makes it implicitly inline), which allows the compiler to just "paste" it into main instead of calling it.
Across shared library boundaries, this can result in multiple copies of the static variable if the function is used in more than one binary (in your case, it is used in the library and inlined into the main executable).
Refactor ParamsObj() into a declaration in the header and a separate definition in the corresponding .cpp file, and you'll get the same behavior everywhere, printing both numbers.
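The refactoring described above might look like this. Both files are shown in one listing with comments marking the file boundaries; making the typedef public is a small assumption added so that the out-of-class definition reads naturally.

```cpp
#include <vector>

// --- Params.hpp ---
class Params {
public:
    typedef std::vector<int> ParamVector;
    static ParamVector& ParamsObj();   // declaration only, no inline body
    void addParams();
};

// --- Params.cpp (compiled into the shared library) ---
Params::ParamVector& Params::ParamsObj() {
    static ParamVector m;              // the single instance lives in the library
    return m;
}

void Params::addParams() {
    Params::ParamsObj().push_back(5);
}
```

Because ParamsObj() is no longer inline, main has to call into the library to reach the vector, so both sides see the same storage regardless of optimization level.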

dlopen a dynamic library from a static library, when the dynamic library uses symbols of the static one

This question is closely related to dlopen a dynamic library from a static library linux C++, but contains a further complication (and uses C++ instead of C):
I have an application that links against a static library (.a) and that library uses the dlopen function to load dynamic libraries (.so). In addition, the dynamic libraries call functions defined in the static one.
Is there a way to compile this without linking the dynamic libraries against the static one or vice versa?
Here comes what I tried so far, slightly modifying the example from the related question:
app.cpp:
#include "staticlib.hpp"
#include <iostream>
int main()
{
std::cout << "and the magic number is: " << doSomethingDynamicish() << std::endl;
return 0;
}
staticlib.hpp:
#ifndef __STATICLIB_H__
#define __STATICLIB_H__
int doSomethingDynamicish();
int doSomethingBoring();
#endif
staticlib.cpp:
#include "staticlib.hpp"
#include "dlfcn.h"
#include <iostream>
int doSomethingDynamicish()
{
void* handle = dlopen("./libdynlib.so",RTLD_NOW);
if(!handle)
{
std::cout << "could not dlopen: " << dlerror() << std::endl;
return 0;
}
typedef int(*dynamicfnc)();
dynamicfnc func = (dynamicfnc)dlsym(handle,"GetMeANumber");
const char* err = dlerror();
if(err)
{
std::cout << "could not dlsym: " <<err << std::endl;
return 0;
}
return func();
}
staticlib2.cpp:
#include "staticlib.hpp"
#include "dlfcn.h"
#include <iostream>
int doSomethingBoring()
{
std::cout << "This function is so boring." << std::endl;
return 0;
}
dynlib.cpp:
#include "staticlib.hpp"
extern "C" int GetMeANumber()
{
doSomethingBoring();
return 1337;
}
and build:
g++ -c -o staticlib.o staticlib.cpp
g++ -c -o staticlib2.o staticlib2.cpp
ar rv libstaticlib.a staticlib.o staticlib2.o
ranlib libstaticlib.a
g++ -rdynamic -o app app.cpp libstaticlib.a -ldl
g++ -fPIC -shared -o libdynlib.so dynlib.cpp
When I run it with ./app I get
could not dlopen: ./libdynlib.so: undefined symbol: _Z17doSomethingBoringv
and the magic number is: 0
From the dlopen manual page:
If the executable was linked with the flag "-rdynamic" (or, synonymously, "--export-dynamic"), then the global symbols in the executable will also be used to resolve references in a dynamically loaded library.
That means that for the application to export its symbols for use in the dynamic library, you have to link your application with the -rdynamic flag.
Besides the problem described above, there is a second problem, which has to do with the static library: since the doSomethingBoring function is not called in your main program, the object file staticlib2.o is not pulled in from the static library.
The answer can be found in e.g. this old question, which tells you to add the --whole-archive linker flag:
g++ -rdynamic -o app app.cpp -L. \
-Wl,--whole-archive -lstaticlib \
-Wl,--no-whole-archive -ldl

Multiple instances of singleton across shared libraries on Linux

My question is the one in the title; the scenario in detail is as follows.
There is a class named singleton, implemented with the singleton pattern, in the file singleton.h:
/*
* singleton.h
*
* Created on: 2011-12-24
* Author: bourneli
*/
#ifndef SINGLETON_H_
#define SINGLETON_H_
#include <cstddef> // for NULL
class singleton
{
private:
singleton() {num = -1;}
static singleton* pInstance;
public:
static singleton& instance()
{
if (NULL == pInstance)
{
pInstance = new singleton();
}
return *pInstance;
}
public:
int num;
};
singleton* singleton::pInstance = NULL;
#endif /* SINGLETON_H_ */
Then there is a plugin, hello.cpp:
#include <iostream>
#include "singleton.h"
extern "C" void hello() {
std::cout << "singleton.num in hello.so : " << singleton::instance().num << std::endl;
++singleton::instance().num;
std::cout << "singleton.num in hello.so after ++ : " << singleton::instance().num << std::endl;
}
As you can see, the plugin calls the singleton and changes its num attribute.
Finally, there is a main function that uses both the singleton and the plugin:
#include <iostream>
#include <dlfcn.h>
#include "singleton.h"
int main() {
using std::cout;
using std::cerr;
using std::endl;
singleton::instance().num = 100; // call singleton
cout << "singleton.num in main : " << singleton::instance().num << endl;// call singleton
// open the library
void* handle = dlopen("./hello.so", RTLD_LAZY);
if (!handle) {
cerr << "Cannot open library: " << dlerror() << '\n';
return 1;
}
// load the symbol
typedef void (*hello_t)();
// reset errors
dlerror();
hello_t hello = (hello_t) dlsym(handle, "hello");
const char *dlsym_error = dlerror();
if (dlsym_error) {
cerr << "Cannot load symbol 'hello': " << dlsym_error << '\n'; // reuse the saved message; a second dlerror() call returns NULL
dlclose(handle);
return 1;
}
hello(); // call plugin function hello
cout << "singleton.num in main : " << singleton::instance().num << endl;// call singleton
dlclose(handle);
}
The makefile is as follows:
example1: main.cpp hello.so
	$(CXX) $(CXXFLAGS) -o example1 main.cpp -ldl
hello.so: hello.cpp
	$(CXX) $(CXXFLAGS) -shared -o hello.so hello.cpp
clean:
	rm -f example1 hello.so
.PHONY: clean
So, what is the output?
I expected the following:
singleton.num in main : 100
singleton.num in hello.so : 100
singleton.num in hello.so after ++ : 101
singleton.num in main : 101
However, the actual output is:
singleton.num in main : 100
singleton.num in hello.so : -1
singleton.num in hello.so after ++ : 0
singleton.num in main : 100
It proves that there are two instances of the singleton class.
Why?
First, you should generally use -fPIC flag when building shared libraries.
Not using it "works" on 32-bit Linux, but would fail on 64-bit one with an error similar to:
/usr/bin/ld: /tmp/ccUUrz9c.o: relocation R_X86_64_32 against `.rodata' can not be used when making a shared object; recompile with -fPIC
Second, your program will work as you expect after you add -rdynamic to the link line for the main executable:
singleton.num in main : 100
singleton.num in hello.so : 100
singleton.num in hello.so after ++ : 101
singleton.num in main : 101
In order to understand why -rdynamic is required, you need to know about the way dynamic linker resolves symbols, and about the dynamic symbol table.
First, let's look at the dynamic symbol table for hello.so:
$ nm -C -D hello.so | grep singleton
0000000000000b8c W singleton::instance()
0000000000201068 B singleton::pInstance
0000000000000b78 W singleton::singleton()
This tells us that there are two weak function definitions, and one global variable singleton::pInstance that are visible to the dynamic linker.
Now let's look at the static and dynamic symbol table for the original example1 (linked without -rdynamic):
$ nm -C example1 | grep singleton
0000000000400d0f t global constructors keyed to singleton::pInstance
0000000000400d38 W singleton::instance()
00000000006022e0 B singleton::pInstance
0000000000400d24 W singleton::singleton()
$ nm -C -D example1 | grep singleton
$
That's right: even though the singleton::pInstance is present in the executable as a global variable, that symbol is not present in the dynamic symbol table, and therefore "invisible" to the dynamic linker.
Because the dynamic linker "doesn't know" that example1 already contains a definition of singleton::pInstance, it doesn't bind that variable inside hello.so to the existing definition (which is what you really want).
When we add -rdynamic to the link line:
$ nm -C example1-rdynamic | grep singleton
0000000000400fdf t global constructors keyed to singleton::pInstance
0000000000401008 W singleton::instance()
00000000006022e0 B singleton::pInstance
0000000000400ff4 W singleton::singleton()
$ nm -C -D example1-rdynamic | grep singleton
0000000000401008 W singleton::instance()
00000000006022e0 B singleton::pInstance
0000000000400ff4 W singleton::singleton()
Now the definition of singleton::pInstance inside the main executable is visible to the dynamic linker, and so it will "reuse" that definition when loading hello.so:
LD_DEBUG=bindings ./example1-rdynamic |& grep pInstance
31972: binding file ./hello.so [0] to ./example1-rdynamic [0]: normal symbol `_ZN9singleton9pInstanceE'
You have to be careful when using runtime-loaded shared libraries. Such a construction is not strictly part of the C++ standard, and you have to consider carefully what the semantics of such a procedure turn out to be.
First off, what's happening is that the shared library sees its own, separate global variable singleton::pInstance. Why is that? A library that's loaded at runtime is essentially a separate, independent program that just happens to not have an entry point. But everything else is really like a separate program, and the dynamic loader will treat it like that, e.g. initialize global variables etc.
The dynamic loader is a runtime facility that has nothing to do with the static loader. The static loader is part of the C++ standard implementation and resolves all of the main program's symbols before the main program starts. The dynamic loader, on the other hand, only runs after the main program has already started. In particular, all symbols of the main program already have to be resolved! There is simply no way to automatically replace symbols from the main program dynamically. Native programs are not "managed" in any way that allows for systematic relinking. (Maybe something can be hacked, but not in a systematic, portable way.)
So the real question is how to solve the design problem that you're attempting. The solution here is to pass handles to all global variables to the plugin functions. Make your main program define the original (and only) copy of the global variable, and then initialize your library with a pointer to it.
For example, your shared library could look like this. First, add a pointer-to-pointer to the singleton class:
class singleton
{
static singleton * pInstance;
public:
static singleton ** ppInstance;
// ...
};
singleton ** singleton::ppInstance(&singleton::pInstance);
Now use *ppInstance instead of pInstance everywhere.
In the plugin, configure the singleton to the pointer from the main program:
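extern "C" void init(singleton ** p) // extern "C" so that dlsym(lib, "init") finds the unmangled name
{
singleton::ppInstance = p;
}
And in the main function, call the plugin initialization:
// lib is the handle returned by dlopen()
typedef void (*init_fn)(singleton **);
typedef void (*hello_fn)();
init_fn init;
hello_fn hello;
*reinterpret_cast<void**>(&init) = dlsym(lib, "init");
*reinterpret_cast<void**>(&hello) = dlsym(lib, "hello");
init(singleton::ppInstance);
hello();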
Now the plugin shares the same pointer to the singleton instance as the rest of the program.
I think the simple answer is here:
http://www.yolinux.com/TUTORIALS/LibraryArchives-StaticAndDynamic.html
When you have a static variable, it is stored in the object (.o,.a and/or .so)
If the final object to be executed contains two versions of the object, the behaviour is unexpected, like for instance, calling the Destructor of a Singleton object.
Using the proper design, such as declaring the static member in the main file and using the -rdynamic/fpic and using the "" compiler directives will do the trick part for you.
Example makefile statement:
$ g++ -rdynamic -o appexe $(OBJ) $(LINKFLAGS) -Wl,--whole-archive -L./Singleton/ -lsingleton -Wl,--no-whole-archive $(LIBS)
Hope this works!
Thank you all for your answers!
As a follow-up for Linux, you can also use RTLD_GLOBAL with dlopen(...), per man dlopen (and the examples it has). I've made a variant of the OP's example in this directory: github tree
Example output: output.txt
Quick and dirty:
If you don't want to have to manually link in each symbol to your main, keep the shared objects around. (e.g., if you made *.so objects to import into Python)
You can initially load into the global symbol table, or do a NOLOAD + GLOBAL re-open.
Code:
#if MODE == 1
// Add to static symbol table.
#include "producer.h"
#endif
...
#if MODE == 0 || MODE == 1
handle = dlopen(lib, RTLD_LAZY);
#elif MODE == 2
handle = dlopen(lib, RTLD_LAZY | RTLD_GLOBAL);
#elif MODE == 3
handle = dlopen(lib, RTLD_LAZY);
handle = dlopen(lib, RTLD_LAZY | RTLD_NOLOAD | RTLD_GLOBAL);
#endif
Modes:
Mode 0: Nominal lazy loading (won't work)
Mode 1: Include file to add to static symbol table.
Mode 2: Load initially using RTLD_GLOBAL
Mode 3: Reload using RTLD_NOLOAD | RTLD_GLOBAL
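Mode 3 can be isolated into a small helper, sketched here under the assumption of a glibc-style <dlfcn.h> where RTLD_NOLOAD is available (it is an extension, not part of POSIX). The function name promote_to_global is hypothetical.

```cpp
#include <dlfcn.h>

// Hypothetical helper: re-open an already loaded library with RTLD_GLOBAL
// so that its symbols join the global scope. RTLD_NOLOAD never loads
// anything by itself, so the result is a null pointer when the library is
// not currently resident. A non-null result counts as one more open and
// needs its own dlclose() later.
void* promote_to_global(const char* path) {
    return dlopen(path, RTLD_LAZY | RTLD_NOLOAD | RTLD_GLOBAL);
}
```

A caller would first load the library normally with dlopen(lib, RTLD_LAZY) and then call promote_to_global(lib), checking the result before relying on the promoted symbols.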

shared object can't find symbols in main binary, C++

I'm experimenting with making a kind of plugin architecture for a program I wrote, and at my first attempt I'm having a problem. Is it possible to access symbols from the main executable from within the shared object? I thought the following would be fine:
testlib.cpp:
void foo();
void bar() __attribute__((constructor));
void bar(){ foo(); }
testexe.cpp:
#include <iostream>
#include <dlfcn.h>
using namespace std;
void foo()
{
cout << "dynamic library loaded" << endl;
}
int main()
{
cout << "attempting to load" << endl;
void* ret = dlopen("./testlib.so", RTLD_LAZY);
if(ret == NULL)
cout << "fail: " << dlerror() << endl;
else
cout << "success" << endl;
return 0;
}
Compiled with:
g++ -fPIC -o testexe testexe.cpp -ldl
g++ --shared -fPIC -o testlib.so testlib.cpp
Output:
attempting to load
fail: ./testlib.so: undefined symbol: _Z3foov
So obviously, it's not fine. So I guess I have two questions:
1) Is there a way to make the shared object find symbols in the executable it's loaded from
2) If not, how do programs that use plugins typically work that they manage to get code in arbitrary shared objects to run inside their programs?
Try:
g++ -fPIC -rdynamic -o testexe testexe.cpp -ldl
Without the -rdynamic (or something equivalent, like -Wl,--export-dynamic), symbols from the application itself will not be available for dynamic linking.
From The Linux Programming Interface:
42.1.6 Accessing Symbols in the Main Program
Suppose that we use dlopen() to dynamically load a shared library, use dlsym() to obtain
the address of a function x() from that library, and then call x(). If
x() in turn calls a function y(), then y() would normally be sought in
one of the shared libraries loaded by the program. Sometimes, it is
desirable instead to have x() invoke an implementation of y() in the
main program. (This is similar to a callback mechanism.) In order to
do this, we must make the (global-scope) symbols in the main program
available to the dynamic linker, by linking the program using the
--export-dynamic linker option:
$ gcc -Wl,--export-dynamic main.c (plus further options and arguments)
Equivalently, we can write the
following:
$ gcc -export-dynamic main.c
Using either of these options
allows a dynamically loaded library to access global symbols in the
main program.
The gcc -rdynamic option and the gcc -Wl,-E option are
further synonyms for -Wl,--export-dynamic.
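A different technique from -rdynamic, which some plugin designs use instead, is to hand the plugin a table of host callbacks explicitly, so the shared object never needs to resolve symbols in the executable. The sketch below is only an illustration: HostApi, plugin_entry, host_log and run_plugin are hypothetical names, and both sides are shown in one file although the plugin part would normally be compiled into the .so and obtained through dlsym().

```cpp
#include <string>
#include <vector>

// Hypothetical interface shared by host and plugins (would live in a header).
struct HostApi {
    void (*log)(const char* message);   // callback implemented by the host
};

// --- host side ---
static std::vector<std::string> g_log_lines;
static void host_log(const char* message) { g_log_lines.push_back(message); }

// --- plugin side (would be built into the shared object) ---
extern "C" int plugin_entry(const HostApi* api) {
    api->log("dynamic library loaded");  // calls back into the host
    return 0;
}

// Host code, run after dlsym() has produced a pointer to plugin_entry.
int run_plugin(int (*entry)(const HostApi*)) {
    HostApi api = { host_log };
    return entry(&api);
}
```

The trade-off is that the interface has to be designed up front, but the plugin then links cleanly without any undefined references to the host.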

Static initialization and destruction of a static library's globals not happening with g++

Until some time ago, I thought a .a static library was just a collection of .o object files, just archiving them and not making them handled differently. But linking with a .o object and linking with a .a static library containing this .o object are apparently not the same. And I don't understand why...
Let's consider the following source code files:
// main.cpp
#include <iostream>
int main(int argc, char* argv[]) {
std::cout << "main" << std::endl;
}
// object.hpp
#include <iostream>
struct Object
{
Object() { std::cout << "Object constructor called" << std::endl; }
~Object() { std::cout << "Object destructor called" << std::endl; }
};
// object.cpp
#include "object.hpp"
static Object gObject;
Let's compile and link and run this code:
g++ -Wall object.cpp main.cpp -o main1
./main1
> Object constructor called
> main
> Object destructor called
The constructor and the destructor of the global gObject object are called.
Now let's create a static library from our code and use (link) it in another program:
g++ -Wall -c object.cpp main.cpp
ar rcs lib.a object.o
g++ -Wall -o main2 main.o lib.a
./main2
> main
gObject's constructor and destructor are not called... why?
How to have them automatically called?
Thanks.
.a static libraries contain several .o files, but an object file is only linked in if something in the main app references it.
Standalone .o files are always linked.
So .o files given directly to the linker always go in, referenced or not, whereas from a .a archive only the referenced object files are linked.
As a note, static global objects are not required to be initialized until you actually reference something in their compilation unit. Most compilers initialize all of them before main, but the only requirement is that they are initialized before any function of that compilation unit is executed.
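One way to act on this, sketched under the assumption that object.cpp may be modified, is to give the object file a function that the program references, so that the linker extracts it from the archive. The name object_instances and the counter are hypothetical; the counter stands in for the printing constructor of the question.

```cpp
// --- object.cpp (an object file inside the static library) ---
static int gConstructed = 0;        // zero-initialized before any constructor runs

struct Object {
    Object()  { ++gConstructed; }   // stands in for the printing constructor
    ~Object() { --gConstructed; }
};
static Object gObject;

// Hypothetical anchor function: a reference to it from the program leaves an
// undefined symbol that only this object file can satisfy, so the linker
// pulls the whole object file, gObject included, out of the archive.
int object_instances() { return gConstructed; }
```

main.cpp would then declare int object_instances(); and call it once. The alternative that needs no source change is the --whole-archive linker flag.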