Xcode does not share static variable with shared library - c++

I've been working on a couple established C++ projects that use static variables from a shared library to store parameters. When compiled with g++ or clang++, the static variable is shared (has the same memory location) throughout the entire program. However, when compiled with Xcode, the main function static variable has a different memory location than the shared library static variable. Is there a way to get Xcode to compile/run the code the same as g++ or clang++, while still being able to debug with Xcode?
Please see example below:
main.cpp:
#include <iostream>
#include "Params.hpp"
int main(int argc, const char * argv[]) {
Params param = Params();
param.addParams();
std::vector<int> vi = Params::ParamsObj();
vi.push_back(10);
for(std::vector<int>::iterator it = vi.begin(); it != vi.end(); ++it) {
std::cout << "i = " << *it << std::endl;
}
return 0;
}
Params.hpp:
#ifndef Params_hpp
#define Params_hpp
#include <vector>
class Params{
typedef std::vector<int> ParamVector;
public:
static ParamVector& ParamsObj() {
static ParamVector m;
return m;
}
void addParams();
};
#endif /* Params_hpp */
Params.cpp:
#include "Params.hpp"
void Params::addParams(){
Params::ParamsObj().push_back(5);
}
Makefile:
clang:
clang++ -dynamiclib Params.cpp -o libshared_clang.dylib
clang++ main.cpp -o main_clang ./libshared_clang.dylib
gpp:
g++-mp-4.9 -Wall -shared -fPIC -o libshared_gpp.so Params.cpp
g++-mp-4.9 -Wall -o main_gpp main.cpp ./libshared_gpp.so
Output from both g++ and clang++ is:
i = 5
i = 10
While Xcode only outputs i = 10.
If I don't use a shared library and compile everything into one binary, Xcode will properly output both print statements.
My current solution is to add the project's main function into its own shared library and then create an Xcode specific file which merely calls the main function in the newly created shared library. However, I was hoping for a solution that didn't require changing the underlying project's code.

I'm pretty sure that if you turn on optimization for gcc/clang (which you did not in your example), they will produce the same behavior as your build with Xcode (which isn't a compiler, but an IDE).
Your problem is that the ParamsObj() function is inline (defining it in the class body adds an implicit inline keyword to it), allowing the compiler to just "paste" it into the main method instead of calling it.
With dll boundaries, this might result in the allocation of multiple static variables, if the function is used in multiple libraries (in your case, it's used in the dll, and inlined into the main executable).
Refactor the ParamsObj() method into a declaration and a separate definition in the corresponding C++ file, and you'll get the same behavior everywhere, printing both numbers.
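As a rough sketch of that refactoring, using the file names from the question: the header keeps only a declaration, and the single definition moves into Params.cpp, so the function-local static exists in exactly one place (the shared library). The two files are shown together in one block, and demo() is a hypothetical stand-in for the caller in main.cpp.

```cpp
#include <cassert>
#include <vector>

// ---- Params.hpp ----
class Params {
    typedef std::vector<int> ParamVector;
public:
    // Declaration only: with no body here, the function is not implicitly inline.
    static ParamVector& ParamsObj();
    void addParams();
};

// ---- Params.cpp (compiled into the shared library) ----
// The only definition of the function, and therefore of the static `m`.
Params::ParamVector& Params::ParamsObj() {
    static ParamVector m;
    return m;
}

void Params::addParams() {
    Params::ParamsObj().push_back(5);
}

// ---- caller side (hypothetical stand-in for main.cpp) ----
void demo() {
    Params param;
    param.addParams();
    std::vector<int>& shared = Params::ParamsObj();
    // Both calls reach the one definition above, so the 5 is visible here.
    assert(!shared.empty() && shared.back() == 5);
}
```

A single translation unit cannot reproduce the cross-library duplication itself; the block only shows the shape of the change.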

Related

How is the order of shared library constructor/destructor and global object constructor/destructor specified?

I have a C++ program that has a global variable. The constructor and destructor of this global variable call functions from a shared library that has a constructor and destructor defined.
I am observing different orderings of function calls depending on whether I run this program on Linux or MacOS.
When running the program, I would have expected the following calling sequence:
shared library constructor
shared library function called from global object constructor
...
shared library function called from global object destructor
shared library destructor
And indeed, this is what I get on Linux. However, when running on MacOS, the two last lines are swapped. That is, the shared library destructor is called before the global object is destroyed. As a consequence, the destructor of the global object calls a function in the shared library that was already destroyed.
Here is example code to reproduce the differing behavior:
Shared library implementation (in C): shlib.c
#include <stdio.h>
static int initialized = 0; /* Is library initialized? */
/** Shared library constructor. */
__attribute__((constructor)) void shlib_init(void)
{
printf("Constructor\n");
fflush(stdout);
initialized = 1;
}
/** Shared library destructor. */
__attribute__((destructor)) void shlib_fini(void)
{
printf("Destructor[%d]\n", initialized);
fflush(stdout);
initialized = 0;
}
/** Shared library function that creates something. */
void shlib_create_function(void)
{
printf("create function[%d]\n", initialized);
fflush(stdout);
}
/** Undo shlib_create_function(). */
void shlib_destroy_function(void)
{
printf("destroy function[%d]\n", initialized);
fflush(stdout);
}
Shared library header:
#ifndef SHLIB_H
#define SHLIB_H 1
#ifdef __cplusplus
extern "C" {
#endif
void shlib_create_function(void);
void shlib_destroy_function(void);
#ifdef __cplusplus
}
#endif
#endif /* !SHLIB_H */
Main program (in C++) app.cpp:
#include "shlib.h"
#include <iostream>
class Object {
public:
Object() { shlib_create_function(); }
~Object() { shlib_destroy_function(); }
};
Object global_object;
int
main(void)
{
std::cout << "BEGIN main()" << std::endl << std::flush;
std::cout << "END main()" << std::endl << std::flush;
return 0;
}
Makefile
# Select suffix for shared library (.so on Linux, .dylib on Mac)
ifneq ($(linux),)
so = so
else
so = dylib
endif
# Default compiler is clang.
ifeq ($(COMPILER),)
COMPILER = clang
endif
# Setup compiler.
ifeq ($(COMPILER),gcc)
CC = gcc
CXX = g++
DYLIB = libshlib.$(so)
DYLD = gcc -shared
endif
ifeq ($(COMPILER),icc)
CC = icc
CXX = icc
DYLIB = libshlib.$(so)
DYLD = icc -shared
endif
ifeq ($(COMPILER),clang)
CC = clang
CXX = clang++
DYLIB = libshlib.$(so)
DYLD = clang -shared
endif
.PHONY: clean
app: app.cpp shlib.h $(DYLIB)
$(CXX) -o app app.cpp -L. -lshlib
shlib.o: shlib.c shlib.h
$(CC) -c -fPIC -o shlib.o shlib.c
$(DYLIB): shlib.o
$(DYLD) -o $(DYLIB) shlib.o
clean:
rm -f $(DYLIB) shlib.o app
Running on Ubuntu 20.04.02 LTS with gcc 9.3.0 I get this output (which is what I would expect):
Constructor
create function[1]
BEGIN main()
END main()
destroy function[1]
Destructor[1]
Running instead on macOS 10.13.6 (17G14042) (Kernel Version: Darwin 17.7.0) with clang (LLVM version 9.1.0 (clang-902.0.39.2)) I get this unexpected output instead:
Constructor
create function[1]
BEGIN main()
END main()
Destructor[1]
destroy function[0]
As can be seen, the global variable destructor invokes a function on the shared library that has already been destroyed.
So how is the order of these destructors (shared library and global variable) defined? Why are they not executed in opposite order of the respective constructors on MacOS? Is there a way I can force the shared library destructor to run after global object destructors?
I tried using __attribute__((destructor(65535))) to give the shared library destructor the smallest possible priority, but that did not help.
The best solution would of course be to get rid of the global variable. However, I am dealing with legacy code for which this is currently not an option.
Edit: Thread safety is not an issue here. Once I have a reliable ordering of constructors/destructors, I can take care of thread-safety without problem.
Edit: I just tried on a machine that has MacOS version 10.15.7 and there the order of function calls is the same as for Linux. So this may be a problem that depends on the MacOS version.

How can I force inclusion of a symbol from a static library using gcc or clang in code?

I have a global in a static library which registers and deregisters from a registrar in its ctor/dtor, however without passing -u symbol or --whole-archive it never gets called.
With msvc you can force inclusion of a symbol using #pragma comment(linker, "/include:__mySymbol") in code to fix this. Is there anything I can do, in code, in gcc and/or clang to do the same thing?
It's very easy to reproduce this:
Executable:
// main.cpp
int main() {}
Static Library depended on by executable:
// test.cpp
#include <iostream>
struct Test {
Test() { std::cout << "Test()\n"; }
~Test() { std::cout << "~Test()\n"; }
};
Test test;
If the library is built as a shared library instead of a static one, the program prints. If the static library is linked with --whole-archive, the program prints. But I want to control this in code, not with suboptimal compile flags (or, for -u symbol, compile flags dependent on symbol names in code).

Singleton across compilation units: linking library vs linking objects

I apologize if the title is not fully self-explanatory. I'm trying to understand why my singleton factory pattern is not working properly, and I ran into a bizarre difference when using library vs linking single objects files.
Here's a simplified version of the code:
main.cpp
#include <iostream>
#include "bar.hpp"
int main (int /*argc*/, char** /*argv*/)
{
A::get().print();
return 0;
}
bar.hpp
#ifndef BAR_HPP
#define BAR_HPP
#include <iostream>
class A
{
public:
static A& get ()
{
static A a;
return a;
}
bool set(const int i)
{
m_i = i;
print();
return true;
}
void print ()
{
std::cout << "print: " << m_i << "\n";
}
private:
int m_i;
A () : m_i(0) {}
};
#endif // BAR_HPP
baz.hpp
#ifndef BAZ_HPP
#define BAZ_HPP
#include "bar.hpp"
namespace
{
static bool check = A::get().set(2);
}
#endif // BAZ_HPP
baz.cpp
#include "baz.hpp"
Now, I build my "project" in two ways:
Makefile:
all:
g++ -std=c++11 -c baz.cpp
g++ -std=c++11 -o test main.cpp baz.o
lib:
g++ -std=c++11 -c baz.cpp
ar rvs mylib.a baz.o
g++ -std=c++11 -o test main.cpp mylib.a
Here are the outputs I get:
$ make all
$ ./test
print: 2
print: 2
$ make lib
$ ./test
print: 0
In the first case the call to A::get().set(2) in baz.hpp takes place, and the same instantiation of A is then used in the main function, which therefore prints 2. In the second case, the call to A::get().set(2) in baz.hpp never takes place, and in the main function the value set by the constructor (that is, 0) is printed.
So finally I can ask my question: why is the behavior different in the two cases? I would expect that either both print 0 once or print 2 twice. I always assumed that a library was just a compact way to ship object files, and that the behavior of linking mylib.a should be the same as that of linking baz.o directly. Why isn't that the case?
Edit: the reason, as explained by Richard, is that no symbols defined in baz.cpp are required by main.cpp, so baz.o is not extracted from the library and linked. This raises another question: is there a workaround to ensure that the instruction A::get().set(2) is executed? I would like to avoid making the singleton a global object, but I'm not sure that's possible. I would also like to avoid including baz.hpp in main.cpp, since there may be many bazxyz.hpp files and that would require main.cpp to know all of them in advance, defeating the whole purpose of the factory-like registration process...
If this is to be a static library, then some module somewhere is going to have to address something in each implementation file of the objects that are going to register themselves with the factory.
A reasonable place for this would be in bar.cpp (which is a file you don't yet have). It would contain some or all of the implementation of A plus some means of calling the registration functions of the widgets you're going to create.
Self-discovery only works if the object files are linked into the executable. This gives the c++ startup sequence a chance to know about and construct all objects with global linkage.
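One way to read that advice, sketched with hypothetical names (register_baz, register_all and run are not in the question): each implementation file exposes a plain registration function, and bar.cpp, which main.cpp already needs for A::get(), calls them all. Because bar.o then references a symbol in every baz-like object file, the linker has to pull those files out of the archive. The files are shown together in one block.

```cpp
#include <cassert>
#include <iostream>

// ---- bar.hpp ----
class A {
public:
    static A& get();                 // defined in bar.cpp, not inline
    void set(int i) { m_i = i; }
    int value() const { return m_i; }
private:
    A() : m_i(0) {}
    int m_i;
};

// ---- baz.cpp ----
// Hypothetical registration hook exported by this implementation file.
// It receives the instance instead of calling A::get(), so get() is
// never re-entered while its statics are still being initialized.
void register_baz(A& a) { a.set(2); }

// ---- bar.cpp ----
void register_baz(A& a);             // one declaration per implementation file

// Calls every hook; new bazxyz files add one line here.
static void register_all(A& a) {
    register_baz(a);
}

A& A::get() {
    static A a;
    // Runs exactly once, right after `a` has been constructed.
    static const bool registered = (register_all(a), true);
    (void)registered;
    return a;
}

// ---- main.cpp side ----
int run() {
    // baz's registration has already run by the time get() returns.
    assert(A::get().value() == 2);
    std::cout << "print: " << A::get().value() << "\n";
    return 0;
}
```

The cost is that bar.cpp has to list every registration function, which is the trade-off the answer describes for static libraries.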

Is there any linker flag that tells to defer the loading of dynamic library after dlopen [duplicate]

I have following code
File hello.cc
#include "hello.h"
static A dummyl;
A:: A() {
fun();
}
void A::fun() {
int y = 10;
int z = 20;
int x = y + z;
}
File hello.h
class A {
public:
A();
void fun();
};
File main.cc
#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>
#include "hello.h"
typedef void (*pf)();
int main() {
void *lib;
pf greet;
const char * err;
printf("\n Before dlopen\n");
lib = dlopen("libhello.so", RTLD_NOW | RTLD_GLOBAL);
if (!lib) {
exit(1);
}
A *a = new A ;
a->fun();
dlerror(); /*first clear any previous error; redundant in this case but a useful habit*/
dlclose(lib);
return 0;
}
Build phases:
g++ -fPIC -c hello.cc
g++ -shared -o libhello.so hello.o
g++ -o myprog main.cc -ldl -L. -lhello
In my real case the shared library is libQtCore.so. I need to link against it with -lQtCore because otherwise I cannot use its symbols directly, and since libQtCore has so many functions, it is not practical to use dlsym for each of them.
Since I link against it, my static global variables get initialized before dlopen. Is there any g++ linker flag that tells the linker to load the shared library only after dlopen?
Is there any g++ linker flag that tells the linker to load the shared library only after dlopen?
Yes. On Windows. It's known as delay loading. It doesn't exist on Linux/OS X. It's possible to implement it, but it'd require modifications to binutils and ld.so. See this article for some background on that.
But you don't need to care about any of it. Really.
You are facing a well known problem. So well known that it even has a name: The Static Initialization Order Fiasco.
The solution to all your woes is trivial: don't use static global variables the way you do.
To fix it, use Q_GLOBAL_STATIC, or reimplement something like it yourself. That way the value will be constructed at the time it's first used, not prematurely. That's all there's to it.
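A minimal sketch of the "reimplement something like it yourself" route, without Qt: wrap the object in a function with a local static, so it is constructed the first time the accessor is called rather than when the library is loaded. The accessor name dummy(), the counter and demo() are made up for illustration, and Q_GLOBAL_STATIC itself offers more than this.

```cpp
#include <cassert>

static int constructed = 0;  // illustration only: counts constructor runs

class A {
public:
    A() { ++constructed; fun(); }
    int fun() {
        int y = 10;
        int z = 20;
        return y + z;
    }
};

// Replaces `static A dummyl;`. Nothing is constructed at load time;
// the object is built on the first call (thread-safe since C++11).
A& dummy() {
    static A instance;
    return instance;
}

// Example caller: the constructor runs during the first dummy() call only.
int demo() {
    int result = dummy().fun();
    dummy();                     // second call reuses the same object
    assert(constructed == 1);
    return result;
}
```

Code that used dummyl directly would call dummy() instead, so the order of initialization follows the order of use.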
Side note: Your recent questions suffer badly from being an XY Problem. You're trying to come up with all sorts of Rube Goldberg-esque solutions to a rather simple issue that it took you a week+ to finally divulge. Ask not about a potential solution, but about the underlying issue you're attempting to solve. All the library loading stuff is completely and utterly unrelated to your problem, and you don't need to concern yourself with it at all.

Dynamic loaded libraries and shared global symbols

Since I observed some strange behavior of global variables in my dynamically loaded libraries, I wrote the following test.
At first we need a statically linked library: The header base.hpp
#ifndef __BASE_HPP
#define __BASE_HPP
#include <iostream>
class test {
private:
int value;
public:
test(int value) : value(value) {
std::cout << "test::test(int) : value = " << value << std::endl;
}
~test() {
std::cout << "test::~test() : value = " << value << std::endl;
}
int get_value() const { return value; }
void set_value(int new_value) { value = new_value; }
};
extern test global_test;
#endif // __BASE_HPP
and the source base.cpp
#include "base.hpp"
test global_test = test(1);
Then I wrote a dynamically loaded library: library.cpp
#include "base.hpp"
extern "C" {
test* get_global_test() { return &global_test; }
}
and a client program loading this library: client.cpp
#include <iostream>
#include <dlfcn.h>
#include "base.hpp"
typedef test* get_global_test_t();
int main() {
global_test.set_value(2); // global_test from libbase.a
std::cout << "client: " << global_test.get_value() << std::endl;
void* handle = dlopen("./liblibrary.so", RTLD_LAZY);
if (handle == NULL) {
std::cout << dlerror() << std::endl;
return 1;
}
get_global_test_t* get_global_test = NULL;
void* func = dlsym(handle, "get_global_test");
if (func == NULL) {
std::cout << dlerror() << std::endl;
return 1;
} else get_global_test = reinterpret_cast<get_global_test_t*>(func);
test* t = get_global_test(); // global_test from liblibrary.so
std::cout << "liblibrary.so: " << t->get_value() << std::endl;
std::cout << "client: " << global_test.get_value() << std::endl;
dlclose(handle);
return 0;
}
Now I compile the statically linked library with
g++ -Wall -g -c base.cpp
ar rcs libbase.a base.o
the dynamically loaded library
g++ -Wall -g -fPIC -shared library.cpp libbase.a -o liblibrary.so
and the client
g++ -Wall -g -ldl client.cpp libbase.a -o client
Now I observe: The client and the dynamically loaded library possess a different version of the variable global_test. But in my project I'm using cmake. The build script looks like this:
CMAKE_MINIMUM_REQUIRED(VERSION 2.6)
PROJECT(globaltest)
ADD_LIBRARY(base STATIC base.cpp)
ADD_LIBRARY(library MODULE library.cpp)
TARGET_LINK_LIBRARIES(library base)
ADD_EXECUTABLE(client client.cpp)
TARGET_LINK_LIBRARIES(client base dl)
Analyzing the generated makefiles, I found that cmake builds the client with
g++ -Wall -g -ldl -rdynamic client.cpp libbase.a -o client
This ends up in a slightly different but fatal behavior: The global_test of the client and the dynamically loaded library are the same but will be destroyed two times at the end of the program.
Am I using cmake in a wrong way? Is it possible that the client and the dynamically loaded library use the same global_test but without this double destruction problem?
g++ -Wall -g -ldl -rdynamic client.cpp libbase.a -o client
CMake adds the -rdynamic option, which allows a loaded library to resolve symbols in the loading executable... So you can see that this is what you don't want. Without this option, the library just misses this symbol by accident.
But... You should not do anything like that. Your libraries and executable should not share symbols unless they really should be shared.
Always think of dynamic linking as static linking.
If you use shared libraries, you must mark the things you want to export with a macro like here. See the DLL_PUBLIC macro definition in there.
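The macro the answer points to is not reproduced in the text. A commonly used shape for such an export macro is sketched below as an assumption about what was meant. BUILDING_DLL, library_answer, internal_helper and client_demo are hypothetical names; BUILDING_DLL is defined inline here only to keep the block self-contained and would normally be passed on the compiler command line when building the library.

```cpp
#include <cassert>

#define BUILDING_DLL  // assumption: normally -DBUILDING_DLL for library sources

#if defined(_WIN32) || defined(__CYGWIN__)
  #ifdef BUILDING_DLL
    #define DLL_PUBLIC __declspec(dllexport)
  #else
    #define DLL_PUBLIC __declspec(dllimport)
  #endif
  #define DLL_LOCAL
#elif defined(__GNUC__) && __GNUC__ >= 4
  #define DLL_PUBLIC __attribute__((visibility("default")))
  #define DLL_LOCAL  __attribute__((visibility("hidden")))
#else
  #define DLL_PUBLIC
  #define DLL_LOCAL
#endif

// Exported: part of the library's interface.
extern "C" DLL_PUBLIC int library_answer();

// Not exported: internal helper, invisible to other modules.
DLL_LOCAL int internal_helper() { return 41; }

extern "C" DLL_PUBLIC int library_answer() { return internal_helper() + 1; }

// Client side: only the exported function is meant to be called.
int client_demo() {
    int value = library_answer();
    assert(value == 42);
    return value;
}
```

With GCC or clang this is usually paired with -fvisibility=hidden, so that everything not marked DLL_PUBLIC stays private to the library.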
By default, the linker won't combine a global variable (a 'D') in the base executable with one in a shared library. The base executable is special. There might be an obscure way to do this with one of those obscure control files that ld reads, but I sort of doubt it.
--export-dynamic will cause a.out 'D' symbols to be available to shared libs.
However, consider the process. Step 1: you create a DSO from a .o with a 'U' and a .a with a 'D'. So, the linker incorporates the symbol in the DSO. Step 2, you create the executable with a 'U' in one of the .o files, and 'D' in both a .a and the DSO. It will try to resolve using the left-to-right rule.
Variables, as opposed to functions, pose certain difficulties for the linker across modules in any case. A better practice is to avoid global var references across module boundaries, and use function calls. However, that would still fail for you if you put the same function in both the base executable and a shared lib.
My first question is if there is any particular reason for which you both statically and dynamically (via dlopen) link the same code?
For your problem: -rdynamic will export the symbols from your program and what probably is happening is that dynamic linker resolves all references to your global variable to the first symbol it encounters in symbol tables. Which one is that I don't know.
EDIT: given your purpose I would link your program that way:
g++ -Wall -g -ldl client.cpp -llibrary -L. -o client
You may need to fix the order.
I would advise to use a dlopen(... RTLD_LAZY|RTLD_GLOBAL); to merge global symbol tables.
I would suggest compiling any .a static library that you plan to link into a dynamic library with the -fvisibility=hidden option, so:
g++ -Wall -fvisibility=hidden -g -c base.cpp