initialisation of static object when linking against a static library - c++

What are the rules for initialisation of static object declared in another shared library? For instance, consider the following:
file X.hpp:
struct X {
X ();
static X const s_x;
};
struct Y {
Y (X const &) {}
};
file X.cpp:
#include "X.hpp"
#include <iostream>
X::X ()
{
std::cout << "side effect";
}
X const X::s_x;
I compiled X.cpp in a static library libX.a, and I tried to link the following executable against it (file main.cpp):
#include "X.hpp"
int main ()
{
(void)X::s_x; // (1)
X x = X::s_x; // (2)
Y y = X::s_x; // (3)
}
With only (1) or (2), nothing happens. But if I add (3), the static object is initialised (i.e. "side effect" is printed). (I use gcc 4.6.1.)
Is there any way to predict what will happen here?
I don't understand how the instruction (2) does not force the X::s_x object to be default-constructed, whereas (3) does.
EDIT: build commands:
g++ -c X.cpp
g++ -c main.cpp
ar rcs libX.a X.o
g++ -o test main.o -L. -lX

By default on many platforms, if your program doesn't reference any symbols from a given object file in a static library, the whole object file (including static initializers) will be dropped. So the linker is ignoring X.o in libX.a because it looks like it is unused.
There are a few solutions here:
Don't depend on the side-effects of static initializers. This is the most portable/simple solution.
Introduce some fake dependency on each file by referencing a dummy symbol in a way the compiler will not see through (like storing the address into an externally visible global).
Use some platform-specific trick to retain the objects in question. For example, on Linux you can use -Wl,-whole-archive a.o b.a -Wl,-no-whole-archive.
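The second option can be sketched roughly as follows. This is a single-file illustration only: the names libx::force_link_x, g_keep_x and x_initializer_runs are hypothetical, and in a real build the first half would live in X.cpp (inside libX.a) and the second half in main.cpp. Because main.o then refers to a symbol defined in X.o, the linker has to extract X.o from the archive, and its static initializers come along with it.

```cpp
#include <cassert>

// ---- would live in X.cpp, i.e. inside libX.a (hypothetical names) ----
namespace libx {
int init_count = 0;                    // zero-initialized before any constructor runs

struct Registrar {
    Registrar() { ++init_count; }      // stands in for the "side effect"
};
static Registrar s_registrar;          // the static initializer we want to keep

int force_link_x = 0;                  // dummy symbol with external linkage
}  // namespace libx

// ---- would live in main.cpp ----
namespace libx { extern int force_link_x; }   // normally declared in X.hpp

// Storing the address in an externally visible volatile pointer keeps the
// compiler from optimizing the reference away.
int* volatile g_keep_x = &libx::force_link_x;

// Returns how many times the static initializer ran.
int x_initializer_runs() {
    assert(g_keep_x != nullptr);       // the anchor must have been set up
    return libx::init_count;
}
```

The cost is that main.cpp (or some other file that is always linked) has to name one symbol per object file it wants to keep.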

Related

Why is non-empty initialized static member std::string empty (in shared lib)?

Following is a simplification of code from a larger project:
// foo.h
#ifndef FOO_H
#define FOO_H
#include <string>
class Foo
{
public:
Foo( const std::string& s = magic_ );
void func();
static const std::string magic_;
private:
std::string s_;
};
void func( const std::string& s = Foo::magic_ );
#endif
//foo.cpp
#include "foo.h"
#include <iostream>
const std::string Foo::magic_ = "please";
Foo::Foo( const std::string& s )
: s_( s )
{ }
void Foo::func() { std::cout << "[" << s_ << "]" << std::endl; }
void func( const std::string& s )
{
Foo( s ).func();
}
// main.cpp
#include "foo.h"
int main( int argc, char* argv[] )
{
func();
return 0;
}
I'll call the above an MCE, not an MCVE, because unfortunately I have not been able to reproduce the problem in a simplified form. I can only guess that this is due to quirks of static variables and shared linkage, though that may be an incorrect assessment stemming from my incomplete understanding of what is involved. I will try to explain the problem anyway. The following aspects of the above MCE are representative of the problem code:
A header file declares a class with a static member std::string.
The same header declares a function whose in-arg defaults to the class's static member variable.
The header's corresponding source file defines the static member and function.
The translation unit is compiled to a shared library.
The executable's object code is linked with the shared library.
Compilation/output:
$ g++ -O3 -c -fPIC -o ./foo.o ./foo.cpp
$ g++ -O3 -shared ./foo.o -o ./libfoo.so
$ g++ -O3 -c -fPIE -o ./main.o ./main.cpp
$ g++ -O3 ./main.o -o ./a.out -L./ -lfoo
$ LD_LIBRARY_PATH=$LD_LIBRARY_PATH: ./a.out
[please]
Though not demonstrated above, is it possible that at run time the output is "[]"? That is, might Foo::magic_ be empty at the time it is used as the default value of the in-arg of void func()?
This is hard to articulate in the absence of a demonstrable problem with the above MCE, but assuming it is truly representative of the problem code, I observe (by stdout due to gdb absence in the real build/test environment) that in void func(), in-arg s is an empty string - can anyone account for/explain why this could be?
I know initialization of static variables is not guaranteed across translation units - is that possibly involved here? (It seems like it might explain why the problem is not reproducible in an attempted simplification)
Again, apologies for the lack of a MCVE - I tried my best to create one.

Singleton across compilation units: linking library vs linking objects

I apologize if the title is not fully self-explanatory. I'm trying to understand why my singleton factory pattern is not working properly, and I ran into a bizarre difference when using library vs linking single objects files.
Here's a simplified version of the code:
main.cpp
#include <iostream>
#include "bar.hpp"
int main (int /*argc*/, char** /*argv*/)
{
A::get().print();
return 0;
}
bar.hpp
#ifndef BAR_HPP
#define BAR_HPP
#include <iostream>
class A
{
public:
static A& get ()
{
static A a;
return a;
}
bool set(const int i)
{
m_i = i;
print();
return true;
}
void print ()
{
std::cout << "print: " << m_i << "\n";
}
private:
int m_i;
A () : m_i(0) {}
};
#endif // BAR_HPP
baz.hpp
#ifndef BAZ_HPP
#define BAZ_HPP
#include "bar.hpp"
namespace
{
static bool check = A::get().set(2);
}
#endif // BAZ_HPP
baz.cpp
#include "baz.hpp"
Now, I build my "project" in two ways:
Makefile:
all:
g++ -std=c++11 -c baz.cpp
g++ -std=c++11 -o test main.cpp baz.o
lib:
g++ -std=c++11 -c baz.cpp
ar rvs mylib.a baz.o
g++ -std=c++11 -o test main.cpp mylib.a
Here are the outputs I get:
$ make all
$ ./test
print: 2
print: 2
$ make lib
$ ./test
print: 0
In the first case the call to A::get().set(2) in baz.hpp takes place, and the same instantiation of A is then used in the main function, which therefore prints 2. In the second case, the call to A::get().set(2) in baz.hpp never takes place, and in the main function the value set by the constructor (that is, 0) is printed.
So finally I can ask my question: why is the behavior different in the two cases? I would expect that either both print 0 once or print 2 twice. I always assumed that a library was just a compact way to ship object files, and that the behavior of linking mylib.a should be the same as that of linking baz.o directly. Why isn't that the case?
Edit: the reason, as explained by Richard, is that no symbols defined in baz.cpp are required in main.cpp, so baz.o is not extracted from the library and linked. This raises another question: is there a workaround to ensure that the instruction A::get().set(2) is executed? I would like to avoid making the singleton a global object, but I'm not sure it's possible. I would also like to avoid including baz.hpp in main.cpp, since there may be many bazxyz.hpp files, and that would require main.cpp to know all of them in advance, defeating the whole purpose of the factory-like registration process...
If this is to be a static library, then some module somewhere is going to have to address something in each implementation file of the objects that are going to register themselves with the factory.
A reasonable place for this would be in bar.cpp (which is a file you don't yet have). It would contain some or all of the implementation of A, plus some means of calling the registration functions of the widgets you're going to create.
Self-discovery only works if the object files are linked into the executable. This gives the C++ startup sequence a chance to know about and construct all objects with global linkage.
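One way to read this suggestion is an explicit registration hook: each implementation file exports a small function, and a central file (the bar.cpp mentioned above) calls all of them. The sketch below collapses this into a single file. The names register_baz and register_all_widgets are made up, and value() is an accessor added purely for illustration; none of them appear in the question.

```cpp
#include <cassert>
#include <iostream>

// ---- bar.hpp: the singleton from the question, plus a value() accessor ----
class A {
public:
    static A& get() {
        static A a;                    // constructed on first call
        return a;
    }
    bool set(int i) {
        m_i = i;
        print();
        return true;
    }
    void print() const { std::cout << "print: " << m_i << "\n"; }
    int value() const { return m_i; }  // added for illustration
private:
    A() : m_i(0) {}
    int m_i;
};

// ---- baz.cpp: registration moved out of a namespace-scope initializer ----
bool register_baz() {                  // hypothetical name
    return A::get().set(2);
}

// ---- bar.cpp: the one place that knows every registration function ----
void register_all_widgets() {          // hypothetical name
    const bool ok = register_baz();
    assert(ok);                        // set() always reports success here
    (void)ok;                          // avoid an unused warning under NDEBUG
}
```

main() would call register_all_widgets() once before using A::get(). Since bar.o then references register_baz, the linker extracts baz.o from mylib.a, and main.cpp still does not need to know about the individual baz files.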

Is there any linker flag that tells to defer the loading of dynamic library after dlopen [duplicate]

This question already has answers here:
C++ static initialization order
(6 answers)
Closed 6 years ago.
I have following code
File hello.cc
#include "hello.h"
static A dummyl;
A:: A() {
fun();
}
void A::fun() {
int y = 10;
int z = 20;
int x = y + z;
}
File hello.h
class A {
public:
A();
void fun();
};
File main.cc
#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>
#include "hello.h"
typedef void (*pf)();
int main() {
void *lib;
pf greet;
const char * err;
printf("\n Before dlopen\n");
lib = dlopen("libhello.so", RTLD_NOW | RTLD_GLOBAL);
if (!lib) {
exit(1);
}
A *a = new A ;
a->fun();
dlerror(); /*first clear any previous error; redundant in this case but a useful habit*/
dlclose(lib);
return 0;
}
Build phases:
g++ -fPIC -c hello.cc
g++ -shared -o libhello.so hello.o
g++ -o myprog main.cc -ldl -L. -lhello
Since my shared library in the real case is libQtCore.so, I need to link against it with -lQtCore, because otherwise I cannot use its symbols directly; and since libQtCore.so has so many functions, it is not practical to call dlsym for each of them.
Since I link against it, my static global variables get initialized before dlopen. Is there any g++ linker flag that tells the linker to load the shared library only after dlopen?
Is there any g++ linker flag that tells the linker to load the shared library only after dlopen?
Yes. On Windows. It's known as delay loading. It doesn't exist on Linux/OS X. It's possible to implement it, but it'd require modifications to binutils and ld.so. See this article for some background on that.
But you don't need to care about any of it. Really.
You are facing a well known problem. So well known that it even has a name: The Static Initialization Order Fiasco.
The solution to all your woes is trivial: don't use static global variables the way you do.
To fix it, use Q_GLOBAL_STATIC, or reimplement something like it yourself. That way the value will be constructed at the time it's first used, not prematurely. That's all there is to it.
Side note: Your recent questions suffer badly from being an XY Problem. You're trying to come up with all sorts of Rube Goldberg-esque solutions to a rather simple issue that it took you a week+ to finally divulge. Ask not about a potential solution, but about the underlying issue you're attempting to solve. All the library loading stuff is completely and utterly unrelated to your problem, and you don't need to concern yourself with it at all.
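Reimplementing "something like" Q_GLOBAL_STATIC can be as small as a function-local static. The following is a minimal sketch, not Qt's actual implementation. The names dummy() and g_constructions are hypothetical, and fun() has been changed to return its result so that there is something to observe. Since C++11, a function-local static is initialized the first time control passes through its declaration, and that initialization is thread-safe.

```cpp
#include <cassert>

int g_constructions = 0;               // counts constructor calls (illustration only)

class A {
public:
    A() { ++g_constructions; fun(); }
    int fun() const {                  // returns x instead of discarding it
        int y = 10;
        int z = 20;
        return y + z;
    }
};

// Replaces `static A dummyl;` at namespace scope: nothing is constructed
// when the library is loaded, only when dummy() is first called.
A& dummy() {
    static A instance;
    assert(g_constructions >= 1);      // the instance exists by this point
    return instance;
}
```

Code that used dummyl directly would call dummy() instead, so the time at which the library is loaded no longer determines when the constructor runs.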

Template class with static members across multiple DLL/DSO

There were many questions about C++ template classes which contain static member variables, as well as about exporting them from dynamic libraries or shared objects. But this one is a bit deeper: what to do if there are multiple shared objects, each of them having its own set of instantiations but possibly using instantiations from another shared object?
Consider the following example code:
/* file: common.h */
#include <stdio.h>
#define PRINT fprintf (stderr, "(template %d) %d -> %d\n", parameter, data, new_data)
template <int parameter>
class SharedClass
{
static int data;
public:
static void Set(int new_data) { PRINT; data = new_data; }
};
template <int parameter>
int SharedClass<parameter>::data = parameter;
/* file: library1.h */
extern template class SharedClass<1>;
void Library1Function();
/* file: library1.cpp */
#include "common.h"
#include "library1.h"
#include "library2.h"
template class SharedClass<1>;
void Library1Function()
{
SharedClass<1>::Set (100);
SharedClass<2>::Set (200);
}
/* file: library2.h */
extern template class SharedClass<2>;
void Library2Function();
/* file: library2.cpp */
#include "common.h"
#include "library1.h"
#include "library2.h"
template class SharedClass<2>;
void Library2Function()
{
SharedClass<1>::Set (1000);
SharedClass<2>::Set (2000);
}
/* file: main.cpp */
#include "common.h"
#include "library1.h"
#include "library2.h"
int main()
{
Library1Function();
Library2Function();
SharedClass<1>::Set (-1);
SharedClass<2>::Set (-2);
}
Let's then assume we build the two libraries and an application using GCC:
$ g++ -fPIC -fvisibility=default -shared library1.cpp -o lib1.so
$ g++ -fPIC -fvisibility=default -shared library2.cpp -o lib2.so
$ g++ -fvisibility=default main.cpp -o main -Wl,-rpath=. -L. -l1 -l2
And then run the executable, we'll get the following result:
$ ./main
(template 1) 1 -> 100
(template 2) 2 -> 200
(template 1) 100 -> 1000
(template 2) 200 -> 2000
(template 1) 1000 -> -1
(template 2) 2000 -> -2
Which means that both libraries and the executable access the same per-template static storage.
If we run "nm -C" on the binaries, we'll see that each static member is defined only once and in the corresponding library:
$ nm -C -A *.so main | grep ::data
lib1.so:0000000000001c30 u SharedClass<1>::data
lib2.so:0000000000001c30 u SharedClass<2>::data
But I've got some questions.
Why is it that, if we remove the extern template class ... declarations from both headers, the static members are present in each binary, yet the test application continues to work properly?
$ nm -C -A *.so main | grep ::data
lib1.so:0000000000001c90 u SharedClass<1>::data
lib1.so:0000000000001c94 u SharedClass<2>::data
lib2.so:0000000000001c94 u SharedClass<1>::data
lib2.so:0000000000001c90 u SharedClass<2>::data
main:0000000000401e48 u SharedClass<1>::data
main:0000000000401e4c u SharedClass<2>::data
Is it possible to build this under MSVC?
Or, more specifically, how to deal with __declspec(dllexport) and __declspec(dllimport) to make some instantiations exported, and some - imported?
And, finally: is this an example of undefined behavior?
To answer point 1: When the dynamic linker resolves symbols, it uses a list of modules to link against. The first module loaded is checked first, then the second, and so on.
IIRC, when the data member is used in main, lib1.so, and lib2.so, this is still treated as a dynamic symbol reference, even though the member is declared in the same module. So when the linker goes to resolve the symbols when you run the program, all three modules wind up using the data member implementation in just one of the three modules: whichever was loaded first. The other two pairs are still loaded into memory, but are unused.
(Try std::cout << &(SharedClass<n>::data) << std::endl in all three modules; the address printed should be the same for all six cases.)
To answer point 3, I don't believe this behavior is undefined at all. What happens exactly depends on your system's dynamic linker, but I don't know of any linker that wouldn't handle this situation in exactly the same way.
I can't speak to point 2 since I don't have a whole lot of experience with MSVC.
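The address check suggested above needs a public accessor, because data is private in the question's code. Here is a minimal sketch, assuming a hypothetical Address() member added to SharedClass and hypothetical per-module functions. In the real experiment each ...Address function would be compiled into a different module (lib1.so, lib2.so, main); in this single-file version they trivially agree.

```cpp
#include <cassert>

template <int parameter>
class SharedClass {
    static int data;
public:
    // Hypothetical accessor, added so the address can be compared.
    static const void* Address() { return &data; }
};

template <int parameter>
int SharedClass<parameter>::data = parameter;

// Each of these would live in a different module in the real experiment.
const void* Library1Address() { return SharedClass<1>::Address(); }
const void* Library2Address() { return SharedClass<1>::Address(); }

// True if both "modules" resolved SharedClass<1>::data to the same object.
bool SameStorage() {
    const void* a = Library1Address();
    const void* b = Library2Address();
    assert(a != nullptr && b != nullptr);
    return a == b;
}
```

If the explanation in this answer holds, SameStorage() should also return true when the two functions come from different shared objects built as in the question.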

Static initialization and destruction of a static library's globals not happening with g++

Until some time ago, I thought a .a static library was just a collection of .o object files: an archive that does not change how they are handled. But linking with a .o object and linking with a .a static library containing this .o object are apparently not the same. And I don't understand why...
Let's consider the following source code files:
// main.cpp
#include <iostream>
int main(int argc, char* argv[]) {
std::cout << "main" << std::endl;
}
// object.hpp
#include <iostream>
struct Object
{
Object() { std::cout << "Object constructor called" << std::endl; }
~Object() { std::cout << "Object destructor called" << std::endl; }
};
// object.cpp
#include "object.hpp"
static Object gObject;
Let's compile and link and run this code:
g++ -Wall object.cpp main.cpp -o main1
./main1
> Object constructor called
> main
> Object destructor called
The constructor and the destructor of the global gObject object are called.
Now let's create a static library from our code and use (link) it in another program:
g++ -Wall -c object.cpp main.cpp
ar rcs lib.a object.o
g++ -Wall -o main2 main.o lib.a
./main2
> main
gObject's constructor and destructor are not called... why?
How to have them automatically called?
Thanks.
A .a static library contains several .o files, but an individual .o is linked in only if something already in the link references one of its symbols.
Standalone .o files passed to the linker are always linked.
So .o files given directly to the linker always end up in the executable, referenced or not, whereas from a .a file only the referenced .o files are linked.
As a note, static global objects are not required to be initialized until something in their compilation unit is actually used. Most compilers initialize all of them before main, but the only requirement is that they are initialized before any function or object defined in that compilation unit is first used.