Static variable is initialized twice - c++

Consider I have a static variable in a compilation unit which ends up in a static library libA. I then have another compilation unit accessing this variable which ends up in a shared library libB.so (so libA must be linked into libB). Finally I have a main function also accessing the static variable from A directly and having a dependency to libB (so I link against libA and libB).
I then observe that the static variable is initialized twice, i.e. its constructor runs twice! This doesn't seem right. Shouldn't the linker recognize both variables as the same and merge them into one?
To make my confusion perfect, I see it is run twice with the same address! So maybe the linker did recognize it, but did not remove the second call in the static_initialization_and_destruction code?
Here's a showcase:
ClassA.hpp:
#ifndef CLASSA_HPP
#define CLASSA_HPP
class ClassA
{
public:
ClassA();
~ClassA();
static ClassA staticA;
void test();
};
#endif // CLASSA_HPP
ClassA.cpp:
#include <cstdio>
#include "ClassA.hpp"
ClassA ClassA::staticA;
ClassA::ClassA()
{
printf("ClassA::ClassA() this=%p\n", this);
}
ClassA::~ClassA()
{
printf("ClassA::~ClassA() this=%p\n", this);
}
void ClassA::test()
{
printf("ClassA::test() this=%p\n", this);
}
ClassB.hpp:
#ifndef CLASSB_HPP
#define CLASSB_HPP
class ClassB
{
public:
ClassB();
~ClassB();
void test();
};
#endif // CLASSB_HPP
ClassB.cpp:
#include <cstdio>
#include "ClassA.hpp"
#include "ClassB.hpp"
ClassB::ClassB()
{
printf("ClassB::ClassB() this=%p\n", this);
}
ClassB::~ClassB()
{
printf("ClassB::~ClassB() this=%p\n", this);
}
void ClassB::test()
{
printf("ClassB::test() this=%p\n", this);
printf("ClassB::test: call staticA.test()\n");
ClassA::staticA.test();
}
Test.cpp:
#include <cstdio>
#include "ClassA.hpp"
#include "ClassB.hpp"
int main(int argc, char * argv[])
{
printf("main()\n");
ClassA::staticA.test();
ClassB b;
b.test();
printf("main: END\n");
return 0;
}
I then compile and link as follows:
g++ -c ClassA.cpp
ar rvs libA.a ClassA.o
g++ -c ClassB.cpp
g++ -shared -o libB.so ClassB.o libA.a
g++ -c Test.cpp
g++ -o test Test.cpp libA.a libB.so
Output is:
ClassA::ClassA() this=0x804a040
ClassA::ClassA() this=0x804a040
main()
ClassA::test() this=0x804a040
ClassB::ClassB() this=0xbfcb064f
ClassB::test() this=0xbfcb064f
ClassB::test: call staticA.test()
ClassA::test() this=0x804a040
main: END
ClassB::~ClassB() this=0xbfcb064f
ClassA::~ClassA() this=0x804a040
ClassA::~ClassA() this=0x804a040
Can somebody please explain what is going on here? What is the linker doing? How can the same variable be initialized twice?

You are including libA.a into libB.so. By doing this, both libB.so and libA.a contain ClassA.o, which defines the static member.
In the link order you specified, the linker pulls in ClassA.o from the static library libA.a, so ClassA.o initialization code is run before main(). libB.so is a link-time dependency of the executable, so it is loaded at program startup and all of its initializers are run as well (your output shows both constructor calls before main()). Since libB.so includes ClassA.o, ClassA.o's static initializer is run again.
Possible fixes:
Don't put ClassA.o into both libA.a and libB.so.
g++ -shared -o libB.so ClassB.o
Don't use both libraries; libA.a is not needed.
g++ -o test Test.cpp libB.so
Applying either of the above fixes the problem:
ClassA::ClassA() this=0x600e58
main()
ClassA::test() this=0x600e58
ClassB::ClassB() this=0x7fff1a69f0cf
ClassB::test() this=0x7fff1a69f0cf
ClassB::test: call staticA.test()
ClassA::test() this=0x600e58
main: END
ClassB::~ClassB() this=0x7fff1a69f0cf
ClassA::~ClassA() this=0x600e58

Can somebody please explain what is going on here?
It's complicated.
First, the way that you linked your main executable and the shared library causes two instances of staticA (and all the other code from ClassA.cpp) to be present: one in the main executable, and another in libB.so.
You can confirm this by running
nm -AD ./test ./libB.so | grep staticA
It is then not very surprising that the ClassA constructor for the two instances runs two times, but it is still surprising that the this pointer is the same (and corresponds to staticA in the main executable).
That is happening because the runtime loader (unsuccessfully) tries to emulate the behavior of linking with archive libraries, and binds all references to staticA to the first globally-exported instance it observes (the one in test).
So what can you do to fix this? That depends on what staticA actually represents.
If it is some kind of singleton that should only exist once in any program, then the easy solution is to make sure there is only a single instance of staticA. One way to do that is to require that any program that uses libB.so also links against libA.a, and to not link libB.so against libA.a. That will eliminate the instance of staticA inside libB.so. You've claimed that "libA must be linked into libB", but that claim is false.
Alternatively, if you build libA.so instead of libA.a, then you can link libB.so against libA.so (so libB.so is self-contained). If the main application also links against libA.so, that wouldn't be a problem: there will only be one instance of staticA inside libA.so, no matter how many times that library is used.
On the other hand, if staticA represents some kind of internal implementation detail, and you are ok with having two instances of it (so long as they don't interfere with each other), then the solution is to mark all of ClassA symbols with hidden visibility, as this answer suggests.
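As a rough illustration of the hidden-visibility option, here is a minimal sketch. It assumes GCC or Clang on an ELF platform (the attribute is a compiler extension, not standard C++). The macro name LIBA_HIDDEN, the instances counter and the helper instances_in_this_module() are made up for this example and are not part of the question's code.

```cpp
#include <cstdio>

// Assumption: GCC or Clang targeting ELF. The attribute is a compiler
// extension; LIBA_HIDDEN is a hypothetical macro name.
#define LIBA_HIDDEN __attribute__((visibility("hidden")))

// Every member of a hidden class, including the static data member, is kept
// out of the dynamic symbol table of the module it is linked into, so the
// runtime loader cannot bind another module's references to it.
class LIBA_HIDDEN ClassA {
public:
    ClassA() { ++instances; }
    void test() const {
        std::printf("ClassA::test() this=%p\n", static_cast<const void*>(this));
    }
    static ClassA staticA;
    static int instances;   // hypothetical counter, only for the illustration
};

int ClassA::instances = 0;  // constant-initialized before any constructor runs
ClassA ClassA::staticA;     // constructed once per module containing this code

// Reports how many ClassA objects this module has constructed so far.
int instances_in_this_module() { return ClassA::instances; }
```

With this in place, the executable and libB.so would each construct their own private staticA at a different address, which is only acceptable if the two copies never need to share state. Compiling with -fvisibility=hidden has a similar effect for everything that is not explicitly exported.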
Update:
why the linker does not eliminate the second instance of staticA from the executable.
Because the linker does what you told it to do. If you change your link command line to:
g++ -o test Test.cpp libB.so libA.a
then the linker should not link ClassA into the main executable. To understand why the order of libraries on command line matters, read this.

Related

Singleton across compilation units: linking library vs linking objects

I apologize if the title is not fully self-explanatory. I'm trying to understand why my singleton factory pattern is not working properly, and I ran into a bizarre difference when using library vs linking single objects files.
Here's a simplified version of the code:
main.cpp
#include <iostream>
#include "bar.hpp"
int main (int /*argc*/, char** /*argv*/)
{
A::get().print();
return 0;
}
bar.hpp
#ifndef BAR_HPP
#define BAR_HPP
#include <iostream>
class A
{
public:
static A& get ()
{
static A a;
return a;
}
bool set(const int i)
{
m_i = i;
print();
return true;
}
void print ()
{
std::cout << "print: " << m_i << "\n";
}
private:
int m_i;
A () : m_i(0) {}
};
#endif // BAR_HPP
baz.hpp
#ifndef BAZ_HPP
#define BAZ_HPP
#include "bar.hpp"
namespace
{
static bool check = A::get().set(2);
}
#endif // BAZ_HPP
baz.cpp
#include "baz.hpp"
Now, I build my "project" in two ways:
Makefile:
all:
g++ -std=c++11 -c baz.cpp
g++ -std=c++11 -o test main.cpp baz.o
lib:
g++ -std=c++11 -c baz.cpp
ar rvs mylib.a baz.o
g++ -std=c++11 -o test main.cpp mylib.a
Here are the outputs I get:
$ make all
$ ./test
print: 2
print: 2
$ make lib
$ ./test
print: 0
In the first case the call to A::get().set(2) in baz.hpp takes place, and the same instantiation of A is then used in the main function, which therefore prints 2. In the second case, the call to A::get().set(2) in baz.hpp never takes place, and in the main function the value set by the constructor (that is, 0) is printed.
So finally I can ask my question: why is the behavior different in the two cases? I would expect that either both print 0 once or print 2 twice. I always assumed that a library was just a compact way to ship object files, and that the behavior of linking mylib.a should be the same as that of linking baz.o directly. Why isn't that the case?
Edit: the reason, as explained by Richard, is that no symbols defined in baz.cpp are required in main.cpp, so baz.o is not extracted from the library and linked. This raises another question: is there a workaround to ensure that the instruction A::get().set(2) is executed? I would like to avoid making the singleton a global object, but I'm not sure it's possible. I would also like to avoid including baz.hpp in main.cpp, since there may be many bazxyz.hpp files, and that would require main.cpp to know all of them in advance, defeating the whole purpose of the factory-like registration process...
If this is to be a static library, then some module somewhere is going to have to address something in each implementation file of the objects that are going to register themselves with the factory.
A reasonable place for this would be in bar.cpp (which is a file you don't yet have). It would contain some or all of the implementation of A plus some means of calling the registration functions for the widgets you're going to create.
Self-discovery only works if the object files are linked into the executable. This gives the C++ startup sequence a chance to know about and construct all objects with global linkage.
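The explicit-registration idea could be sketched roughly as follows. Everything here is hypothetical: the Factory class, the register_baz, register_qux and register_all_widgets functions, and the "qux" widget are invented for illustration, and the comments only indicate which source file each part would plausibly live in.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical factory: maps a widget name to a function returning its id.
class Factory {
public:
    static Factory& get() {
        static Factory instance;  // constructed on first use
        return instance;
    }
    void add(const std::string& name, std::function<int()> make) {
        makers_[name] = std::move(make);
    }
    bool has(const std::string& name) const { return makers_.count(name) != 0; }
    std::size_t size() const { return makers_.size(); }

private:
    Factory() = default;
    std::map<std::string, std::function<int()>> makers_;
};

// Would live in baz.cpp: an ordinary function instead of a static initializer.
void register_baz(Factory& f) { f.add("baz", [] { return 2; }); }

// Would live in qux.cpp (a second, made-up widget).
void register_qux(Factory& f) { f.add("qux", [] { return 3; }); }

// Would live in bar.cpp: because this function references every register_*
// symbol, the linker has to extract the matching object files from the
// archive when bar.o itself is linked in.
void register_all_widgets() {
    Factory& f = Factory::get();
    register_baz(f);
    register_qux(f);
}
```

The program would call register_all_widgets() once, for example at the start of main(). The cost is that bar.cpp has to list every widget, but main.cpp does not need to know about any of them.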

how to remove pthread undefined reference while building single thread library

I am getting undefined reference errors for pthread APIs and I don't know how to resolve them.
Here is the scenario:
libA.a -- this is 3rd party library [it contains lots of API's which are pthread dependent]
libB.a -- This is my own library. I use a few APIs of the 3rd party library [libA.a] to build my own library. [I haven't used any pthread API myself in libB.a]
I am giving libA.a + libB.a + headers of (A + B) to my client exe. -- say MAIN.exe
MAIN.cpp -- will be using API's provided by my library.
When I try to build MAIN.exe, I get undefined reference errors.
Below is the source code:
libA.a: It only contains A.h and A.cpp
A.h
class A
{
public:
void dispA();
void printNumber();
};
A.cpp:
#include "iostream"
#include "A.h"
#include <stdlib.h>
#include <unistd.h> // for sleep()
#include "pthread.h"
using namespace std;
void* printNum(void*)
{
sleep(1);
for(int i = 1; i<= 10; i++)
{
cout<<"i: "<<i<<endl;
}
pthread_exit(NULL);
return NULL;
}
void A::dispA()
{
cout<<"A::disp()"<<endl;
}
void A::printNumber()
{
pthread_t t1;
pthread_create(&t1, NULL, &printNum, NULL);
pthread_exit(NULL);
}
Command to create libA.a:
cd /practise/A
g++ -c A.cpp
ar -cvq libA.a *.o
libB.a: It only contains B.h and B.cpp
B.h:
class B
{
public:
void funB();
void dispB();
};
B.cpp:
#include "B.h"
#include "iostream"
#include "A.h"
using namespace std;
void B::funB()
{
cout<<"B::funB()"<<endl;
}
void B::dispB()
{
A a;
a.dispA();
a.printNumber();
}
Command to create libB.a:
cd /practise/B
g++ -c B.cpp -I../A
ar -cvq libB.a *.o
Main.cpp:
#include "iostream"
#include "B.h"
using namespace std;
int main()
{
B b;
b.dispB();
b.funB();
return 0;
}
Command to create main.exe:
cd /practise/MAIN
g++ -o noThread MAIN.cpp -I../A -I../B -L../A -L../B -lB -lA
Error I am getting:
../A/libA.a(A.o): In function `A::printNumber()':
A.cpp:(.text+0x8c): undefined reference to `pthread_create'
collect2: ld returned 1 exit status
NOTE:
I know, if I try to use -lrt flag, it will not give any error.
But the problem is that my client [MAIN.cpp] cannot use the -lrt flag, -lpthread, or any other thread-related library, and has therefore asked me to provide a SINGLE THREAD LIBRARY.
So, how do I provide a SINGLE THREAD LIBRARY?
libA.a is third party and I can't change its code.
libB.a is my own library [and I have to use APIs from libA.a].
Is there any specific flag I can use to make main.cpp build and run properly?
Another Doubt:
Why does Main.cpp give me an error even when the client only calls a thread-independent function?
int main()
{
B b;
//b.dispB(); <== commented thread dependent function
b.funB(); // <== this doesn't depend on pthread, but main.cpp still fails. Don't know WHY !!
return 0;
}
If you are certain no actual pthread code gets called from the code path your library uses, then you could try making dummy versions of the pthread calls like this:
DummyPThreads.c (note c not c++)
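#include <pthread.h> /* pthread_t, pthread_attr_t */
#include <stdlib.h>  /* exit */
/* Stub: reports success but never starts a thread. */
int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
void *(*start_routine)(void *), void *arg)
{
return 0;
}
/* Stub: pthread_exit is declared noreturn, so it must not return. */
void pthread_exit(void *retval)
{
exit(0);
}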
// etc...
Compile with:
gcc -c -o DummyPThreads.o DummyPThreads.c
Add to your application:
g++ -o noThread MAIN.cpp -I../A -I../B DummyPThreads.o -L../A -L../B -lB -lA
It's not possible. As one of the libraries depends on pthread, you need to link that library into your final executable.
The only option would be to extract from libA.a just the object files that you really need and that don't depend on pthread. But that is quite a tough task, most probably not possible as there are usually cross dependencies, and, last but not least, highly fragile if the library changes.

Both static variables and global variables show different addresses in dynamic library and static library on Linux?

I encountered this on CentOS 6.5. From what I have read online, static variables behave differently on Windows and on Linux when a dynamic library is involved: Windows duplicates the variables and Linux does not, as described here:
http://www.yolinux.com/TUTORIALS/LibraryArchives-StaticAndDynamic.html
However, when I wrote a small program to validate this, I found that Linux also causes duplication. Here is my small program, including four files:
(1) A.h
#ifndef A_H
#define A_H
#include <cstdio>
static int b;
extern "C" class A {
public:
int mem;
A() {
printf("A's address: %p\n", this);
printf("B's address: %p\n", &b);
}
void print() {
printf("%p: %d\n", this, mem);
}
~A() {
printf("DELETE A!!!!! %p\n", this);
}
};
extern A a;
#endif
(2) A.cpp
#include "A.h"
A a;
(3) d.cpp
#include "A.h"
extern "C" void exec() {
a.print();
}
(4) main.cpp
#include "A.h"
#include <dlfcn.h>
typedef void (*fptr) ();
int main() {
a.mem = 22;
a.print();
void *handle;
handle = dlopen("d.so", RTLD_LAZY);
fptr exec = reinterpret_cast<fptr>(dlsym(handle, "exec"));
(*exec)();
dlclose(handle);
return 0;
}
Here is how I compile and run my program:
g++ d.cpp A.cpp -shared -rdynamic -o d.so -ldl -I. -fPIC -g -std=c++1y
g++ main.cpp A.cpp -ldl -I. -g -std=c++1y
./a.out
Both the dynamic part d.cpp and the static part main.cpp use the variables a and b declared in A.cpp and A.h. And here is the result of the program on my machine:
A's address: 0x600f8c
B's address: 0x600f90
0x600f8c: 22
A's address: 0x7fb8fe859e4c
B's address: 0x7fb8fe859e50
0x7fb8fe859e4c: 0
DELETE A!!!!! 0x7fb8fe859e4c
DELETE A!!!!! 0x600f8c
This surprises me a lot, because the addresses of the global variable a and the static variable b should be the same in the dynamic part and the static part. It also seems that a modification of a in the static part does not affect the a in the dynamic part. Would anyone please answer my question, or help find mistakes in the program (if any)?
By the way, to be honest, on another project I am working on, I find that addresses of global variables are the same in dynamic library and in static library. But that project is too big and I cannot provide a small program to reproduce the behavior.
Thanks a lot !
The first command you showed builds a shared object d.so. Based on the context of your question, I surmise that you also intended to link with d.so, but your second command seems to be missing that part. I'm assuming that it's a typo, as this is the only explanation for the program output you showed -- that A.cpp is both linked to directly, and is also built into your d.so library.
Given that, quoting from the article you linked:
Object code routines used by both should not be duplicated in each.
This is especially true for code which use static variables such as
singleton classes. A static variable is global and thus can only be
represented once. Including it twice will provide unexpected results.
But that's exactly the rule you seem to be breaking: you're representing the statically-scoped instance of the A class twice, once in your d.so and once in your main application executable.
So, that seems to be the indicated outcome: "unexpected results".

initialisation of static object when linking against a static library

What are the rules for initialisation of static object declared in another shared library? For instance, consider the following:
file X.hpp:
struct X {
X ();
static X const s_x;
};
struct Y {
Y (X const &) {}
};
file X.cpp:
#include "X.hpp"
#include <iostream>
X::X ()
{
std::cout << "side effect";
}
X const X::s_x;
I compiled X.cpp in a static library libX.a, and I tried to link the following executable against it (file main.cpp):
#include "X.hpp"
int main ()
{
(void)X::s_x; // (1)
X x = X::s_x; // (2)
Y y = X::s_x; // (3)
}
with only (1) or (2), nothing happens. But if I add (3), the static object is initialised (i.e. "side effect" is printed). (I use gcc 4.6.1).
Is there any way to predict what will happen here?
I don't understand how the instruction (2) does not force the X::s_x object to be default-constructed, whereas (3) does.
EDIT: build commands:
g++ -c X.cpp
g++ -c main.cpp
ar rcs libX.a X.o
g++ -o test main.o -L. -lX
By default on many platforms, if your program doesn't reference any symbols from a given object file in a static library, the whole object file (including static initializers) will be dropped. So the linker is ignoring X.o in libX.a because it looks like it is unused.
There are a few solutions here:
Don't depend on the side-effects of static initializers. This is the most portable/simple solution.
Introduce some fake dependency on each file by referencing a dummy symbol in a way the compiler will not see through (like storing the address into an externally-visible global).
Use some platform-specific trick to retain the objects in question. For example, on Linux you can use -Wl,-whole-archive a.o b.a -Wl,-no-whole-archive.
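For the first option, one common way to avoid relying on static initializers is to construct the object on first use. The sketch below adapts the question's X and Y. The instance() accessor, the constructions counter and the use_x() helper are assumptions added for illustration and do not appear in the original code.

```cpp
#include <iostream>

struct X {
    X() {
        ++constructions;
        std::cout << "side effect\n";
    }
    // Replaces the static data member X::s_x from the question. The object
    // is created the first time control passes through this function, so its
    // construction does not depend on whether the linker kept X.o for its
    // static initializers.
    static const X& instance() {
        static const X x;
        return x;
    }
    static int constructions;  // hypothetical counter, only for the demo
};

int X::constructions = 0;

struct Y {
    Y(const X&) {}
};

// Hypothetical caller: uses the accessor and reports how many X objects
// have been constructed so far.
int use_x() {
    Y y = X::instance();
    (void)y;
    return X::constructions;
}
```

Any code that calls X::instance() references a symbol that must be resolved at link time, so the object file defining it is pulled out of the archive, and the constructor runs exactly once no matter how many callers there are.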

Linking with multiple versions of a library

I have an application that statically links with version X of a library, libfoo, from thirdparty vendor, VENDOR1. It also links with a dynamic (shared) library, libbar, from a different thirdparty vendor, VENDOR2, that statically links version Y of libfoo from VENDOR1.
So libbar.so contains version Y of libfoo.a and my executable contains version X of libfoo.a
libbar only uses libfoo internally and there are no libfoo objects passed from my app to libbar.
There are no errors at build time, but at runtime the app seg faults. The reason seems to be that version X uses structures that have a different size than in version Y, and the runtime linker seems to be mixing up which gets used by which.
Both VENDOR1 & VENDOR2 are closed source so I cannot rebuild them.
Is there a way to build/link my app such that it always resolves to version X and libbar always resolves to version Y and the two never mix?
Thanks for all the responses. I have a solution that seems to be working.
Here's the problem in detail with an example.
In main.c we have:
#include <stdio.h>
extern int foo();
int bar()
{
printf("bar in main.c called\n");
return 0;
}
int main()
{
printf("result from foo is %d\n", foo());
printf("result from bar is %d\n", bar());
}
In foo.c we have:
extern int bar();
int foo()
{
int x = bar();
return x;
}
In bar.c we have:
#include <stdio.h>
int bar()
{
printf("bar in bar.c called\n");
return 2;
}
Compile bar.c and foo.c:
$ gcc -fPIC -c bar.c
$ gcc -fPIC -c foo.c
Add bar.o to a static library:
$ ar r libbar.a bar.o
Now create a shared library using foo.o and link with static libbar.a
$ gcc -shared -o libfoo.so foo.o -L. -lbar
Compile main.c and link with shared library libfoo.so
$ gcc -o main main.c -L. -lfoo
Set LD_LIBRARY_PATH to find libfoo.so and run main:
$ setenv LD_LIBRARY_PATH `pwd`
$ ./main
bar in main.c called
result from foo is 0
bar in main.c called
result from bar is 0
Notice that the version of bar in main.c is called, not the version linked into the shared library.
In main2.c we have:
#include <stdio.h>
#include <dlfcn.h>
int bar()
{
printf("bar in main2.c called\n");
return 0;
}
int main()
{
int x;
int (*foo)();
void *handle = dlopen("libfoo.so", RTLD_GLOBAL|RTLD_LAZY);
foo = dlsym(handle, "foo");
printf("result from foo is %d\n", foo());
printf("result from bar is %d\n", bar());
}
Compile and run main2.c (notice we don't need to explicitly link with libfoo.so):
$ gcc -o main2 main2.c -ldl
$ ./main2
bar in bar.c called
result from foo is 2
bar in main2.c called
result from bar is 0
Now foo in the shared library calls bar in the shared library and main calls bar in main.c
I don't think this behaviour is intuitive and it is more work to use dlopen/dlsym, but it does resolve my problem.
Thanks again for the comments.
Try a partial link so that you have an object file "partial.o" containing libbar and libfoo-Y. Use objcopy with "--localize-symbols" to make the symbols in partial.o that come from libfoo-Y local. You should be able to generate the symbol list by running nm on libfoo-Y and massaging the output. Then take the modified partial.o and link it into your app.
I've done something similar with gcc toolchain on vxWorks where dynamic libs are not a complication but two versions of the same lib needed to link cleanly into a monolithic app.
Sorry, no. My understanding of the way Linux (and possibly most *nixes) works is that this is not possible. The only 'solution' to your problem I can think of is to create a proxy app that exposes what you need from libbar through some form of IPC. You can then make that proxy load the correct version using LD_LIBRARY_PATH or something similar.