Static *template* class member across dynamic library - c++

Edit: the comments below the accepted answer show that it might be an issue with the Android dynamic loader.
I have a header for a template class with a static member. At runtime the address of the static member is used in the library and in the client code. The template is implicitly instantiated both in the library and in the client code. It works fine on Linux and OSX: the symbol is duplicated but marked as "unique", as shown by nm (see below).
However when I compile for ARM (Android), the symbol is marked weak in both the DSO and the executable. The loader does not unify and the symbol is effectively duplicated at runtime!
I read these:
two instances of a static member, how could that be?
Static template data members storage
and especially this answer:
https://stackoverflow.com/a/2505528/2077394
and:
http://gcc.gnu.org/wiki/Visibility
but I am still a little bit puzzled. I understand that the visibility attributes help to optimize, but I thought it should work by default. I know the C++ standard does not care about shared libraries, but does this mean that using shared libraries breaks the standard (or at least that this implementation does not conform to the C++ standard)?
Bonus: how can I fix it? (and not using template is not an acceptable answer:))
Header:
template<class T>
struct TemplatedClassWithStatic {
static int value;
};
template<class T>
int TemplatedClassWithStatic<T>::value = 0;
shared.cpp:
#include "TemplateWithStatic.hpp"
int *addressFromShared() {
return &TemplatedClassWithStatic<int>::value;
}
main.cpp:
#include "TemplateWithStatic.hpp"
#include <cstdio>
int *addressFromShared();
int main() {
printf("%p %p\n", addressFromShared(), &TemplatedClassWithStatic<int>::value);
}
Building on Linux and looking at the symbol definitions:
producing .so:
g++-4.8 -shared src/shared.cpp -o libshared.so -I include/ -fPIC
compiling and linking main:
g++-4.8 src/main.cpp -I include/ -lshared -L.
symbols are marked as "unique":
nm -C -A *.so a.out | grep 'TemplatedClassWithStatic<int>::value'
libshared.so:0000000000200a70 u TemplatedClassWithStatic<int>::value
a.out:00000000006012b0 u TemplatedClassWithStatic<int>::value
producing .so for Android (ARM):
~/project/android-ndk-r9/toolchains/arm-linux-androideabi-4.8/prebuilt/darwin-x86_64/bin/arm-linux-androideabi-g++ -o libshared.so src/shared.cpp -I include/ --sysroot=/Users/amini/project/android-ndk-r9/platforms/android-14/arch-arm/ -shared
compiling and linking main
~/project/android-ndk-r9/toolchains/arm-linux-androideabi-4.8/prebuilt/darwin-x86_64/bin/arm-linux-androideabi-g++ src/main.cpp libshared.so -I include/ --sysroot=${HOME}/project/android-ndk-r9/platforms/android-14/arch-arm/ -I ~/project/android-ndk-r9/sources/cxx-stl/gnu-libstdc++/4.8/include -I ~/project/android-ndk-r9/sources/cxx-stl/gnu-libstdc++/4.8/libs/armeabi-v7a/include -I ~/project/android-ndk-r9/sources/cxx-stl/gnu-libstdc++/4.8/include/backward -I ~/project/android-ndk-r9/platforms/android-14/arch-arm/usr/include ~/project/android-ndk-r9/sources/cxx-stl/gnu-libstdc++/4.8/libs/armeabi-v7a/libgnustl_static.a -lgcc
symbols are weak!
nm -C -A *.so a.out | grep 'TemplatedClassWithStatic<int>::value'
libshared.so:00002004 V TemplatedClassWithStatic<int>::value
a.out:00068000 V TemplatedClassWithStatic<int>::value
Edit, note for the context: I was playing with OOLua, a library that helps bind C++ to Lua, and my unit tests started failing when I began targeting Android. I don't "own" the code and I would rather not modify it deeply.
Edit, to run it on Android:
adb push libshared.so data/local/tmp/
adb push a.out data/local/tmp/
adb shell "cd data/local/tmp/ ; LD_LIBRARY_PATH=./ ./a.out"
0xb6fd7004 0xb004

Android does not support unique symbols. It is a GNU extension of ELF format that only works with GLIBC 2.11 and above. Android does not use GLIBC at all, it employs a different C runtime called Bionic.
(update) If weak symbols don't work for you (end update) I'm afraid you would have to modify the code such that it does not rely on static data.

There may be some compiler/linker settings that you can tweak to enable this (have you looked at the -fvisibility flag?).
Possibly a GCC attribute modifier may be worth trying (explicitly set __attribute__ ((visibility ("default"))) on the variable).
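As a minimal sketch of the attribute suggestion, assuming GCC or Clang: the visibility attribute can be placed on the class template so that its static member is exported with default visibility. The EXPORT_SYMBOL macro and the addressOfIntValue() helper are made-up names for illustration, and nothing in this thread confirms that this changes how the Android loader treats the symbol.

```cpp
// EXPORT_SYMBOL is a hypothetical helper macro, not part of any library.
#if defined(__GNUC__)
#  define EXPORT_SYMBOL __attribute__((visibility("default")))
#else
#  define EXPORT_SYMBOL
#endif

// The attribute on the class applies to its members, including the static one.
template <class T>
struct EXPORT_SYMBOL TemplatedClassWithStatic {
    static int value;
};

template <class T>
int TemplatedClassWithStatic<T>::value = 0;

// Hypothetical helper that forces an instantiation for int.
int *addressOfIntValue() {
    return &TemplatedClassWithStatic<int>::value;
}
```

Running nm -C on the resulting objects, as in the question, is the way to check whether the symbol type actually changed.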
Failing that, the only workarounds I could suggest are the following (all are somewhat ugly):
1. Explicitly instantiate all forms of the template that are created in the shared library and provide the initializers in its implementation (not in the header). This may or may not work.
2. Like (1), but use a shim function as a Meyers singleton for the shared variable (example below).
3. Allocate a variable in a map for the class based upon RTTI (which might also fail across a shared library boundary).
e.g.
template<class T>
struct TemplatedClassWithStatic {
static int& getValue() { return TemplatedClassWithStatic_getValue((T*)0); } // T*, not T const*, so that it matches the overloads below
};
// types used by the shared library.. can be forward declarations here but you run the risk of violating ODR.
int& TemplatedClassWithStatic_getValue(TypeA*);
int& TemplatedClassWithStatic_getValue(TypeB*);
int& TemplatedClassWithStatic_getValue(TypeC*);
shared.cpp
int& TemplatedClassWithStatic_getValue(TypeA*) {
static int v = 0;
return v;
}
int& TemplatedClassWithStatic_getValue(TypeB*) {
static int v = 0;
return v;
}
int& TemplatedClassWithStatic_getValue(TypeC*) {
static int v = 0;
return v;
}
The executable would also have to provide implementations for any types that it uses to instantiate the template.
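The first workaround (explicit instantiation in the shared library) can be sketched as follows. This is a variant that assumes C++11: it keeps the member definition in the header and uses an explicit instantiation declaration (extern template) to stop clients from instantiating the int specialization themselves. Both "files" are shown in one block, and whether the Android loader then resolves both references to the same address is an assumption that would need to be verified on a device.

```cpp
// --- TemplateWithStatic.hpp (sketch) ---
template <class T>
struct TemplatedClassWithStatic {
    static int value;
};

template <class T>
int TemplatedClassWithStatic<T>::value = 0;

// Explicit instantiation declaration (C++11): translation units that include
// this header do not implicitly instantiate TemplatedClassWithStatic<int>.
extern template struct TemplatedClassWithStatic<int>;

// --- shared.cpp (sketch) ---
// Explicit instantiation definition: the one place where the members of
// TemplatedClassWithStatic<int> are emitted.
template struct TemplatedClassWithStatic<int>;

int *addressFromShared() {
    return &TemplatedClassWithStatic<int>::value;
}
```

With this layout the executable should only carry an undefined reference to the symbol, which nm shows as U instead of V.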

Related

relocation against xxx in read-only section '.text' - wrong compiler or linux setup in SUSE?

I'm not a frequent user of Linux and I think I did something wrong.
This is the code for a test dynamic library ".so" I'm generating.
class InternalClass
{
public:
int Function(){ return 10; }
};
extern "C"
{
int WrapperFunctionSimple() { return 10; }
void WrapperCreateInstance() {InternalClass* item = new InternalClass(); delete item; }
}
The compilation fails with the following error:
g++ -Wall -fexceptions -O2 -c /home/lidia/compartida/TestLibrary/TestLibrary/main.cpp -o obj/Release/main.o
g++ -shared obj/Release/main.o -o bin/Release/libTestLibrary.so -s
/usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: obj/Release/main.o: warning: relocation against `_Znwm@@GLIBCXX_3.4' in read-only section `.text'
/usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: obj/Release/main.o: relocation R_X86_64_PC32 against symbol `_Znwm@@GLIBCXX_3.4' can not be used when making a shared object; recompile with -fPIC
/usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: final link failed: bad value
collect2: error: ld returned 1 exit status
I tried with -fPIC as suggested and it compiles. But the resulting library cannot be loaded once I add that last function:
void WrapperCreateInstance() {InternalClass* item = new InternalClass(); delete item; }
The problem is using the InternalClass, without this function everything works.
I'm using VirtualBox. I installed OpenSUSE 64bit, the app that uses the library is also 64bit. In another linux distribution (Mint), with exactly the same project and settings (without fPIC), it can be compiled. When I use that library (.so) it works in SUSE.
I'm also using :
gcc (SUSE Linux) 7.5.0
g++ (SUSE Linux) 7.5.0
My IDE is Code::Blocks 20 (last version). Settings are empty except for the -m64 flag.
What am I doing wrong? This seems like something advanced Linux users could help me understand.
EDIT:
To add more information, this can compile in Ubuntu with the same settings. Not in SUSE
It happened to me when one library (A) depended on another one (B) and library A was linked before library B. The solution was to link library B first and then A.
It happened to me when I used the gcc command to compile a .cpp file. For C++ files use g++: the gcc driver does not link the C++ standard library by default.
I am using cmake and GCC in ubuntu and I got the same error.
In my case, I added the XX.h file to include directories, and used in my main.cpp a "typedef struct x" which is defined in XX.h header file. However I forgot to add XX.c to the executable sources.
This error is cleared when I added XX.c to the target executable sources in add_executable() of Cmake.
I hope I can clearly state my case.
In my case this happened when I had an abstract base class and two derived classes A and B, like below.
class BaseAbstractClass
{
public:
virtual void doNothing(void) = 0;
};
class A : public BaseAbstractClass
{
public:
void doNothing(void){
return;
}
};
class B : public BaseAbstractClass
{
public:
void doNothing(void); // Only declaration, definition is nowhere
};
One of the derived classes (class B here) only declares doNothing(); the definition is missing from the project files.
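A minimal sketch of the fix for this case: give the declared virtual function an out-of-line definition in a source file that is actually part of the link. The virtual destructor and the callThroughBase() helper are additions for illustration only.

```cpp
class BaseAbstractClass {
public:
    virtual ~BaseAbstractClass() {}
    virtual void doNothing() = 0;
};

class B : public BaseAbstractClass {
public:
    void doNothing();  // declaration only, as in the failing code
};

// The definition that was missing. Without it the linker has nothing to
// resolve B::doNothing against.
void B::doNothing() {}

// Hypothetical helper that exercises the virtual call through the base class.
bool callThroughBase() {
    B b;
    BaseAbstractClass &base = b;
    base.doNothing();
    return true;
}
```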
I had a similar issue and the order of linking answer was useful for me. It ended up being I was including the cpp file rather than the header (include was called at wrong time.)

GNU linker: Adapt to change of name mangling algorithm

I am trying to re-compile an existing C++ application.
Unfortunately, I must rely on a proprietary library I only have a pre-compiled static archive of.
I use g++ version 7.3.0 and ld version 2.30.
Whatever GCC version it was compiled with, it is ancient.
The header file defines the method:
class foo {
int bar(int & i);
};
As nm lib.a shows, the library archive contains the corresponding exported function:
T bar__4fooRi
nm app.o shows my recent compiler employing a different kind of name mangling:
U _ZN4foo9barERi
Hence the linker cannot resolve the symbols provided by the library.
Is there any option to choose the name mangling algorithm?
Can I introduce a map or define the mangled names explicitly?
@Botje's suggestion led me to write a linker script like this (the spaces in the PROVIDE stanza are significant):
EXTERN(bar__4fooRi);
PROVIDE(_ZN4foo9barERi = bar__4fooRi);
As far as I understood, this will regard bar__4fooRi as an externally defined symbol (which it is). If _ZN4foo9barERi is searched for, but not defined, bar__4fooRi will take its place.
I am calling the linker from the GNU toolchain like this (mind the order: the script needs to come after the dependent object but before the defining library):
g++ -o application application.o script.ld -lfoo
It looks like this could work.
At least in theory.
The linker now pulls in other parts of the library, which in turn depend on other unresolvable symbols including (but not limited to) __throw, __cp_pop_exception, and __builtin_delete. I have no idea where these functions are defined nowadays. Joxean Koret shows some locations in this blog post based on guesswork (__builtin_new probably is malloc), but I am not that confident.
These findings lead me to the conclusion that the library relies on a different style of exception handling and probably memory management, too.
EDIT: The result may be purely academic due to ABI changes, as pointed out by @eukaryota, but a linker script can indeed be used to "alias" symbols. Here is a complete minimal example:
foo.h:
class Foo {
public:
int bar(int);
};
foo.cpp:
#include "foo.h"
int Foo::bar(int i) {
return i+21;
}
main.cpp:
class Foo {
public:
int baa(int); // use in-place "header" to simulate different name mangling algorithm
};
int main(int, char**) {
Foo f;
return f.baa(21);
}
script.ld:
EXTERN(_ZN3Foo3barEi);
PROVIDE(_ZN3Foo3baaEi = _ZN3Foo3barEi); /* declare "alias" */
Build process:
g++ -o libfoo.o -c foo.cpp
ar rvs libfoo.a libfoo.o # simulate building a library
g++ -o main.o -c main.cpp
g++ -o app main.o -L. script.ld -lfoo
app is compiled, can be executed, and returns the expected result (exit status 42).
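A different technique that binds a name in the source instead of in a linker script is the asm label extension of GCC and Clang. The sketch below assumes an ELF target such as Linux, where C symbols carry no leading underscore. legacy_bar, modern_bar and callThroughAlias are hypothetical names; legacy_bar stands in for the symbol exported by the old library. Like the linker script, this only redirects the symbol name and does nothing about ABI differences.

```cpp
// Stand-in for a function exported by the pre-compiled library under a
// symbol name the current compiler would never generate by itself.
extern "C" int legacy_bar(int i) { return i + 21; }

// Declaration only: the asm label tells the compiler to emit references to
// the symbol "legacy_bar" instead of its own mangled name for modern_bar(int).
int modern_bar(int i) __asm__("legacy_bar");

// Calls modern_bar, which is resolved to legacy_bar at link time.
int callThroughAlias() {
    return modern_bar(21);
}
```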

Violating the one definition rule by simply linking dynamically

Question: Are dynamically linked C++ programs on ELF platforms always on the brink of producing undefined behavior by violating the one definition rule?
More specific: By simply writing a shared library exposing one function
#include <string>
int __attribute__((visibility("default"))) combined_length(const char *s,
const char *t)
{
const std::string t1(t);
const std::string u(s + t1);
return u.length();
}
and compiling it with GCC 7.3.0 via
$ g++ -Wall -g -fPIC -shared \
-fvisibility=hidden -fvisibility-inlines-hidden \
-o liblibrary.so library.cpp
I create a binary which defines a weak symbol for the operator+() of a pointer to a character array and a string:
$ readelf -sW liblibrary.so | grep "_ZStpl"
24: 0000000000000ee2 202 FUNC WEAK DEFAULT 12 _ZStplIcSt11char_traitsIcESaIcEENSt7__cxx1112basic_stringIT_T0_T1_EEPKS5_RKS8_
...
But looking at the standard library binary I got
$ readelf -sW /usr/lib/x86_64-linux-gnu/libstdc++.so.6 | grep "_ZStplIcSt11char_traitsIcESaIcEENSt7__cxx1112basic_stringIT_T0_T1_EEPKS5_RKS8_"
2829: 000000000012b1c0 169 FUNC WEAK DEFAULT 13 _ZStplIcSt11char_traitsIcESaIcEENSt7__cxx1112basic_stringIT_T0_T1_EEPKS5_RKS8_@@GLIBCXX_3.4.21
That's the point where I say: Oh my gosh, the symbol inside my library ought to have a version attached to it too!
In the current state I'm fine because I can assume that the standard library binary is built with the same headers as my library. But what happens if the implementers of libstdc++-v3 decide to define a new version of this function and tag it with GLIBCXX_3.4.22? Since the symbol is weak, the runtime linker is free to decide whether it takes the unversioned symbol of my library or the versioned symbol of the libstdc++-v3. If I ship my library to such a system I provoke undefined behavior there. Something symbol versions should have solved for me.

CMake "OBJECT" library: clang not linking properly

I have a shared library (which currently compiles, loads and runs) mylib.so. From within this library, I want to use a new function (register it in another external library). The signature is bool my_function(const QVariant *, PyObject **).
This new function is defined in a separate .cpp file which is compiled to an object and then linked to mylib.so.
So I create a new OBJECT with my custom function
ADD_LIBRARY(helper_lib OBJECT helper_lib.cpp)
And include this when building my library
ADD_LIBRARY(mylib SHARED source.cpp $<TARGET_OBJECTS:helper_lib>)
It fails with an "undefined reference to `my_function'"
I can see that
The helper_lib.o file is generated
nm helper_lib.o shows
0000000000000000 T _Z11my_functionPK8QVariantPP7_object
nm mylib.o shows
U my_function
The helper_lib.o is passed to clang++ :
clang++ -fPIC [...] -o my_lib.so mylib.o helper_lib.o [...]
I struggle to see where the mistake happens. I can imagine that there is something wrong in mylib.o which shows an unmangled symbol name which cannot be matched to the helper_lib.o symbol name, but I may as well be totally on the wrong track with this.
helper_lib.h
void my_function();
helper_lib.cpp
#include "helper_lib.h"
void my_function()
{
return;
}
source.cpp is more complicated, as it contains mainly code automatically generated by sip.
It works for me with a simple source.cpp, so something must get messed up during inclusion. You can try moving #include "helper_lib.h" to the top of your source.cpp.
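One plausible reading of the nm output is that source.cpp sees the declaration with C linkage (for example because the header ends up included inside an extern "C" block), while helper_lib.cpp defines the function with C++ linkage. That is an assumption, not something the question confirms. If it is the case, one way to make both sides agree is to state the linkage in the header itself. The sketch below uses a simplified, hypothetical signature and shows all three files in one block.

```cpp
// --- helper_lib.h (sketch) ---
#ifdef __cplusplus
extern "C" {
#endif
// Simplified, hypothetical signature; the real one takes QVariant and
// PyObject pointers.
bool my_function(int value);
#ifdef __cplusplus
}
#endif

// --- helper_lib.cpp (sketch) ---
// The earlier declaration gives this definition C linkage, so the object
// file exports the unmangled symbol my_function.
bool my_function(int value) {
    return value > 0;
}

// --- source.cpp (sketch) ---
// References the same unmangled symbol, so the link succeeds.
bool dummy() {
    return my_function(1);
}
```

The alternative, if C linkage is not wanted, is to make sure the #include of the header is not nested inside any extern "C" block in the generated code.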
To verify that this has nothing to do with your toolchain, you can try from a clean build directory the following project:
CMakeLists.txt:
cmake_minimum_required(VERSION 3.3)
project(dummy)
ADD_LIBRARY(helper_lib OBJECT helper_lib.cpp)
ADD_LIBRARY(mylib SHARED source.cpp $<TARGET_OBJECTS:helper_lib>)
source.cpp:
#include "helper_lib.h"
void dummy() {
my_function();
}
helper_lib.h:
#pragma once
void my_function();
helper_lib.cpp:
#include "helper_lib.h"
void my_function() {
}
Some documentation.

Solaris shared libraries and global variables

I have an issue with global variables in shared library on Solaris.
Consider following sample:
class Foo
{
public:
Foo() { Init(); }
private:
void Init() { /* do something */ }
};
I have some code in shared library:
Foo g_Foo;
I've noticed that the Foo constructor is never called on Solaris, while the same code works on Linux.
I'm using gcc 3.4.3 and Sun linker.
Are you creating the shared object with the -G flag? e.g.
CC -G -o mylib.so myfile.cpp
If you don't specify -G, then the compiler may not initialise global variables correctly. See compiler documentation here.
Note, the docs also say you can't use ld, but need to use CC to do the linking.
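If the link flags cannot be changed, a separate technique (not the one suggested above) is the construct-on-first-use idiom, which avoids relying on load-time initialization of globals in the shared object. A minimal sketch, where globalFoo(), initialized() and the initialized_ member are hypothetical additions to the Foo class from the question:

```cpp
class Foo {
public:
    Foo() : initialized_(false) { Init(); }
    bool initialized() const { return initialized_; }
private:
    void Init() { initialized_ = true; }  // stands in for "do something"
    bool initialized_;
};

// Function-local static: constructed the first time globalFoo() is called,
// not when the shared library is loaded.
Foo &globalFoo() {
    static Foo instance;
    return instance;
}
```

Code that previously used g_Foo directly would call globalFoo() instead.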