Find an optional symbol from statically linked file with gcc? - c++

I'd like to support some accelerated AVX/SSE functionality, but I'd like to be able to optionally include it. I could use dlsym() along with a shared object, but I'm trying to avoid shared linkage if at all possible in the interest of more portable binaries.
Is there a mechanism that lets me use a static symbol if it's successfully linked in, and otherwise fall back to a generic function?

You can use a weak symbol, marked with __attribute__ ((weak)), declared like this:
void avx_function (void) __attribute__ ((weak));
And then you check for NULL before calling the function:
if (avx_function != NULL)
avx_function ();
else
fallback_implementation ();
But this doesn't really work for alternative implementations, only for optional functionality which is linked in by other means: You need a mechanism to actually pull in avx_function because the weak symbol will not do that.
But compile-time or link-time feature selection won't get you portable binaries. You are likely better off with run-time checks and indirection through function pointers (for complex feature selection) if you need portable binaries.
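A minimal sketch of the run-time check plus function-pointer indirection suggested above, assuming GCC or Clang on x86 (where the __builtin_cpu_supports builtin is available). The names sum_generic, sum_avx, select_sum and sum are hypothetical, and sum_avx is only a stand-in: a real build would put the AVX code in its own translation unit compiled with -mavx.

```cpp
#include <cstddef>

// Generic implementation: always safe to call.
static float sum_generic(const float* data, std::size_t n) {
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}

// Hypothetical accelerated variant. In a real project this body would use
// AVX intrinsics and live in a file compiled with -mavx; here it simply
// forwards to the generic code so the sketch stays self-contained.
static float sum_avx(const float* data, std::size_t n) {
    return sum_generic(data, n);
}

using SumFn = float (*)(const float*, std::size_t);

// Pick an implementation once, based on what the running CPU reports.
static SumFn select_sum() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx")) return sum_avx;
#endif
    return sum_generic;
}

// Public entry point: callers never need to know which variant runs.
float sum(const float* data, std::size_t n) {
    static const SumFn impl = select_sum();
    return impl(data, n);
}
```

Because both variants are always linked in and the choice is made when the program runs, the same binary works on CPUs with and without AVX.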

If I understand correctly, you have some linkage such as:
gcc -o prog main.o ...
in which some avx_func is called, and you'd like it to be so that
if ... does not statically link the genuine definition of avx_func then it will statically link a generic surrogate.
If that's right, you can simply exploit the fact that the linker will resolve
a symbol to the first definition it finds in the linkage sequence, and will not
link any other definition unless it is forced to (in which case a multiple-definition
error is the usual result). So e.g.
main.cpp
extern void avx_func();
int main()
{
avx_func();
return 0;
}
avx_or_not.cpp
#ifdef HAVE_REAL_AVX
#include <iostream>
void avx_func()
{
std::cout << "The real " << __PRETTY_FUNCTION__ << std::endl;
}
#endif
avx_fallback.cpp
#include <iostream>
void avx_func()
{
std::cout << "The fallback " << __PRETTY_FUNCTION__ << std::endl;
}
Make a static library libavxfallback.a:
$ g++ -Wall -Wextra -c avx_fallback.cpp
$ ar rcs libavxfallback.a avx_fallback.o
Compile the other source, assuming real AVX:
$ g++ -Wall -Wextra -DHAVE_REAL_AVX -c avx_or_not.cpp main.cpp
Link a program:
$ g++ -o prog main.o avx_or_not.o -L. -lavxfallback
Run:
$ ./prog
The real void avx_func()
Again compile the other source, this time assuming no real AVX:
$ g++ -Wall -Wextra -c avx_or_not.cpp main.cpp
Relink and rerun:
$ g++ -o prog main.o avx_or_not.o -L. -lavxfallback
$ ./prog
The fallback void avx_func()
At linktime, you do not need to know whether there is anything in the
linkage before -lavxfallback that calls avx_func(), or defines avx_func():
you still know that if it is called, the first definition in the linkage will
be linked, and it will be the one in libavxfallback if there is no earlier one.

Related

dlopen succeeds (or at least seems to) but then dlsym fails to retrieve a symbol from a shared library

In an attempt to understand how lazily loaded dynamic libraries work, I've made up the following (unfortunately non-working) example.
dynamic.hpp - Header of the library
#pragma once
void foo();
dynamic.cpp - Implementation of the library
#include "dynamic.hpp"
#include <iostream>
void foo() {
std::cout << "Hello world, dynamic library speaking" << std::endl;
}
main.cpp - main function that wants to use the library (edited from the snippet in this question)
#include <iostream>
#include <dlfcn.h>
#include "dynamic.hpp"
int main() {
void * lib = dlopen("./libdynamic.so", RTLD_LAZY);
if (!lib) {
std::cerr << "Error (when loading the lib): " << dlerror() << std::endl;
}
dlerror();
auto foo = dlsym(lib, "foo");
auto error = dlerror();
if (error) {
std::cerr << "Error (when loading the symbol `foo`): " << error << std::endl;
}
dlerror();
using Foo = void (*)();
(Foo(foo)());
}
Compilation and linking¹
# compile main.cpp
g++ -g -O0 -c main.cpp
# compile dynamic.cpp into shared library
g++ -fPIC -Wall -g -O0 -pedantic -shared -std=c++20 dynamic.cpp -o libdynamic.so
# link
g++ -Wall -g -pedantic -L. -ldynamic main.o -o main
Run
LD_LIBRARY_PATH='.' ./main
Error
Error (when loading the symbol `foo`): ./libdynamic.so: undefined symbol: foo
Segmentation fault (core dumped)
As far as I can tell, the error above clearly shows that the library is correctly loaded, but it's the retrieval of the symbol which fails for some reason.
(¹) A few options are redundant or, at least, not necessary. I don't think this really affects what's happening, but if you think so, I can try again with the options you suggest.
auto foo = dlsym(lib, "foo");
Perform the following simple thought experiment: in C++ you can have overloaded functions:
void foo();
void foo(int bar);
So, if your shared library has these two functions, which one would you expect to get from a simple "dlsym(lib, "foo")" and why that one, exactly?
If you ponder and wrap your brain around this simple question you will reach the inescapable conclusion that you must be missing something fundamental. And you are: name mangling.
The actual symbol names used for functions in C++ code are "mangled". That is, if you use objdump and/or nm tools to dump the actual symbols in the shared libraries you will see a bunch of convoluted symbols, with "foo" hiding somewhere in the middle of them.
The mangling is used to encode the "signature" of a function: its name and the type of its parameters, so that different overloads of "foo" produce distinct and unique symbol names.
You need to feed the mangled name into dlsym in order to resolve the symbol.
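To make the mangling concrete, here is a small sketch that demangles symbol names with abi::__cxa_demangle from <cxxabi.h>, assuming the Itanium C++ ABI used by GCC and Clang on Linux. The helper name demangle is hypothetical. Under that ABI, void foo() is emitted as _Z3foov and void foo(int) as _Z3fooi, which is why a lookup of plain "foo" finds nothing.

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Turn an Itanium-ABI mangled symbol back into a readable signature.
// Returns an empty string if the input cannot be demangled.
std::string demangle(const char* mangled) {
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result
    // with malloc, so it must be released with free.
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    std::string result = (status == 0 && text != nullptr) ? text : "";
    std::free(text);
    return result;
}
```

For the example in the question this leaves two options: pass the mangled name, as in dlsym(lib, "_Z3foov"), or declare foo as extern "C" void foo(); in dynamic.hpp so that the exported symbol is the unmangled foo. The second is usually easier to maintain, at the cost of giving up overloading for that function.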

Find static library unresolved dependencies before linking executable

So let's say we have static library mylib.a, which contains compiled cpp files.
file1.cpp:
int do_stuff();
int func_unres()
{
int a = do_stuff();
return a;
}
file2.cpp:
int do_other_stuff();
int func_res()
{
int a = do_other_stuff();
return a;
}
file3.cpp:
int do_other_stuff()
{
return 42;
}
So, as we can see here, no file contains definition of do_stuff function.
Library created this way:
g++ -c file1.cpp -o file1.o
g++ -c file2.cpp -o file2.o
g++ -c file3.cpp -o file3.o
ar r mylib.a file1.o file2.o file3.o
Now we try to make some binary with this library. Simple example main file:
#include <iostream>
int func_res();
int main()
{
std::cout << func_res() << std::endl;
}
Compiling:
g++ main.cpp mylib.a -o my_bin
Everything works just fine.
Now consider case of main file like this:
#include <iostream>
int func_unres();
int main()
{
std::cout << func_unres() << std::endl;
}
In this case the binary won't link, because func_unres requires the function do_stuff to be defined.
Is there a way to find out that a static library requires a symbol that no object file in the library defines, before linking it with an executable that uses such a symbol?
EDIT:
The question is not how to simply list such symbols, but how to get linker-style error output,
as if I had linked this library with an executable that uses every symbol it should contain.
As pointed out in the comments and in How to force gcc to link an unused static library, the linker option --whole-archive is enough to pull every object file of the static library into the link, so the linker reports an error for each unresolved symbol. Using the examples from the question, compiling and linking the first main file this way produces a linker error even though that file doesn't refer to the undefined symbol:
g++ main.cpp -Wl,--whole-archive mylib.a -Wl,--no-whole-archive
Linking fails even though main doesn't use func_unres:
mylib.a(file1.o): In function func_unres(): file1.cpp:(.text+0x9):
undefined reference to do_stuff()
The second option, --no-whole-archive, switches this behaviour off again so that the symbols of any remaining libraries are not force-resolved in the same way.

Is it possible to merge weak symbols like vtables/typeinfo across RTLD_LOCAL'ly loaded libraries?

For context: I have a Java project that is partially implemented with two JNI libraries. For the sake of example, libbar.so depends on libfoo.so. If these were system libraries,
System.loadLibrary("bar");
would do the trick. But since they're custom libraries I'm shipping with my JAR, I have to do something like
System.load("/path/to/libfoo.so");
System.load("/path/to/libbar.so");
libfoo needs to go first because otherwise libbar can't find it, as it's not in the system library search path.
This has been working well for a while, but I've now run into an issue where std::any_cast is throwing std::bad_any_cast despite the types being correct. I tracked it down to the fact that both libraries have a different definition of the typeinfo for that type, and they're not being merged at runtime. This seems to be because System.load() ends up invoking dlopen() with RTLD_LOCAL rather than RTLD_GLOBAL.
I wrote this to demonstrate the behaviour without needing JNI:
foo.hpp
class foo { };
extern "C" const void* libfoo_foo_typeinfo();
foo.cpp
#include "foo.hpp"
#include <typeinfo>
extern "C" const void* libfoo_foo_typeinfo()
{
return &typeid(foo);
}
bar.cpp
#include "foo.hpp"
#include <typeinfo>
extern "C" const void* libbar_foo_typeinfo()
{
return &typeid(foo);
}
main.cpp
#include <iostream>
#include <typeinfo>
#include <dlfcn.h>
int main() {
void* libfoo = dlopen("./libfoo.so", RTLD_NOW | RTLD_LOCAL);
void* libbar = dlopen("./libbar.so", RTLD_NOW | RTLD_LOCAL);
auto libfoo_fn = reinterpret_cast<const void* (*)()>(
dlsym(libfoo, "libfoo_foo_typeinfo"));
auto libbar_fn = reinterpret_cast<const void* (*)()>(
dlsym(libbar, "libbar_foo_typeinfo"));
auto libfoo_ti = static_cast<const std::type_info*>(libfoo_fn());
auto libbar_ti = static_cast<const std::type_info*>(libbar_fn());
std::cout << std::boolalpha
<< (libfoo_ti == libbar_ti) << "\n"
<< (*libfoo_ti == *libbar_ti) << "\n";
return 0;
}
Makefile
all: libfoo.so libbar.so main
libfoo.so: foo.cpp
	$(CXX) -fpic -shared -Wl,-soname=$@ $^ -o $@
libbar.so: bar.cpp
	$(CXX) -fpic -shared -Wl,-soname=$@ $^ -L. -lfoo -o $@
main: main.cpp
	$(CXX) $^ -ldl -o $@
On my system, I get
$ make
...
$ ./main
false
true
This is because even though the typeinfo addresses are different, GCC's libstdc++ uses the mangled names for equality. On LLVM's libc++, for example, equality is based on the typeinfo address itself, so I get:
$ make CXX="clang++ -stdlib=libc++"
$ ./main
false
false
If I pass RTLD_GLOBAL instead, I see
true
true
And if I edit main.cpp to load libbar.so first, it also works, provided I tell it where it can find libfoo.so:
$ LD_LIBRARY_PATH=. ./main
true
true
But for the reasons described at the top of this post, neither of these is a practical workaround.
This is very similar to https://github.com/android-ndk/ndk/issues/533 but with non-dynamic types, so there's no way to add a "key function" to force the typeinfo to be a strong symbol. I happened to reproduce the problem on Android first, but it isn't Android-specific.
No, that is not possible. RTLD_LOCAL seeks to prevent exactly that, and unfortunately must be used for System.loadLibrary since otherwise bad things will happen if you System.loadLibrary two libraries that each define different foo classes.
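Since the typeinfo objects cannot be merged, code that controls its own type checks can compare by name instead of by address, which is what the question reports libstdc++ doing. The sketch below uses a hypothetical helper, same_type_by_name. It relies on std::type_info::name(), which is implementation-defined and not guaranteed by the standard to be unique per type, and it does not change how std::any_cast itself compares types.

```cpp
#include <cstring>
#include <typeinfo>

// Compare two type_info objects by their implementation-defined names
// instead of by object identity. Duplicate typeinfo objects for the same
// type in different shared libraries then still compare equal, provided
// both libraries produce the same name string for the type.
bool same_type_by_name(const std::type_info& a, const std::type_info& b) {
    return &a == &b || std::strcmp(a.name(), b.name()) == 0;
}
```

In the demo program from the question, same_type_by_name(*libfoo_ti, *libbar_ti) would be expected to succeed even where *libfoo_ti == *libbar_ti does not, as long as both libraries were built with the same ABI.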

Static Libraries which depend on other static libraries

I have a question about making static libraries that use other static libraries.
I set up an example with 3 files - main.cpp, slib1.cpp and slib2.cpp. slib1.cpp and slib2.cpp are both compiled as individual static libraries (e.g. I end up with slib1.a and slib2.a). main.cpp is compiled into a standard ELF executable linked against both libraries.
There also exists a header file named main.h which prototypes the functions in slib1 and slib2.
main.cpp calls a function called lib2func() from slib2. This function in turn calls lib1func() from slib1.
If I compile the code as is, g++ will return with a linker error stating that it could not find lib1func() in slib1. However, if I make a call to lib1func() BEFORE any calls to any functions in slib2, the code compiles and works correctly.
My question is simply as follows: is it possible to create a static library that depends on another static library? It would seem like a very severe limitation if this were not possible.
The source code for this problem is attached below:
main.h:
#ifndef MAIN_H
#define MAIN_H
int lib1func();
int lib2func();
#endif
slib1.cpp:
#include "main.h"
int lib1func() {
return 1;
}
slib2.cpp:
#include "main.h"
int lib2func() {
return lib1func();
}
main.cpp:
#include <iostream>
#include "main.h"
int main(int argc, char **argv) {
//lib1func(); // Uncomment and compile will succeed. WHY??
std::cout << "Ans: " << lib2func() << std::endl;
return 0;
}
gcc output (with line commented out):
g++ -o src/slib1.o -c src/slib1.cpp
ar rc libslib1.a src/slib1.o
ranlib libslib1.a
g++ -o src/slib2.o -c src/slib2.cpp
ar rc libslib2.a src/slib2.o
ranlib libslib2.a
g++ -o src/main.o -c src/main.cpp
g++ -o main src/main.o -L. -lslib1 -lslib2
./libslib2.a(slib2.o): In function `lib2func()':
slib2.cpp:(.text+0x5): undefined reference to `lib1func()'
collect2: ld returned 1 exit status
gcc output (with line uncommented)
g++ -o src/slib1.o -c src/slib1.cpp
ar rc libslib1.a src/slib1.o
ranlib libslib1.a
g++ -o src/slib2.o -c src/slib2.cpp
ar rc libslib2.a src/slib2.o
ranlib libslib2.a
g++ -o src/main.o -c src/main.cpp
g++ -o main src/main.o -L. -lslib1 -lslib2
$ ./main
Ans: 1
Try g++ -o main src/main.o -L. -Wl,--start-group -lslib1 -lslib2 -Wl,--end-group.
A group defined with --start-group and --end-group makes the linker search the enclosed archives repeatedly until no new undefined references remain, which resolves circular dependencies between libraries.
See also: GCC: what are the --start-group and --end-group command line options?
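The order makes the difference: libslib2.a refers to lib1func(), so -lslib2 has to come before -lslib1 on the command line (g++ -o main src/main.o -L. -lslib2 -lslib1). From the gcc(1) manual page: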
It makes a difference where in the command you write this option; the linker searches and processes libraries and object files in the order they are specified. Thus, foo.o -lz bar.o searches library z after file foo.o but before bar.o. If bar.o refers to functions in z, those functions may not be loaded.

Compile shared object library, which call function from so too

I have got a f2.cpp file
// f2.cpp
#include <iostream>
void f2()
{
std::cout << "It's a call of f2 function" << std::endl;
}
I use cygwin with crosstool compiler gcc.
g++ -fPIC -c f2.cpp
g++ -shared -o libf2.so f2.o
I have got a libf2.so file. Now I want to call f2 function in f1 library (shared object too) libf1.so.
This is f1.cpp, from which I want to build libf1.so:
// f1.cpp
#include <iostream>
void f1()
{
std::cout << "f1 function is calling f2()..." << std::endl;
f2();
}
How must I compile f1.cpp? I don't want to use dlclose, dlerror, dlopen, dlsym...
Finally, I want to use libf1.so from main.cpp as a shared object library too, again without using dlclose, dlerror, dlopen, dlsym. How must I compile main.cpp once I have libf1.so?
// main.cpp
#include <iostream>
int main()
{
f1();
return 0;
}
Declare f2() in a header file and compile libf1.so the same way as libf2.so.
Now compile main, linking against f1 and f2.
It should look something like this, with the libraries listed after the object file that uses them:
g++ main.o -L /path/to/libs -lf1 -lf2
You can simply link them together (if f2 is compiled into libf2.so, you pass -lf2 to the linker). The linker will take care of connecting calls from f1 to f2. Naturally, at runtime f1 will expect to find f2 in the SO load path and the dynamic loader will load it.
Here's a more complete sample, taken from a portion of a Makefile I found lying around. Here, mylib stands for your f2, and main_linked is f1:
mylib: mylib.c mylib.h
	gcc $(CFLAGS) -fpic -c mylib.c
	gcc -shared -o libmylib.so mylib.o
main_linked: main_linked.c mylib.h mylib.c
	gcc $(CFLAGS) main_linked.c -L. -lmylib -o main_linked
Note:
mylib is compiled into a shared library with -shared
main_linked is then built with a single gcc call passing -lmylib to specify the library to link and -L. to say where to find it (in this case - current dir)
Check the -L and -l flags to g++.