I am getting very confused trying to build a simple C++ library using Android NDK 23 (23.1.7779620). I am using CMake and this is a very simple program:
# CMakeLists.txt
cmake_minimum_required(VERSION 3.14)
project(mf)
add_library(mf lib.cpp)
// lib.hpp
#pragma once
#include <string>
std::string foo(std::string);
// lib.cpp
#include "lib.hpp"
std::string foo(std::string str) {
return std::string{"test"} + str;
}
This is the command line to build:
cmake -G Ninja -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DANDROID_STL=c++_shared -DANDROID_ABI=arm64-v8a -DANDROID_PLATFORM=android-29 -DCMAKE_TOOLCHAIN_FILE=${ANDROID_NDK}/build/cmake/android.toolchain.cmake ..
cmake --build . -v
The first problem is that I was expecting to link against libc++.so and not libc++_shared.so. What is the difference between them? I read this article, but it still does not explain the difference between libc++ and libc++_shared.
The second problem is even worse: it seems I am using libstdc++!
The third point: I thought that clang's C++ standard library implementation lived under the namespace std::__1, but I cannot find anything like that.
I know libc++_shared is used because of this command:
$ readelf -d libmf.so
Dynamic section at offset 0x76e0 contains 26 entries:
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [libm.so]
0x0000000000000001 (NEEDED) Shared library: [libc++_shared.so]
0x0000000000000001 (NEEDED) Shared library: [libdl.so]
0x0000000000000001 (NEEDED) Shared library: [libc.so]
0x000000000000000e (SONAME) Library soname: [libmf.so]
Running nm, it seems I am using symbols from libstdc++:
$ nm -gDC libmf.so | grep '__ndk1'
0000000000003af0 T foo(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> >)
U std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> >::append(char const*, unsigned long)
$ nm -gDC libmf.so | grep '__1'
$
Update
The difference between libc++.so and libc++_shared.so is explained in this post.
By passing -DANDROID_STL=c++_shared to the CMake invocation you explicitly asked for the shared runtime as opposed to the default runtime.
As explained in the documentation, the rules are simple:
If all your native code is in a single library, use the static libc++ (the default) so that unused code can be removed and you have the smallest possible application package.
As soon as you include an extra library – either because you include a precompiled library from somewhere else or you include an Android AAR file that happens to include native code – you must switch to the shared runtime.
The rationale for the rules is simple: the C++ runtime has certain global data structures that must be initialized once and must only exist once in memory. If you were accidentally to load two libraries that both link the C++ runtime statically, you have (for instance) two conflicting memory allocators.
This will result in crashes when you free or delete memory allocated by the other library, or if you pass a C++ STL object like std::string across library boundaries.
For completeness, in older NDKs libstdc++ (the GNU C++ runtime) was also included in the NDK, but as of NDK r18 that is no longer the case.
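On the third point in the question: as far as I know, the NDK builds its libc++ with __ndk1 as the inline ABI namespace where upstream libc++ uses __1, so std::__ndk1::basic_string in the nm output would point to libc++ rather than libstdc++. The sketch below only illustrates the mechanism, with made-up names (demo, abi_v1, widget and both functions are hypothetical): an inline namespace never appears in the source that uses the type, yet it is embedded in the mangled symbol name. It assumes a compiler that follows the Itanium C++ ABI (GCC or Clang) with RTTI enabled.

```cpp
#include <string>
#include <typeinfo>

namespace demo {
// Hypothetical stand-in for a standard library's versioned inline namespace.
inline namespace abi_v1 {
struct widget {};
}  // namespace abi_v1
}  // namespace demo

// Returns the implementation-defined type name (the mangled name on GCC/Clang).
std::string mangled_widget_name() {
    // The source refers to demo::widget; the inline namespace stays implicit.
    return typeid(demo::widget).name();
}

// True when the inline namespace name is embedded in the mangled name.
bool mangled_name_mentions_inline_namespace() {
    return mangled_widget_name().find("abi_v1") != std::string::npos;
}
```

The same effect is what makes grep '__ndk1' match and grep '__1' come back empty in the question.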
Related
I have an executable that uses some shared objects.
These shared objects have other shared objects as dependencies, and I want to set my main executable's rpath to include the directories for those dependencies, since runpath is not used for indirect dependencies.
I'm trying to bake rpath into my ELF, however when using this:
gcc -std=c++20 -o main main.cpp -lstdc++ -L./lib -Wl,-rpath,./lib
The result is that ./lib is set in the ELF as RUNPATH and not RPATH:
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [libstdc++.so.6]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
0x000000000000001d (RUNPATH) Library runpath: [./lib]
0x000000000000000c (INIT) 0x1000
...
Can someone explain why this happens? I was expecting ./lib to be defined in the RPATH section and not RUNPATH. Seems like RPATH section does not exist at all.
I am using gcc version 11.1.0, with ld version 2.34.
I know this might not be the best solution for managing indirect dependencies and I'd be happy to hear a better one, however I still wonder why -Wl,-rpath,./lib results in RUNPATH defined in the ELF, and not RPATH.
To get a DT_RUNPATH entry you need --enable-new-dtags.
To get a DT_RPATH entry (which is deprecated) you need --disable-new-dtags.
In your case, something like this:
gcc -std=c++20 -o main main.cpp -lstdc++ -L./lib -Wl,--disable-new-dtags,-rpath,./lib
I'd suggest using an absolute path with rpath; as far as I know, a relative path is resolved against the process's current working directory rather than the executable's location. There is also $ORIGIN if you want to use the executable as the reference point.
See https://man7.org/linux/man-pages/man1/ld.1.html and https://man7.org/linux/man-pages/man8/ld.so.8.html for more information.
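To make the $ORIGIN suggestion concrete, here is a minimal sketch of the substitution the dynamic loader performs, assuming the simple case where $ORIGIN (or ${ORIGIN}) stands for the directory containing the executable. The function names directory_of and expand_origin are hypothetical, and the real loader handles more tokens and edge cases than this.

```cpp
#include <string>

// Directory part of a path, e.g. "/opt/app/bin/main" -> "/opt/app/bin".
std::string directory_of(const std::string& path) {
    const std::string::size_type slash = path.rfind('/');
    if (slash == std::string::npos) return ".";  // no directory component
    if (slash == 0) return "/";                   // file directly under the root
    return path.substr(0, slash);
}

// Replaces every "${ORIGIN}" and "$ORIGIN" in one rpath entry with the
// directory of the executable (simplified model of the loader's behaviour).
std::string expand_origin(std::string entry, const std::string& exe_path) {
    const std::string origin = directory_of(exe_path);
    const char* tokens[] = {"${ORIGIN}", "$ORIGIN"};
    for (const char* raw : tokens) {
        const std::string token(raw);
        std::string::size_type pos = 0;
        while ((pos = entry.find(token, pos)) != std::string::npos) {
            entry.replace(pos, token.size(), origin);
            pos += origin.size();  // continue after the inserted directory
        }
    }
    return entry;
}
```

For example, expand_origin("$ORIGIN/lib", "/opt/app/bin/main") yields "/opt/app/bin/lib". On the command line the entry is usually quoted, as in -Wl,-rpath,'$ORIGIN/lib', so that the shell does not expand it.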
There is a library which is linked with -lyaml, but libyaml.so is not listed as a dependency by ldd. The build succeeds using the autoconf toolchain.
$ nm libxxxx.so | grep -i yaml
U yaml_document_delete
U yaml_document_get_node
U yaml_parser_delete
U yaml_parser_initialize
U yaml_parser_load
U yaml_parser_set_input_file
$ readelf -d libxxxx.so
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
0x000000000000000e (SONAME) Library soname: [libxxxx.so.0]
There is another shared library which depends upon libxxxx.so.
$ ldd lib/libxxxx1.so
libzmq.so.5 => /usr/lib/x86_64-linux-gnu/libzmq.so.5 (0x00007fd45e072000)
libxmaapi.so.0 =>
When I link my executable with libxxxx1.so, I get undefined symbol errors. The question is: how do I link against a library that is not found in the dependency tree?
This question provides approaches to ignore the problem.
Linking with dynamic library with dependencies
The one approach I found is disabling the linker's as-needed optimization with the gcc flag -Wl,--no-as-needed. Since I am already linking with -lyaml, the symbols get resolved. It works, but it is not efficient.
I'm testing a somewhat non-conventional project layout and rake as make utility. There is a rule to compile binaries from source files in different directories and link them with a shared library. This rule is run from the root directory of the project. For instance the rule does this:
clang -I libs/ -o tests/sourcefile2 tests/sourcefile2.c shared_libs/libFoo.so
And as a result I get the full path shared_libs/libFoo.so in the binary:
readelf -d tests/sourcefile2
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [shared_libs/libFoo.so]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
...
I would like to change it to just 'libFoo.so' like this:
readelf -d tests/sourcefile2
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [libFoo.so]
...
Then I could set RPATH for dynamic linker as I want and it would give some flexibility. But I cannot find the corresponding option or similar example. Could you suggest how to handle this? Should I just use a temporary directory for the build, copy everything and compile there?
Not sure if this will help you, but when I need to compile something and don't know which flags to use, I use pkg-config.
For example, to compile a program which uses Xlib
pkg-config --cflags --libs x11
and the output is the following
-I/usr/X11R7/include -D_REENTRANT -Wl,-rpath,/usr/X11R7/lib -L/usr/X11R7/lib -lX11
Note that this varies between systems; for example, NetBSD forces me to link with an rpath, and some of the arguments in this output are optional.
So I copy the output of pkg-config and it compiles.
If you use 'ld' as your linker you should be able to use "-Wl,-soname ".
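A likely reason for the path showing up in NEEDED is that libFoo.so was built without a SONAME, in which case GNU ld records the file name it was given. The sketch below is a simplified model of that rule as I understand it from the ld documentation, not the linker's actual code; needed_entry and its parameters are hypothetical names.

```cpp
#include <string>

// Simplified model of the name GNU ld records in DT_NEEDED for a shared library.
//   soname     - the library's DT_SONAME, or "" if it was built without one
//   as_given   - the library as it reached the linker, e.g. "shared_libs/libFoo.so"
//   via_dash_l - true if the library was located through -L<dir> -l<name>
std::string needed_entry(const std::string& soname,
                         const std::string& as_given,
                         bool via_dash_l) {
    if (!soname.empty()) return soname;  // a SONAME always wins
    if (via_dash_l) {
        // Found through the search path: only the file name is recorded.
        const std::string::size_type slash = as_given.rfind('/');
        return slash == std::string::npos ? as_given : as_given.substr(slash + 1);
    }
    return as_given;  // explicit path and no SONAME: the path is kept verbatim
}
```

Under this model, either rebuilding libFoo.so with -Wl,-soname,libFoo.so or linking with -Lshared_libs -lFoo instead of the explicit path would give the plain libFoo.so entry.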
I created a cpp project, which used a lib file named: libblpapi3_64.so
This file comes from a library which I downloaded from the Internet.
My project runs without any error, so I pushed it to Bitbucket.
Then my colleague downloaded it and tried to run it on his own computer, but he gets an error:
/usr/bin/ld: cannot find -lblpapi3_64.
In fact, I have copied it into my project repository. I mean I created a directory named lib under my project, and all the lib files that I use are in it.
There are also other lib files such as liblog4cpp.a, but they are all fine. Only libblpapi3_64.so gets the error.
Is it because it's a .so file and not a .a file? Or is there some other reason?
Btw, the file name of libblpapi3_64.so is shown in green and the other files (.a) in white. I think it's not a link file, it's the original file.
Briefly:
ld does not know where your project libs are located. You have to place them in one of ld's known directories or specify the directory containing your library with the -L parameter to the linker.
To be able to build your program, you need to have the library in ld's search paths, and so does your colleague. Why? See the detailed answer.
Detailed:
At first, we should understand what tools do what:
The compiler produces simple object files with unresolved symbols (it does not care much about symbols during its run).
The linker combines a number of object and archive files, relocates their data and ties up symbol references into a single file: an executable or a library.
Let's start with some example. For example, you have a project which consists of 3 files: main.c, func.h and func.c.
main.c
#include "func.h"
int main() {
func();
return 0;
}
func.h
void func();
func.c
#include "func.h"
void func() { }
So, when you compile your source code (main.c) into an object file (main.o) it can't be run yet because it has unresolved symbols. Let's start from the beginning of producing an executable workflow (without details):
The preprocessor after its job produces the following main.c.preprocessed:
void func();
int main() {
func();
return 0;
}
and the following func.c.preprocessed:
void func();
void func() { }
As you can see in main.c.preprocessed, there is no connection to your func.c file or to the implementation of void func(); the compiler simply does not know about it, because it compiles all source files separately. So, to build this project you have to compile both source files with something like cc -c main.c -o main.o and cc -c func.c -o func.o, which produces two object files, main.o and func.o. func.o has all its symbols resolved because it has only one function, whose body is written right inside func.c, but main.o does not have the func symbol resolved yet because it does not know where it is implemented.
Let's look what is inside func.o:
$ nm func.o
0000000000000000 T func
Simply put, it contains one symbol located in the text (code) section, which is our func function.
And let's look inside main.o:
$ nm main.o
U func
0000000000000000 T main
Our main.o has an implemented and resolved function main, and we can see it in the object file. But we also see the func symbol, which is marked as unresolved (U), and thus we cannot see its address offset.
To fix that problem, we have to use the linker. It will take all the object files and resolve all these symbols (void func(); in our example). If the linker is somehow unable to do that, it reports an error like unresolved external symbol: void func(). This may happen if you don't give the func.o object file to the linker. So, let's give all the object files we have to the linker:
ld main.o func.o -o test
The linker will go through main.o, then through func.o, try to resolve the symbols and, if all goes well, write its output to the test file. If we look at the produced output we will see that all symbols are resolved:
$ nm test
0000000000601000 R __bss_start
0000000000601000 R _edata
0000000000601000 R _end
00000000004000b0 T func
00000000004000b7 T main
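The resolution step described above can be modelled in a few lines. This is only a toy sketch of the bookkeeping (the real linker also relocates code and data); ObjectFile and unresolved_symbols are hypothetical names, and the sets stand for the T and U entries that nm prints.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy description of an object file, mirroring what nm reports.
struct ObjectFile {
    std::string name;
    std::set<std::string> defined;    // symbols nm shows as T
    std::set<std::string> undefined;  // symbols nm shows as U
};

// Returns the symbols that stay unresolved once all inputs are combined.
std::set<std::string> unresolved_symbols(const std::vector<ObjectFile>& objects) {
    // First collect every definition offered by any input.
    std::set<std::string> defined;
    for (const ObjectFile& obj : objects)
        defined.insert(obj.defined.begin(), obj.defined.end());

    // Then keep the references that no input defines.
    std::set<std::string> missing;
    for (const ObjectFile& obj : objects)
        for (const std::string& sym : obj.undefined)
            if (defined.count(sym) == 0) missing.insert(sym);
    return missing;
}
```

With only main.o as input the result contains func, which corresponds to the linker error; with main.o and func.o together the result is empty and the link succeeds.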
Here our job is done. Now let's look at the situation with dynamic (shared) libraries. Let's make a shared library from our func.c source file:
gcc -c -fPIC func.c -o func.o
gcc -shared -fPIC -Wl,-soname,libfunc.so.1 -o libfunc.so.1.5.0 func.o
Voila, we have it. Now, let's put it into a known dynamic linker library path, /usr/lib/:
sudo mv libfunc.so.1.5.0 /usr/lib/ # to make the program able to run
sudo ln -s libfunc.so.1.5.0 /usr/lib/libfunc.so.1 # creating a symlink for the program to run
sudo ln -s libfunc.so.1 /usr/lib/libfunc.so # to make compilation possible
And let's make our project depend on that shared library by leaving func() symbol unresolved after compilation and static linkage process, creating an executable and linking it (dynamically) to our shared library (libfunc):
cc main.c -lfunc
Now if we look for the symbol in its symbols table we still have our symbol unresolved:
$ nm a.out | grep fun
U func
But this is not a problem anymore, because the func symbol will be resolved by the dynamic loader before each program start. Okay, now let's get back to the theory.
Static libraries are, in fact, just object files placed into a single archive with the ar tool, together with a single symbol table created by the ranlib tool.
The compiler, when producing object files, does not resolve symbols. These symbols are replaced with addresses by a linker. So resolving symbols is done by two things, the linker and the dynamic loader:
The linker: ld, does 2 jobs:
a) For static libs or simple object files, this linker changes external symbols in the object files to the addresses of the real entities. For example, if we use C++ name mangling linker will change _ZNK3MapI10StringName3RefI8GDScriptE10ComparatorIS0_E16DefaultAllocatorE3hasERKS0_ to 0x07f4123f0.
b) For dynamic libs it only checks whether the symbols can be resolved at all (that is, whether you are linking with the correct library) but does not replace the symbols with addresses. If the symbols can't be resolved (for example, they are not implemented in the shared library you are linking to), it reports an undefined reference to error and aborts the build, because you are trying to use these symbols but the linker can't find them in the object files it is processing. Otherwise, the linker adds some information to the ELF executable:
i. .interp section - request for an interpreter - dynamic loader to be called before executing, so this section just contains a path to the dynamic loader. If you look at your executable which depends on shared library (libfunc) for example you will see the interp section $ readelf -l a.out:
INTERP 0x0000000000000238 0x0000000000400238 0x0000000000400238
0x000000000000001c 0x000000000000001c R 1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
ii. .dynamic section - a list of shared libraries which interpreter will be looking for before executing. You may see them by ldd or readelf:
$ ldd a.out
linux-vdso.so.1 => (0x00007ffd577dc000)
libfunc.so.1 => /usr/lib/libfunc.so.1 (0x00007fc629eca000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fefe148a000)
/lib64/ld-linux-x86-64.so.2 (0x000055747925e000)
$ readelf -d a.out
Dynamic section at offset 0xe18 contains 25 entries:
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [libfunc.so.1]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
Note that ldd also locates each library in your filesystem, while readelf only shows which libraries your program needs. All of these libraries will be searched for by the dynamic loader (next paragraph).
The linker works at build time.
Dynamic loader: ld.so or ld-linux. It finds and loads all the shared libraries needed by a program (if they were not loaded before), resolves the symbols by replacing them with real addresses right before the program starts, prepares the program to run, and then runs it. It works after the build and before the program runs. In short, dynamic linking means resolving the symbols in your executable before each program start.
In fact, when you run an ELF executable that has an .interp section (that is, one that needs shared libraries), the OS (Linux) first runs the interpreter rather than your program. Without that step the symbols in your program would have no addresses assigned, and the program would be unable to work properly.
You may also run the dynamic loader yourself, but it is unnecessary (the binary is /lib/ld-linux.so.2 for 32-bit ELF files and /lib64/ld-linux-x86-64.so.2 for 64-bit ones).
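To see the loader's result from inside a running process, one option is dl_iterate_phdr from <link.h>, which walks the objects that were actually mapped. This is a minimal sketch assuming Linux with glibc (musl and bionic provide the same call as far as I know); collect_object and loaded_objects are hypothetical helper names.

```cpp
#include <link.h>  // dl_iterate_phdr, struct dl_phdr_info

#include <cstddef>
#include <string>
#include <vector>

// Callback invoked once per loaded object; appends its name to the vector.
static int collect_object(struct dl_phdr_info* info, std::size_t /*size*/, void* data) {
    std::vector<std::string>* names = static_cast<std::vector<std::string>*>(data);
    const bool has_name = info->dlpi_name != nullptr && info->dlpi_name[0] != '\0';
    // The main program is usually reported with an empty name.
    names->push_back(has_name ? info->dlpi_name : "(main program)");
    return 0;  // returning 0 tells dl_iterate_phdr to keep iterating
}

// Names of all objects the dynamic loader has mapped into this process.
std::vector<std::string> loaded_objects() {
    std::vector<std::string> names;
    dl_iterate_phdr(collect_object, &names);
    return names;
}
```

Printing the returned names from a program linked against libfunc should show roughly the same set of libraries that ldd reports for it, since both reflect the loader's work rather than just the NEEDED entries.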
Why does the linker complain that /usr/bin/ld: cannot find -lblpapi3_64 in your case? Because it tries to find all the libraries in its known paths. Why does it search for the library if it will be loaded at runtime? Because it needs to check that all the needed symbols can be resolved by this library, and to put its name into the .dynamic section for the dynamic loader. Actually, the .interp section exists in almost every C/C++ ELF executable because the libc and libstdc++ libraries are both shared, and the compiler by default links any project dynamically to them. You may link them statically as well, but this will enlarge the total executable size. So, if the shared library can't be found, your symbols remain unresolved and the application could not run, which is why the linker refuses to produce an executable. You may get the list of directories where libraries are usually searched by:
Passing a command to the linker in compiler arguments.
By parsing ld --verbose's output.
By parsing ldconfig's output.
Some of these methods are explained here.
Dynamic loader tries to find all the libraries by using:
DT_RPATH dynamic section attribute of the ELF file, used only if no DT_RUNPATH attribute exists.
LD_LIBRARY_PATH environment variable (ignored in secure-execution mode).
DT_RUNPATH dynamic section attribute of the executable.
/etc/ld.so.cache - own cache file which contains a compiled list of candidate libraries previously found in the augmented library path.
Default paths: /lib, and then /usr/lib. If the binary was linked with the -z nodeflib linker option, this step is skipped.
ld-linux search algorithm
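The lookup order can be written down as a small function. This sketch follows my reading of the ld.so(8) man page: DT_RPATH is consulted first but only when no DT_RUNPATH exists, then LD_LIBRARY_PATH, then DT_RUNPATH, then the cache, then the default directories. SearchInputs and search_order are hypothetical names, the cache is represented by a placeholder string, and secure-execution mode is not modelled.

```cpp
#include <string>
#include <vector>

// Inputs that influence where the dynamic loader looks for a library.
struct SearchInputs {
    std::vector<std::string> rpath;            // DT_RPATH entries
    std::vector<std::string> runpath;          // DT_RUNPATH entries
    std::vector<std::string> ld_library_path;  // LD_LIBRARY_PATH entries
    bool nodeflib = false;                     // linked with -z nodeflib
};

// Ordered list of places searched, per the ld.so(8) description.
std::vector<std::string> search_order(const SearchInputs& in) {
    std::vector<std::string> order;
    auto append = [&order](const std::vector<std::string>& dirs) {
        order.insert(order.end(), dirs.begin(), dirs.end());
    };
    if (in.runpath.empty()) append(in.rpath);  // DT_RPATH is ignored if DT_RUNPATH exists
    append(in.ld_library_path);
    append(in.runpath);
    order.push_back("<ld.so.cache>");          // placeholder for the cache lookup
    if (!in.nodeflib) {                        // -z nodeflib skips the default paths
        order.push_back("/lib");
        order.push_back("/usr/lib");
    }
    return order;
}
```

The practical consequence is the one raised in the rpath question earlier in this thread: LD_LIBRARY_PATH can override a RUNPATH but not an RPATH.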
Also, please note that shared libraries are usually not named just .so but follow the .so.version format. When you build your application the linker looks for the .so file (which is usually a symlink to .so.version), but when you run your application the dynamic loader looks for the .so.version file instead. For example, let's say we have a library test whose version is 1.1.1 according to semver. In the filesystem it will look like:
/usr/lib/libtest.so -> /usr/lib/libtest.so.1.1.1
/usr/lib/libtest.so.1 -> /usr/lib/libtest.so.1.1.1
/usr/lib/libtest.so.1.1 -> /usr/lib/libtest.so.1.1.1
/usr/lib/libtest.so.1.1.1
So, to be able to link you need the libtest.so file in addition to the versioned files (libtest.so.1, libtest.so.1.1 and libtest.so.1.1.1), but to run your app you only need the three versioned library files. This also explains why Debian or rpm packages ship devel packages separately: the normal package (containing only the files needed to run already compiled applications) has the three versioned library files, and the devel package has only the symlink that makes it possible to compile the project.
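The naming convention above is mechanical enough to generate. The sketch below derives the conventional symlink names from the fully versioned file name by dropping one version component at a time; symlink_names is a hypothetical helper, and it assumes the name contains ".so." followed by dot-separated version numbers.

```cpp
#include <string>
#include <vector>

// For "libtest.so.1.1.1" returns, in order:
//   "libtest.so.1.1", "libtest.so.1", "libtest.so"
std::vector<std::string> symlink_names(const std::string& real_name) {
    std::vector<std::string> names;
    const std::string::size_type marker = real_name.find(".so.");
    if (marker == std::string::npos) return names;  // not a versioned name
    const std::string::size_type stem_end = marker + 3;  // index just past ".so"

    std::string current = real_name;
    while (current.size() > stem_end) {
        // Drop the last ".N" component; the loop stops once only "lib<name>.so" is left.
        current = current.substr(0, current.rfind('.'));
        names.push_back(current);
    }
    return names;
}
```

Of these, libtest.so.1 is conventionally the name recorded as the SONAME and looked up at run time, while the unversioned libtest.so is the one the linker needs at build time.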
Summary
After all of that:
You, your colleague and EACH user of your application code must have all the libraries in their system linker paths to be able to build your application. Otherwise, they have to change the Makefile (or the compile command) to add the shared library's directory with -L<somePathToTheSharedLibrary> as an argument.
After a successful build you also need the library again to be able to run the program. The library will be searched for by the dynamic loader (ld-linux), so it needs to be in its paths (see above) or in the system linker paths. Most Linux program distributions, for example games from Steam, ship a shell script which sets the LD_LIBRARY_PATH variable to point to all the shared libraries needed by the game.
You could look at our Rblpapi package which uses this very library too.
Your basic question of "how do I make a library visible" really has two answers:
Use ld.so. The easiest way is to copy libblpapi3_64.so to /usr/local/lib. If you then call ldconfig to update the cache you should be all set. You can test this via ldconfig -p | grep blpapi which should show it.
Use an rpath instruction when building your application; this basically encodes the path and makes you independent of ld.so.
Apart from a longer compile time, is there any downside to linking against an unused library?
For example, is there any difference in the executable of a program that is compiled in one of these two ways:
g++ -o main main.cpp
g++ -o main main.cpp -llib1 -llib2 -llib3 -lmore
*no library files were actually needed to build main.
I believe it makes no difference because the file sizes are the same, but I'm asking for confirmation.
It depends.
If liblib1.a, liblib2.a, and liblib3.a are static libraries, and no symbols are used from them, then there will be no difference.
If liblib1.so, liblib2.so, or liblib3.so are shared libraries, then they will be loaded at runtime whether or not they are used. You can use the linker flag --as-needed to change this behavior, and this flag is recommended.
To check which shared libraries your binary directly loads at runtime, on an ELF system you can use readelf.
$ cat main.c
int main()
{
return 0;
}
$ gcc main.c
$ readelf -d a.out | grep NEEDED
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
$ gcc -lpng main.c
$ readelf -d a.out | grep NEEDED
0x0000000000000001 (NEEDED) Shared library: [libpng12.so.0]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
You can see that on my system, -lpng links against libpng12.so.0, whether or not symbols from it are actually used. The --as-needed linker flag fixes this:
$ gcc -Wl,--as-needed -lpng main.c
$ readelf -d a.out | grep NEEDED
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
Notes
The --as-needed flag must be specified before the libraries. It only affects libraries which appear after it. So gcc -lpng -Wl,--as-needed doesn't work.
The ldd command lists not only the libraries your binary directly links against, but also all the indirect dependencies. This can change depending on how those libraries were compiled. Only readelf will show you your direct dependencies, and only ldd will show you indirect dependencies.
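The ordering rule from the first note can be modelled as a single pass over the link line. This is a simplified sketch, not the linker's logic: it only tracks the --as-needed / --no-as-needed toggles and -l<name> arguments (with any -Wl, prefix already stripped), and it takes the set of libraries whose symbols are really referenced as an input. needed_libs and its parameters are hypothetical names; libc, which the compiler driver adds itself, is left out.

```cpp
#include <set>
#include <string>
#include <vector>

// Returns the libraries that would end up as NEEDED entries.
//   args - linker arguments in command-line order
//   used - libraries from which at least one symbol is actually referenced
std::vector<std::string> needed_libs(const std::vector<std::string>& args,
                                     const std::set<std::string>& used) {
    std::vector<std::string> needed;
    bool as_needed = false;  // GNU ld's own default is --no-as-needed
    for (const std::string& arg : args) {
        if (arg == "--as-needed") {
            as_needed = true;   // affects only the libraries that follow
        } else if (arg == "--no-as-needed") {
            as_needed = false;
        } else if (arg.compare(0, 2, "-l") == 0) {
            const std::string lib = arg.substr(2);
            if (!as_needed || used.count(lib) != 0) needed.push_back(lib);
        }
    }
    return needed;
}
```

With {"-lpng", "--as-needed"} and no used libraries the result still contains png, while {"--as-needed", "-lpng"} gives an empty list, matching the readelf output above. Some distributions configure their compiler driver to pass --as-needed by default, so the starting value is an assumption about a plain GNU ld setup.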
It depends on whether you are linking static libraries or shared libraries. If you are linking static libraries, the executable size increases with each library whose code is actually pulled in. Linking to shared libraries does not increase the executable size greatly; only references to the library symbols are added.
Absolutely yes. The downside is that others (or you in the future) will assume the libraries are needed for some reason. Most people will not take the time to pare down a program's dependencies, and so the list of them grows and grows.
The cost has nothing to do with the compiled code, but everything to do with maintaining and porting programs.
There are some really good answers above. A further note would be "what difference does it really make". Already mentioned is the cost of maintenance (e.g. problems when someone installs a fresh operating system which doesn't have lib3, so the user has to go find lib3 somewhere and install it, and because lib3 also needs lib17, which also isn't installed, it adds more work for the user).
But also, when you load the binary, if you have linked against shared libraries that aren't actually used, the system will still go look for those libraries, and refuse to load if they are not present - this adds load time and installation nightmares.
Once the code is loaded, it should have no additional runtime penalty.
Having said that, there are sometimes arguments for linking against unused libraries. Say your code has an option USE_FOO, where the FOO feature is only included based on some arbitrary choice when building (e.g. "is this on Linux kernel > 3.0" or "does the system have a fancy graphics card"), and FOO uses lib1 to do its business. It can make the build system (makefile or similar) a little simpler to always link against lib1, even if you don't actually need it when USE_FOO is not set.
But in general, don't link against libraries not needed. It causes more dependencies, and that's never a good thing.