Static locals in methods and ELF OS ABI

Static locals in methods and ELF OS ABI - c++

Given the program
$ cat main.cpp
#ifndef WITH_LOCAL_STATIC
static int y = 0;
#endif
class X {
public:
void foo() {
#ifdef WITH_LOCAL_STATIC
static int y = 0;
#endif
++y;
}
};
int main() {
X().foo();
return 0;
}
compiled in two different ways:
$ g++ main.cpp -o global
$ g++ main.cpp -DWITH_LOCAL_STATIC -o local
I get two different binary formats:
$ file local
local: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, BuildID[sha1]=0x8d8e998601e44115b5fa6c0e71f3ed97cb13a0bd, not stripped
$ file global
global: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, BuildID[sha1]=0x3481ba7c6969ed9d1bd1c8ce0f052d574023d488, not stripped
Can somebody explain why I get ELFOSABI_LINUX in one case, but ELFOSABI_NONE in the other? Compiler is gcc 4.7.2
The background is that in my environment, the loader rejects executables that are not ELFOSABI_NONE.

The GNU/Linux format superseded the System V object file format, and your gcc believes that the minimum OS/ABI where your executable could be run is GNU/Linux.
That usually happens when your program has symbols whose type is STT_GNU_IFUNC (a GNU extensions that denotes an indirect function), and these symbols are typically coming from glibc. When you introduced the local static variable gcc added (more) code to the translation unit to handle its initialisation and destruction (along the lines of _ZZZ__static_initialization_and_destruction_iii), and this is where the relevant parts of glibc likely came into play.
First of all, your best bet is to follow the advice in this question: How to avoid STT_GNU_IFUNC symbols in your binary?
Second, I have to say that on my box, both the old gcc 4.4 and the new clang 3.4 generate the global and local binaries as SYSV standard ELFs, so either your test case is missing more of the relevant bits and pieces, or you are possibly using a custom configured and built gcc, custom linker, or a non-standard glibc.
More avenues that you can pursue to investigate how did you end up with these GNU indirect functions in your binary:
Run nm on your binary and identify the indirect symbols, they should have the i type. (See below.)
Generate a link map and/or an assembly output and trace these indirect symbols to the specifics of your code.
Additionally check if locating glibc with ldd -v $(type -p gcc) points you to a non-standard libc
i - For PE format files this indicates that the symbol is in a section specific to the implementation of DLLs. For ELF format files this indicates that the symbol is an indirect function. This is a GNU extension to the standard set of ELF symbol types. It indicates a symbol > which if referenced by a relocation does not evaluate to its address, but instead must be invoked at runtime. The runtime execution will then return the value to be used in the relocation.
https://sourceware.org/binutils/docs/binutils/nm.html

Related

What does DT_TEXTREL mean and how to solve? [duplicate]

64 bit Linux uses the small memory model by default, which puts all code and static data below the 2GB address limit. This makes sure that you can use 32-bit absolute addresses. Older versions of gcc use 32-bit absolute addresses for static arrays in order to save an extra instruction for relative address calculation. However, this no longer works. If I try to make a 32-bit absolute address in assembly, I get the linker error:
"relocation R_X86_64_32S against `.data' can not be used when making a shared object; recompile with -fPIC".
This error message is misleading, of course, because I am not making a shared object and -fPIC doesn't help.
What I have found out so far is this: gcc version 4.8.5 uses 32-bit absolute addresses for static arrays, gcc version 6.3.0 doesn't. version 5 probably doesn't either. The linker in binutils 2.24 allows 32-bit absolute addresses, verson 2.28 does not.
The consequence of this change is that old libraries have to be recompiled and legacy assembly code is broken.
Now I want to ask: When was this change made? Is it documented somewhere? And is there a linker option that makes it accept 32-bit absolute addresses?

Your distro configured gcc with --enable-default-pie, so it's making position-independent executables by default, (allowing for ASLR of the executable as well as libraries). Most distros are doing that, these days.
You actually are making a shared object: PIE executables are sort of a hack using a shared object with an entry-point. The dynamic linker already supported this, and ASLR is nice for security, so this was the easiest way to implement ASLR for executables.
32-bit absolute relocation aren't allowed in an ELF shared object; that would stop them from being loaded outside the low 2GiB (for sign-extended 32-bit addresses). 64-bit absolute addresses are allowed, but generally you only want that for jump tables or other static data, not as part of instructions.1
The recompile with -fPIC part of the error message is bogus for hand-written asm; it's written for the case of people compiling with gcc -c and then trying to link with gcc -shared -o foo.so *.o, with a gcc where -fPIE is not the default. The error message should probably change because many people are running into this error when linking hand-written asm.
How to use RIP-relative addressing: basics
Always use RIP-relative addressing for simple cases where there's no downside. See also footnote 1 below and this answer for syntax. Only consider using absolute addressing when it's actually helpful for code-size instead of harmful. e.g. NASM default rel at the top of your file.
AT&T foo(%rip) or in GAS .intel_syntax noprefix use [rip + foo].
Disable PIE mode to make 32-bit absolute addressing work
Use gcc -fno-pie -no-pie to override this back to the old behaviour. -no-pie is the linker option, -fno-pie is the code-gen option. With only -fno-pie, gcc will make code like mov eax, offset .LC0 that doesn't link with the still-enabled -pie.
(clang can have PIE enabled by default, too: use clang -fno-pie -nopie. A July 2017 patch made -no-pie an alias for -nopie, for compat with gcc, but clang4.0.1 doesn't have it.)
Performance cost of PIE for 64-bit (minor) or 32-bit code (major)
With only -no-pie, (but still -fpie) compiler-generated code (from C or C++ sources) will be slightly slower and larger than necessary, but will still be linked into a position-dependent executable which won't benefit from ASLR. "Too much PIE is bad for performance" reports an average slowdown of 3% for x86-64 on SPEC CPU2006 (I don't have a copy of the paper so IDK what hardware that was on :/). But in 32-bit code, the average slowdown is 10%, worst-case 25% (on SPEC CPU2006).
The penalty for PIE executables is mostly for stuff like indexing static arrays, as Agner describes in the question, where using a static address as a 32-bit immediate or as part of a [disp32 + index*4] addressing mode saves instructions and registers vs. a RIP-relative LEA to get an address into a register. Also 5-byte mov r32, imm32 instead of 7-byte lea r64, [rel symbol] for getting a static address into a register is nice for passing the address of a string literal or other static data to a function.
-fPIE still assumes no symbol-interposition for global variables / functions, unlike -fPIC for shared libraries which have to go through the GOT to access globals (which is yet another reason to use static for any variables that can be limited to file scope instead of global). See The sorry state of dynamic libraries on Linux.
Thus -fPIE is much less bad than -fPIC for 64-bit code, but still bad for 32-bit because RIP-relative addressing isn't available. See some examples on the Godbolt compiler explorer. On average, -fPIE has a very small performance / code-size downside in 64-bit code. The worst case for a specific loop might only be a few %. But 32-bit PIE can be much worse.
None of these -f code-gen options make any difference when just linking,
or when assembling .S hand-written asm. gcc -fno-pie -no-pie -O3 main.c nasm_output.o is a case where you want both options.
Checking your GCC config
If your GCC was configured this way, gcc -v |& grep -o -e '[^ ]*pie' prints --enable-default-pie. Support for this config option was added to gcc in early 2015. Ubuntu enabled it in 16.10, and Debian around the same time in gcc 6.2.0-7 (leading to kernel build errors: https://lkml.org/lkml/2016/10/21/904).
Related: Build compressed x86 kernels as PIE was also affected by the changed default.
Why doesn't Linux randomize the address of the executable code segment? is an older question about why it wasn't the default earlier, or was only enabled for a few packages on older Ubuntu before it was enabled across the board.
Note that ld itself didn't change its default. It still works normally (at least on Arch Linux with binutils 2.28). The change is that gcc defaults to passing -pie as a linker option, unless you explicitly use -static or -no-pie.
In a NASM source file, I used a32 mov eax, [abs buf] to get an absolute address. (I was testing if the 6-byte way to encode small absolute addresses (address-size + mov eax,moffs: 67 a1 40 f1 60 00) has an LCP stall on Intel CPUs. It does.)
nasm -felf64 -Worphan-labels -g -Fdwarf testloop.asm &&
ld -o testloop testloop.o # works: static executable
gcc -v -nostdlib testloop.o # doesn't work
...
..../collect2 ... -pie ...
/usr/bin/ld: testloop.o: relocation R_X86_64_32 against `.bss' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status
gcc -v -no-pie -nostdlib testloop.o # works
gcc -v -static -nostdlib testloop.o # also works: -static implies -no-pie
GCC can also make a "static PIE" with -static-pie; ASLRed by no dynamic libraries or ELF interpreter. Not the same thing as -static -pie - those conflict with each other (you get a static non-PIE) although it might possibly get changed.
related: building static / dynamic executables with/without libc, defining _start or main.
Checking if an existing executable is PIE or not
This has also been asked at: How to test whether a Linux binary was compiled as position independent code?
file and readelf say that PIEs are "shared objects", not ELF executables. ELF-type EXEC can't be PIE.
$ gcc -fno-pie -no-pie -O3 hello.c
$ file a.out
a.out: ELF 64-bit LSB executable, ...
$ gcc -O3 hello.c
$ file a.out
a.out: ELF 64-bit LSB shared object, ...
## Or with a more recent version of file:
a.out: ELF 64-bit LSB pie executable, ...
gcc -static-pie is a special thing that GCC doesn't do by default, even with -nostdlib. It shows up as LSB pie executable, dynamically linked with current versions of file. (See What's the difference between "statically linked" and "not a dynamic executable" from Linux ldd?). It has ELF-type DYN, but readelf shows no .interp, and ldd will tell you it's statically linked. GDB starti and /proc/maps confirms that execution starts at the top of its _start, not in an ELF interpreter.
Semi-related (but not really): another recent gcc feature is gcc -fno-plt. Finally calls into shared libraries can be just call [rip + symbol#GOTPCREL] (AT&T call *puts#GOTPCREL(%rip)), with no PLT trampoline.
The NASM version of this is call [rel puts wrt ..got]
as an alternative to call puts wrt ..plt. See Can't call C standard library function on 64-bit Linux from assembly (yasm) code. This works in a PIE or non-PIE, and avoids having the linker build a PLT stub for you.
Some distros have started enabling it. It also avoids needing writeable + executable memory pages so it's good for security against code-injection. (I think modern PLT implementation's don't need that either, just updating a GOT pointer not rewriting a jmp rel32 instruction, so there might not be a security difference.)
It's a significant speedup for programs that make a lot of shared-library calls, e.g. x86-64 clang -O2 -g compiling tramp3d goes from 41.6s to 36.8s on whatever hardware the patch author tested on. (clang is maybe a worst-case scenario for shared library calls, making lots of calls to small LLVM library functions.)
It does require early binding instead of lazy dynamic linking, so it's slower for big programs that exit right away. (e.g. clang --version or compiling hello.c). This slowdown could be reduced with prelink, apparently.
This doesn't remove the GOT overhead for external variables in shared library PIC code, though. (See the godbolt link above).
Footnotes 1
64-bit absolute addresses actually are allowed in Linux ELF shared objects, with text relocations to allow loading at different addresses (ASLR and shared libraries). This allows you to have jump tables in section .rodata, or static const int *foo = &bar; without a runtime initializer.
So mov rdi, qword msg works (NASM/YASM syntax for 10-byte mov r64, imm64, aka AT&T syntax movabs, the only instruction which can use a 64-bit immediate). But that's larger and usually slower than lea rdi, [rel msg], which is what you should use if you decide not to disable -pie. A 64-bit immediate is slower to fetch from the uop cache on Sandybridge-family CPUs, according to Agner Fog's microarch pdf. (Yes, the same person who asked this question. :)
You can use NASM's default rel instead of specifying it in every [rel symbol] addressing mode. See also Mach-O 64-bit format does not support 32-bit absolute addresses. NASM Accessing Array for some more description of avoiding 32-bit absolute addressing. OS X can't use 32-bit addresses at all, so RIP-relative addressing is the best way there, too.
In position-dependent code (-no-pie), you should use mov edi, msg when you want an address in a register; 5-byte mov r32, imm32 is even smaller than RIP-relative LEA, and more execution ports can run it.

I receive different results on UNIX and WIN when use static members with static linking of shared library to executable. Please explain why?

Please consider following peace of code:
// 1. Single header file. Imagine that it is some static library.
// Counter.h
#pragma once
struct Counter
{
Counter()
{
++getCount();
}
static int& getCount()
{
static int counter = 0;
return counter;
}
};
// 2. Shared library (!) :
// main_DLL.cpp
#include <iostream>
#include "counter.h"
extern "C"
{
__declspec(dllexport) // for WIN
void main_DLL()
{
Counter c;
std::cout << "main_DLL : ptr = " << &Counter::getCount()<< " value = " << Counter::getCount() << std::endl;
}
}
// 3. Executable. Shared library statically (!) linked to the executable file.
// main.cpp
#include "counter.h"
#include <iostream>
extern "C"
{
__declspec(dllimport) // for WIN
void main_DLL();
}
int main()
{
main_DLL();
Counter c;
std::cout << "main_EXE : ptr = " << &Counter::getCount() << " value = " << Counter::getCount() << std::endl;
}
Results:
Results for WIN (Win8.1 gcc 5.1.0):
main_DLL : ptr = 0x68783030 value = 1
main_EXE : ptr = 0x403080 value = 1
// conclusion: two different counters
Results for UNIX (Red Hat <I don’t remember version exactly> gcc 4.8.3):
main_DLL : ptr = 0x75693214 value = 1
main_EXE : ptr = 0x75693214 value = 2
// conclusion: the same counter addressed
Building:
Building for WIN:
g++ -c -Wall -Werror -o main_DLL.o main_DLL.cpp
g++ -shared -Wl,--out-implib=libsharedLib.a -o libsharedLib.so main_DLL.o
g++ -Wall –Werror -o simpleExample main.cpp -L./ -lsharedLib
Building for UNIX:
g++ -c -Wall -Werror -fPIC -o main_DLL.o main_DLL.cpp
g++ -shared -fPIC -o libsharedLib.so main_DLL.o
g++ -Wall –Werror -fPIC -o simpleExample main.cpp -L./ -lsharedLib
So, you see that I added –fPIC on UNIX and there is no need to create import library for UNIX, because all exports symbols are included inside shared library. On Windows I use __declspec for it.
For me, results on Windows are pretty much expected. Because shared library and executable are building separately and they should know about static variable in Counter::getCount. They should simply allocate memory for it, that’s why they have different static counters.
I did quite some analysis using tools like nm, objdump. Although I’m not a big expert in them, so I haven’t found anything suspicious. I can provide their output if needed.
Using ldd tool I can see that library linked statically in both cases.
Why I can’t see the same results on Unix for me it’s strange. Could the root cause lie in building options (–fPIC, for example), or I’m missing something?

In windows, A DLL is not exporting global and static symbols unless you add the dllexport statement, therefore, the linker doesn't even know they exists, so it allocate new instance for the static member.
In linux/unix a shared lib is exporting all the global and static symbols, so when the linker find the existence of the static member in the shared lib, it just use its address.
That is the reason for the different result.

EDIT: This is a complete rewrite of the answer. With much more details.
I think that this question deserves more elaborated answer. Especially that there are things that were not mentioned so far.
Dependency Walker
Let me start with referring to the “Dependency Walker” program.
It is a nice program (although these days a bit old-schoolish in its look & feel) that allows analyzing Windows binaries (both EXE and DLL) for symbols that they export/import and their own dependencies to other DLLs. Also it allows showing undecorated symbol names but this seems to be working only with MSVC build binaries. (And some more but that is not important here.)
Thanks to this program crucial information (for this question) can be uncovered. So I encourage you to use it during experiments.
Exporting policy on Linux vs. Windows
SHR already pointed this out but I will mention it also for completeness of the answer. And some extra details.
On Linux every symbol gets exported from a shared library by default. On the other hand on Windows you have to explicitly state which symbols to export from a shared library.
GCC seems however to provide some means of controlling exports in "Windows style". See for example Visibility entry on GCC Wiki.
Also note that there are various ways of exporting on both Linux and Windows. For example both seem to support exporting selectively by providing linker with a list of names for symbols to export. But it also seems that nowadays (on Windows at least) this isn't really used much. __declspec approach seems to be preferred.
What can be exported?
After that general introduction let's now stick to Windows case. Nowadays you export/import symbols from shared libraries by using the __declspec. Just as shown in the question. (Well maybe not exactly that - typically you use a #define to handle bi-directionality as shown in already mentioned Visibility entry on GCC Wiki.)
But the declaration can be applied not only to functions, methods and global variables. It can also be applied to types. For example you can have:
class __declspec(dllexport) Counter { /* ... */ };
Such exporting/importing means in general that all members get exported/imported.
Not so easy!
But it would be too easy, wouldn't it? The complication is that GCC and MSVC handle exporting types differently.
My notes here are based mostly on experiments (checks done using Dependency Walker) so I can be wrong or not precise enough. But I did observe differences in behavior.
In tests I used MSVC 2013 from the Express Edition with update 5. For GCC I used MinGW distro from nuwen.net, version 13.0.
MSVC, when exporting entire type, exports each and every member. Including implicitly defined members (like compiler generated copy constructor). And including inlined functions. Furthermore if inlined function has some static local variables they get exported to (!).
GCC on the other hand seems to be far more restrictive. It doesn't export implicitly defined members. Nor it doesn't export inlined members.
Exporting/Importing inline functions
If instead of exporting entire type you would explicitly export an inlined function then and only then will GCC really export it. But still it will not export static local variables in that function.
Further more if you try to import an inlined function GCC will error. With GCC you cannot define symbols that you are importing. And this happens when you import inlined (and so defined) symbol. So in fact it doesn't make any sense to export inlined functions with GCC.
MSVC allows to import inlined functions. In all cases I checked it didn't seem to actually inline the function but instead called the imported version.
Yet note that because MSVC in case of inlined function exports also its static local variables it would be possible for it to really inline the function (rather than import it) while maintaining a single copy of static local variables. For ordinary programs such behavior is mandated by the Standard (N3337, C++11), in point 7.1.2 ([dcl.fct.spec]) at $4 we can read:
(…) A static local variable in an extern inline function always refers to the same object. (…)
But a program and a shared library are actually more like two programs so they are out of scope for the Standard. Yet MSVC even in that case acts (or better to say: could act) as one would expect from a single program.
Solution
Denis Bakhvalov in a comment provided solution for his own question. The solution is to move getCount function from header to source file and export/import it.
This seems to be the only solution portable between GCC and MSVC. Or to be more precise MSVC allows more solutions to this problem but none of them will work when program is build under GCC.
The variable trick
The above is not entirely true. There is another workaround that will work consistently between GCC and MSVC.
This is to stop using static local variable. Instead make it a global variable (most likely by making it static variable in the class) and export it. This will make the trick as well.
Sadly there is no way (or I don't know any) to directly force exporting/importing static local variables. You have to change them to global variables to do that.
MSVC solutions
With MSVC you have more options.
As mentioned before exporting/importing the inlined function itself (whether directly or through type) will do the job.
Summary
As described above even consistency between GCC and MSVC on Windows only requires care. You have to limit yourself to stay in common subset of allowed solutions.
Keeping the program (source) interoperable between Linux and Windows even if with same compiler (GCC) also requires care.
Luckily there is a common subset for all three environments: GCC on Linux, GCC on Windows and MSVC on Windows. That common subset is described already by mentioned Denis' comment.
So do not inline functions that you intend to export/import. Keep them in sources. And on Windows builds (regardless of compiler) export them explicitly (otherwise you will get linker error anyway since the functions in sources of a shared library will not be available when building program).
Note that this is actually a reasonable approach on its own. Inlining function from shared library doesn't seem wise. It freezes not only the interface but also implementation (of that function). You can no longer change this function freely (and deliver new version of your shared library) since all clients would have to be rebuild since they could have inlined that function. So it is a wise approach by itself not to inline from shared library. And as a bonus it assures that your sources are multi-platform friendly.
Also do have a look into the mentioned Visibility entry on GCC Wiki. It might be reasonable to use that approach (of explicit exports) on Linux as well since it seems cleaner (from design point of view) and more efficient at runtime. While it fits well what you have to do for Windows anyway.

how can I prevent arm-none-eabi compiler generating main symbol

I'm using arm-none-eabi to compile source file.
after compiling and generating elf file. I got the following symbols using nm command
00021da8 T ISR_Init
U main
U malloc
010008b0 D MASTER_AHB_MAP
I'm using gdb to debug, but I have problem with main symbol which is not defined.
gdb generate following error :
Function "main" not defined.
when I change my entry point to main, it works fine.
I'm developing bare metal program, so I didn't define main anywhere in my program.
I linked my program with those following libraries
(GNU_ARM_TOOL)/lib/gcc/arm-none-eabi/4.8.4/armv7-ar/thumb/fpu
(GNU_ARM_TOOL)/arm-none-eabi/lib/armv7-ar/thumb/fpu
for my understanding, the main symbol is generated from one of the above libraries. my question is how can I or how can I avoid the compiler generating the undefined symbol main, or at least delete the undefined main symbol in the final elf file to avoid gdb error.

To avoid gcc generating references to main, link your program with the -nostdlib gcc option:
-nostdlib: Do not use the standard system startup files or libraries when linking. No startup files and only the libraries you specify are passed to the linker, and options specifying linkage of the system libraries, such as -static-libgcc or -shared-libgcc, are ignored.
The compiler may generate calls to memcmp, memset, memcpy and memmove. These entries are usually resolved by entries in libc. These entry points should be supplied through some other mechanism when this option is specified.
One of the standard libraries bypassed by -nostdlib and -nodefaultlibs is libgcc.a, a library of internal subroutines which GCC uses to overcome shortcomings of particular machines, or special needs for some languages. (See Interfacing to GCC Output, for more discussion of libgcc.a.) In most cases, you need libgcc.a even when you want to avoid other standard libraries. In other words, when you specify -nostdlib or -nodefaultlibs you should usually specify -lgcc as well. This ensures that you have no unresolved references to internal GCC library subroutines.
To avoid gcc generating calls to memcmp, memset, memcpy etc compile with gcc's -ffreestanding option. Or use "function attributes" syntax, e.g.:
/* defined in the linker script gcc.ld */
extern int __etext, __data_start__, __data_end__, __bss_start__, __bss_end__;
/* make gcc not translate the data copy loop into a memcpy() call
*
* See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888
* Note that just passing optimize("freestanding", "no-builtin")
* as a function attribute here doesn't work on
* gcc-arm-embedded 2014 (gcc 4.9.3) */
__attribute__((optimize("freestanding", "no-builtin",
"no-tree-loop-distribute-patterns")))
void Reset_Handler()
{
int *src, *dst;
for (src = &__etext, dst = &__data_start__;
dst != &__data_end__;
src++, dst++)
*dst = *src;
for (dst = &__bss_start__; dst < &__bss_end__; dst++)
*dst = 0;
main();
}

Is there a way to "statically" interpose a shared .so (or .o) library into an executable?

First of all, consider the following case.
Below is a program:
// test.cpp
extern "C" void printf(const char*, ...);
int main() {
printf("Hello");
}
Below is a library:
// ext.cpp (the external library)
#include <iostream>
extern "C" void printf(const char* p, ...);
void printf(const char* p, ...) {
std::cout << p << " World!\n";
}
Now I can compile the above program and library in two different ways.
The first way is to compile the program without linking the external library:
$ g++ test.cpp -o test
$ ldd test
linux-gate.so.1 => (0xb76e8000)
libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb7518000)
/lib/ld-linux.so.2 (0xb76e9000)
If I run the above program, it will print:
$ ./test
Hello
The second way is to compile the program with a link to the external library:
$ g++ -shared -fPIC ext.cpp -o libext.so
$ g++ test.cpp -L./ -lext -o test
$ export LD_LIBRARY_PATH=./
$ ldd test
linux-gate.so.1 => (0xb773e000)
libext.so => ./libext.so (0xb7738000)
libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb756b000)
libstdc++.so.6 => /usr/lib/i386-linux-gnu/libstdc++.so.6 (0xb7481000)
/lib/ld-linux.so.2 (0xb773f000)
libm.so.6 => /lib/i386-linux-gnu/libm.so.6 (0xb743e000)
libgcc_s.so.1 => /lib/i386-linux-gnu/libgcc_s.so.1 (0xb7421000)
$ ./test
Hello World!
As you can see, in the first case the program uses printf from libc.so, while in the second case it uses printf from libext.so.
My question is: from the executable obtained as in the first case and the object code of libext (either as .so or .o), is it possible to obtain an executable like in the second case? In other words, is it possible to replace the link to libc.so with a link to libext.so for all symbols defined in the latter?
**Note that interposition via LD_PRELOAD is not what I want. I want to obtain an exectuable which is directly linked to the libraries I need. I underline again that fact the I only have access to the first binary and to the external object I want to "statically" interpose **

It is possible. Learn about shared library interposition:
When a program that uses dynamic libraries is compiled, a list of undefined symbols is included in the binary, along with a list of libraries the program is linked with. There is no correspondence between the symbols and the libraries; the two lists just tell the loader which libraries to load and which symbols need to be resolved. At runtime, each symbol is resolved using the first library that provides it. This means that if we can get a library containing our wrapper functions to load before other libraries, the undefined symbols in the program will be resolved to our wrappers instead of the real functions.

What you ask for is traditionally NOT possible. This has already been discussed here and here.
The crux of your question being -
How to statically link a dynamic shared object?
This cannot be done. The reason being the fact that statically linking a library is effectively the same as taking the compilation results of that library, unpacking them in your current project, and using them as if they were your own objects. *.a files are just archives of a bunch of *.o files with all the info intact within them. On the other hand, dynamic libraries are already linked; the symbol re-location info already being discarded and hence cannot be statically linked into an executable.
However you DO have other alternatives to work around this technical limitation.
So what are your options?
1. Use LD_PRELOAD on target system
Shared library interposition is well described in Maxim's answer.
2. Prepare a pre-linked stand-alone executable
elf-statifier is tool for creating portable, self-contained Linux executables.
It attempts to package together a dynamically-linked executable and all the dynamically-linked libraries of into a single stand-alone executable file. This file can be copied and run on another machine independently.
So now on your development machine, you can set LD_PRELOAD and run the original executable and verify that it works properly. At this point elf-statifier creates a snapshot of the process memory image. This snapshot is saved as an ELF executable, with all the required shared-libraries(incluing your custom libext.so) inside. Hence there is no need to make any modifications (for eg. to LD_PRELOAD) on the target system running the newly generated standalone executable.
However, this approach is not guaranteed to work in all scenarios. This is due to the fact that recent Linux kernels introduced VDSO and ASLR.
A commercial alternative to this is ermine. It can work around VDSO and ASLR limitations.

You are going to have to modify the binary. Take a look at patchelf http://nixos.org/patchelf.html
It will let you set or modify either the RPATH or even the "interpreter" i.e. ld-linux-x86-64.so to something else.
From the description of the utility:
Dynamically linked ELF executables always specify a dynamic linker or
interpreter, which is a program that actually loads the executable
along with all its dynamically linked libraries. (The kernel just
loads the interpreter, not the executable.) For example, on a
Linux/x86 system the ELF interpreter is typically the file
/lib/ld-linux.so.2.
So what you could do is run patchelf on the binary in question (i.e. test) with your own interpreter that then loads your library... This may be difficult, but the source code to ld-linux-so is available...
Option 2 would be to modify the list of libraries yourself. At least patchelf gives you a starting point in that the code iterates over the list of libraries (see DT_NEEDED in the code).
The elf specification documentation does indicate that the order is indeed important:
DT_NEEDED: This element holds the string table offset of a null-terminated
string, giving the name of a needed library. The offset is an index
into the table recorded in the DT_STRTAB entry. See ‘‘Shared Object
Dependencies’’ for more information about these names. The dynamic
array may contain multiple entries with this type. These entries’
relative order is significant, though their relation to entries of
other types is not.
The nature of your question indicates you are familiar with programming :-) Might be a good time to contribute an addition to patchelf... Modifying library dependencies in a binary.
Or maybe your intention is to do exactly what patchelf was created to do... Anyway, hope this helps!

Statifier probably does what you want. It takes an executable and all shared libraries and outputs a static executable.

It's possible. You just need to edit ELF header and add your library in Dynamic section.
You can check contents of "Dynamic section" using readelf -d <executable>. Also readelf -S <executable> will tell you offset of .dynsym and .dynstr. In .dynsym you can find array of Elf32_Dyn or Elf64_Dyn structures where your d_tag should be DT_NEEDED and d_un.d_ptr should point to a string "libext.so" located in .dynstr section.
ELF headers are described in /usr/include/elf.h.

It might be possible to do what you're asking by dynamically loading the library using dlopen(), accessing the symbol for the function as a function pointer using dlsym(), and then invoking it via the function pointer. There's a good example of what to do on this website.
I tailored that example to your example above:
// test.cpp
#include <stdio.h>
typedef void (*printf_t)(const char *p, ...);
int main() {
// Call the standard library printf
printf_t my_printf = &printf;
my_printf("Hello"); // should print "Hello"
// Now dynamically load the "overloaded" printf and call it instead
void* handle = dlopen("./libext.so", RTLD_LAZY);
if (!handle) {
std::cerr << "Cannot open library: " << dlerror() << std::endl;
return 1;
}
// reset errors
dlerror();
my_printf = (printf_t) dlsym(handle, "printf");
const char *dlsym_error = dlerror();
if (dlsym_error) {
std::cerr << "Cannot load symbol 'printf': " << dlsym_error << std::endl;
dlclose(handle);
return 1;
}
my_printf("Hello"); // should print "Hello, world"
// close the library
dlclose(handle);
}
The man page for dlopen and dlsym should provide some more insight. You'll need to try this out, as it is unclear how dlsym will handle the conflicting symbol (in your example, printf) - if it replaces the existing symbol, you may need to "undo" your action later. It really depends on the context of your program, and what you're trying to do overall.

It is possible to change the binary.
For example with a tool like ghex you can change the hexadecimal code of the binary, you search in the code for each instance of libc.so and you replace it by libext.so

Not statically, but you can redirect dynamically loaded symbols in a shared library to your own functions using the elf-hook utility created by Anthony Shoumikhin.
The typical usage is to redirect certain function calls from within a 3rd-party shared library which you can't edit.
Let's say your 3rd party library is located at /tmp/libtest.so, and you want to redirect printf calls made from within the library, but leave calls to printf from other locations unaffected.
Exemplar app:
lib.h
#pragma once
void test();
lib.cpp
#include "lib.h"
#include <cstdio>
void test()
{
printf("hello from libtest");
}
In this example, the above 2 files are compiled into a shared library libtest.so and stored in /tmp
main.cpp
#include <iostream>
#include <dlfcn.h>
#include <elf_hook.h>
#include "lib.h"
int hooked_printf(const char* p, ...)
{
std::cout << p << " [[ captured! ]]\n";
return 0;
}
int main()
{
// load the 3rd party shared library
const char* fn = "/tmp/libtest.so";
void* h = dlopen(fn, RTLD_LAZY);
// redirect printf calls made from within libtest.so
elf_hook(fn, LIBRARY_ADDRESS_BY_HANDLE(h), "printf", (void*)hooked_printf);
printf("hello from my app\n"); // printf in my app is unaffected
test(); // test is the entry point to the 3rd party library
dlclose(h);
return 0;
}
Output
hello from my app
hello from libtest [[ captured! ]]
So as you can see it is possible to interpose your own functions without setting LD_PRELOAD, with the added benefit that you have finer-grained control of which functions are intercepted.
However, the functions are not statically interposed, but rather dynamically redirected
GitHub source for the elf-hook library is here, and a full codeproject article written by Anthony Shoumikhin is here

in gcc how to force symbol resolution at runtime

My first post on this site with huge hope::
I am trying to understand static linking,dynamic linking,shared libraries,static libraries etc, with gcc. Everytime I try to delve into this topic, I have something which I don't quite understand.
Some hands-on work:
bash$ cat main.c
#include "printhello.h"
#include "printbye.h"
void main()
{
PrintHello();
PrintBye();
}
bash$ cat printhello.h
void PrintHello();
bash$ cat printbye.h
void PrintBye();
bash$ cat printbye.c
#include <stdio.h>
void PrintBye()
{
printf("Bye bye\n");
}
bash$ cat printhello.c
#include <stdio.h>
void PrintHello()
{
printf("Hello World\n");
}
gcc -Wall -fPIC -c *.c -I.
gcc -shared -Wl,-soname,libcgreet.so.1 -o libcgreet.so.1.0 *.o
ln -sf libcgreet.so.1.0 libcgreet.so
ln -sf libcgreet.so.1.0 libcgreet.so.1
So I have created a shared library.
Now I want to link this shared library with my main program to create an executable.
gcc -Wall -L. main.c -lcgreet -o greet
It very well works and if I set the LD_LIBRARY_PATH before running greet( or link it with rpath option) I can make it work.
My question is however different:
Since I am anyway using shared library, is it not possible to force symbol resolution at runtime (not sure about the terminology but perhaps called dynamic linking as per the book "Linkers and Loaders"). I understand that we may not want to do it because this makes the program run slow and has overhead everytime we want to run the program, but I am trying to understand this to clear my concepts.
Does gcc linker provide any option to delay symbol resolution at runtime? (to do it with the library we are actually going to run the program with)(as library available at compile time may be different than the one available at runtime if any changes in the library)
I want to be able to do sth like:
bash$ gcc main.c -I.
(what option needed here?)
so that I don't have to give the library name, and just tell it that I want to do symbol resolution at runtime, so headers are good enough for now, actual library names are not needed.
Thanks,
Learner For Ever.

Any linker (gcc, ld or any other) only resolves links at compile-time. That is because the ELF standard (as most others) do not define 'run-time' linkage as you describe. They either link statically (i.e. lib.a) or at start-up time (lib.so, which must be present when the ELF is loaded). However, if you use a dynamic link, the linker will only put in the ELF the name of the file and the symbols it must find, it does not link the file directly. So, if you want to upgrade the lib to a newer version later, you can do so, as long as system can find the same filename (the path can actually be different) and the same symbol names.
The other option, to get symbols at run-time, is to use dlopen, which has nothing to do with gcc or ld. dlopen simply put, opens a dynamic link library, just like fopen might, and returns you a handle, which then you pass to dlsym with the name of the symbol you want, which might be a function name for example. dlsym will then pass you a pointer to that symbol, which you can then use to call the function or use as a variable. This is how plugins are implemented.

I think you are looking for ld option '--unresolved-symbols=ignore-all', yes it can actually do it (ignore prev answer). Imagine scenario where a shared library loaded late (when program is already running), it can use all symbols that are already resolved/loaded by the main process, no need to bother to do it again . btw it does not nervelessly makes it slow , at least on Linux

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js