Does removing relocation data using `-s` affect a position dependent executable - c++

I need to know if using -s in GCC (g++) will have any effects on the PIE. I also want to know its effects on a position-dependent executable. As far as I know, not using any linking option (like -pie and -fpie) results in a non-PIE just like when using -no-pie. Now I have an executable and that's probably non-PIE since I have not specified -pie in the link command. Can -s cause any problems? Will it improve the performance (since the exe will be smaller)?
I also checked this question and in the answer it says:
It seems pretty clear that removing relocation information would interfere with ASLR.
But ASLR only deals with position-independent executables, right? Could removing relocation data from a position-dependent executable interfere with ASLR?

After doing a bit of research, I found some info that might be correct.
From GCC Options for Linking:
-pie
Produce a dynamically linked position independent executable on targets that support it.
-no-pie
Don’t produce a dynamically linked position independent executable.
Looking at this, my guess is that both options produce a position-independent executable and the only difference is that the former is dynamically linked but the latter is not (statically linked?). Therefore, in both cases, the executable would contain relocation data. However, it is still unclear to me whether the generated executable (using -s) interferes with ASLR or not.
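For what it's worth, one way to check what kind of executable you actually have is to look at the e_type field of the ELF header: position-dependent executables are normally ET_EXEC (2), while PIEs and shared objects are ET_DYN (3). Below is a minimal sketch, assuming a little-endian ELF file whose first bytes have already been read into memory. ElfKind and classify_elf are names made up for this example, not part of any library.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical result type for this sketch.
enum class ElfKind { NotElf, UnsupportedByteOrder, Exec, Dyn, Other };

// Classify an ELF image by the e_type field of its header.
// e_type is the 16-bit field at byte offset 16, right after e_ident.
// Assumption: only little-endian files (EI_DATA == 1) are handled.
ElfKind classify_elf(const unsigned char* data, std::size_t size) {
    if (size < 18 || data[0] != 0x7f || data[1] != 'E' ||
        data[2] != 'L' || data[3] != 'F') {
        return ElfKind::NotElf;
    }
    if (data[5] != 1) {
        return ElfKind::UnsupportedByteOrder;
    }
    const std::uint16_t e_type =
        static_cast<std::uint16_t>(data[16] | (data[17] << 8));
    if (e_type == 2) return ElfKind::Exec;  // ET_EXEC: position dependent
    if (e_type == 3) return ElfKind::Dyn;   // ET_DYN: PIE or shared object
    return ElfKind::Other;
}
```

The same field is what readelf -h reports on its "Type:" line, so the sketch is mostly useful when you want the check inside your own tooling.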

Related

Is the address of a C function symbol constant between compiles

I have been experimenting with symbol visibility in my shared library and noticed that the address / value of an exported function symbol does not seem to change. Are these addresses constant between compiles, or is this a coincidence?
The addresses were obtained on a virtual machine running Arch Linux using the command readelf with the options -W and --dyn-syms.
The reason I'm asking is that I am wondering if the address of a templated C++ function could be used as a UUID for an object type. This is of interest in my serialization routine, where I would like to set up an ID system which is constant between compiles (object types are registered statically at initialization time, so the order is not defined).
If the build process is unchanged (i.e. the compiler, linker, Makefiles and code remain the same), the static address in the ELF file will not change either. But if any component changes, all bets are off.
More importantly, the dynamic address (assigned by the dynamic loader) will be different on each run due to address-space randomization in modern Linux distros, so you should not rely on it.
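To see the difference between the static and the dynamic address for yourself, a tiny probe like the following can help. It is only a sketch: probe and runtime_address_of_probe are hypothetical names, and converting a function pointer to an integer is assumed to be supported, as it is on common Linux targets.

```cpp
#include <cstdint>

// Hypothetical function whose address we want to inspect.
int probe() { return 0; }

// Address of probe() as seen by the running process. For a PIE or a shared
// object under ASLR this value can change from run to run, while the symbol
// value stored in the ELF file stays the same for an unchanged build.
std::uintptr_t runtime_address_of_probe() {
    return reinterpret_cast<std::uintptr_t>(&probe);
}
```

Printing this value from main over several runs and comparing it with the symbol value reported by readelf shows which of the two numbers you are actually looking at.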
When you build your code, you can choose to build it either position dependent or position independent. This is a separate choice from static linking (although a static position-independent binary needs explicit toolchain support, such as GCC's -static-pie option). Position-dependent binaries (given the same sources, compiler and build flags) will always get the same addresses, but as I say further down, I wouldn't rely on that in a release.
This is controlled by GCC's options -fPIE (position-independent executable), -fPIC (position-independent code) and -pie. ELF executables can be built as either position dependent or position independent, but shared objects (libraries) are always built as position independent, because they must be loadable at whatever address the OS assigns. From GCC's man page:
-fPIC
If supported for the target machine, emit position-independent code, suitable for dynamic linking and avoiding any limit on the size of the global offset table.
-fpie
-fPIE
These options are similar to -fpic and -fPIC, but generated position independent code can be only linked into executables. Usually these options are used when -pie GCC option will be used during linking.
-pie
Produce a position independent executable on targets which support it. For predictable results, you must also specify the same set of options that were used to generate code (-fpie, -fPIE, or model suboptions) when you specify this option.
When loading a PIC shared object you cannot assume it will reside in the same place for each run, as it might be affected by ASLR that is driven by the kernel.
In any case, I don't think it's good practice to use memory addresses as UUIDs for classes, as these might change, even more so if these template classes are implemented as part of a shared object.
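If the goal is an ID that stays constant between compiles, one alternative (not something from the question, just a sketch) is to derive the ID from a name you register for each type rather than from an address. The example below hashes the name with 32-bit FNV-1a at compile time. TypeName, type_id and Point are hypothetical names for illustration, and hash collisions between different names are possible and not handled here.

```cpp
#include <cstdint>

// 32-bit FNV-1a over a NUL-terminated string, written as a single-return
// recursive function so that it is a valid constexpr function in C++11.
constexpr std::uint32_t fnv1a(const char* s, std::uint32_t h = 2166136261u) {
    return *s == '\0'
               ? h
               : fnv1a(s + 1,
                       (h ^ static_cast<std::uint32_t>(
                                static_cast<unsigned char>(*s))) *
                           16777619u);
}

// Each serializable type registers a stable name by specializing TypeName.
template <typename T>
struct TypeName;

// Hypothetical example type.
struct Point {};

template <>
struct TypeName<Point> {
    static constexpr const char* value() { return "Point"; }
};

// The ID depends only on the registered name, not on where code was placed.
template <typename T>
constexpr std::uint32_t type_id() {
    return fnv1a(TypeName<T>::value());
}
```

Because the ID is computed from the string, it survives compiler upgrades, link-order changes and ASLR, as long as the registered name itself is not changed.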

Is it possible for gcc/clang to link into executable while stripping all the debug info meanwhile?

I have some huge libraries that are compiled with debug info; when linking them with some small object files I write, it still takes quite a lot of time and the generated executable contains a lot of debug information of the libraries.
So is there an option to tell gcc/clang to discard the debug information inside the library? Will it improve the link speed?
If there is no easy way, should I strip the libraries? I don't think I have the privilege, since the libraries are also used by my partners, who need the library code for debugging purposes.
As already said in the comments, there are two ways out:
Keep a local copy of said libraries, stripped of debug info.
Link with -Wl,-s or -s, which makes the linker output a stripped executable.

position independent executable (-pie) for arm(cortex-m3)

I'm programming for STM32 (Cortex-M3) with CodeSourcery G++ Lite (based on GCC 4.7.2), and I want the executables to be loaded dynamically.
I know I have two options available:
1. A relocatable ELF, which needs an ELF parser.
2. Position-independent code (PIC) with a global offset register.
I prefer PIC with a global offset register because it seems easier to implement and I'm not familiar with ELF or any ELF library. Also, it's easy to generate a .bin file from an ELF file with some tools.
I've tried building my program with "-msingle-pic-base -fpic" compiling options and "-pie" linking options, but then I got a linking error:
...path...ld.exe: ...path...thumb2\libstdc++.a(pure.o): relocation
R_ARM_THM_MOVW_ABS_NC against `a local symbol' can not be used when
making a shared object; recompile with -fPIC
I don't quite understand the error message. It seems the default standard c/c++ library can't go with my options and I need to get the source of the library and rebuild for my own purpose.
So,
1. Could anyone provide me any useful information/link on how to work with the position independent executable ?
2. with the -msingle-pic-base option, I don't need to care too much about the GOT and ld script anymore, right?
Note: Without the "-pie" linking option I can build the program. But the program fails when calling a c++ virtual function (when I'm using the IDE(keil)'s simulator to debug my program). I don't understand what's going on and what I've been missing.
----------------------------------------------------------------------
-- added 20130314
with the -msingle-pic-base option, I don't need to care too much about the GOT and ld script anymore, right?
From my experiments, the register (r9 is used in my program) should point to the beginning of the .got.plt section. If I delete the "-pie" option, the link succeeds, and (with r9 properly set) the C++ virtual function is called successfully. However, I still think the "-pie" option is important, as it may ensure that the current standard library is position independent. Could anyone explain this for me?
----------------------------------------------------------------------
-- added 20130315
I took a look at the ABI documents on ARM's website, but they were of little help because they do not target a specific platform. There seems to be a concept of EABI (I'm using Sourcery's arm-none-eabi edition), but I couldn't find any documentation on "EABI" on ARM's website. Nor can I find documentation on this topic from Sourcery or GCC. There is more than one implementation of PIC, so which one does Sourcery G++ use in the none-eabi case? I think the behaviors of the "-msingle-pic-base", "-fpie" and "-pie" options are very poorly documented!
-----------------------------------------------------------------------
From the disassembly, I just figured out that, with "-msingle-pic-base", r9 should point to the base address of the ".got" section. The pointers in the ".got" section are absolute pointers, and variables are addressed in a way similar to the description in the article "Position Independent Code (PIC) in shared libraries". So I still need to modify the ".got" section on loading. I don't know what the ".got.plt" section is used for in my program. It seems that function calls use PC-relative addressing.
How to build with the "-pie" or how to link a standard library compiled with "-fpic" is still a problem for me.
The error message tells you to recompile the libstdc++ library, which is usually built together with the gcc compiler itself.
Thus you must recompile your standard libraries (libstdc++, libgcc_*, libc, libm and all the rest) with -fPIC and link your project against them.
If you rely on prebuilt compiler packages, you're mostly out of luck in the microcontroller world. If you build your compiler yourself (which is, by the way, not too difficult, but still an advanced/expert task), you are good to go.
It is also possible to compile your standard libraries yourself with the compiler you have. You will need the sources of the libraries, and you will have to figure out how the compiler package's build system builds them and mimic that. Perhaps there are some experts here who can advise you on this approach.
There's a nice blog post on this topic, eight years after asking the question initially, but it's there: https://mcuoneclipse.com/2021/06/05/position-independent-code-with-gcc-for-arm-cortex-m/
The general outline is that you have to:
Set up GOT from linker-generated information
Set up PLT from Program Header information
Implement a binder based on the GOT entries
Compile your library as a shared relocatable binary: -msingle-pic-base -mpic-register=r9 -mno-pic-data-is-text-relative -fPIC
Set R9 accordingly
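The "set up GOT" step in the outline above boils down to rewriting absolute link-time addresses once the real load address is known. Here is a minimal sketch of that idea, under a strong simplifying assumption: every non-zero GOT entry points into a single image that has been moved as a whole from link_base to load_base. relocate_got is a hypothetical helper, not code from the blog post, and leaving zero entries alone is likewise an assumption of this sketch. A real loader also has to tell flash and RAM sections apart.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Rebase GOT entries from their link-time addresses to run-time addresses.
// Assumption: one contiguous image, moved from link_base to load_base.
std::vector<std::uintptr_t> relocate_got(const std::vector<std::uintptr_t>& got,
                                         std::uintptr_t link_base,
                                         std::uintptr_t load_base) {
    std::vector<std::uintptr_t> out;
    out.reserve(got.size());
    for (std::uintptr_t entry : got) {
        // Zero entries are left untouched in this sketch.
        out.push_back(entry == 0 ? 0 : entry - link_base + load_base);
    }
    return out;
}
```

On the target, the relocated table would be written to RAM and R9 set to its start before any position-independent code runs.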

Static linking of Glibc

How can I compile my app so that glibc is linked statically, but with only the code my app needs (not the whole library)?
Now my compile command:
g++ -o newserver test.cpp ... -lboost_system -lboost_thread -std=c++0x
Thanks!
That's what -static does (as described in another answer): unneeded modules won't get linked into your program. But your expectations on the amount of stuff which is needed (in a sense that we can't convince linker to the contrary) may be too optimistic.
If you are trying to do it for portability (running an executable on other machines with an older glibc or something like that), there is one easy test question to see if you're going to get what you want:
Did you think of the problem with libnss, and are you sure it is not going to bite you?
If your answer is yes, maybe it makes sense to go on. If the answer is no, or the question seems too obscure and there is no answer, just quit your experiments with statically linked glibc: it has more chance to hurt than help.
Add -static to the compile line. It will only add what your application needs [and of course, any functions the functions your application calls, and any functions those functions call, including a bunch of startup code and some other bits and pieces], so it will be around 800K (for a simple "hello world" program) on an x86 machine. Other architectures vary. Since boost probably also calls the standard library at least a little bit, it's likely that you will have more than 800K added to your application. But it only adds functions used by some code in the final binary, not the entire library [about 2MB as a shared library].
If you ONLY want to link glibc statically, you will need to change the linking part of your compile line to:
-Wl,-Bstatic -lc -Wl,-Bdynamic. This will prevent any other library from being linked statically [you sometimes need to have more than one of these statements, as sometimes something pulled in by another library requires "more" from glibc to be pulled in - don't worry, it won't bring in anything more than the linker thinks is necessary].

What does -fPIC mean when building a shared library?

I know the '-fPIC' option has something to do with resolving addresses and independence between individual modules, but I'm not sure what it really means. Can you explain?
PIC stands for Position Independent Code.
To quote man gcc:
If supported for the target machine, emit position-independent code, suitable for dynamic linking and avoiding any limit on the size of the global offset table. This option makes a difference on AArch64, m68k, PowerPC and SPARC.
Use this when building shared objects (*.so) on those mentioned architectures.
The f is the gcc prefix for options that "control the interface conventions used
in code generation"
PIC stands for "Position Independent Code"; -fPIC is a specialization of -fpic for m68k and SPARC.
Edit: After reading page 11 of the document referenced by 0x6adb015, and the comment by coryan, I made a few changes:
This option only makes sense for shared libraries: you're telling the compiler to generate code that uses a Global Offset Table (GOT). This means your address references go through the GOT, and the code can be shared across multiple processes.
Otherwise, without this option, the loader would have to modify all the offsets itself.
Needless to say, we almost always use -fpic/PIC.
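To make the GOT idea a bit more concrete, here is a hand-written model of the indirection. It is only a conceptual sketch: the compiler and the dynamic loader do this for you, and model_got, model_loader_resolve and read_counter_via_got are made-up names, not anything GCC emits.

```cpp
// A global that position-independent code wants to reach.
int shared_counter = 41;

// Stand-in for a GOT with a single slot. The code only knows the slot,
// not the final address of shared_counter.
void* model_got[1] = {nullptr};

// Stand-in for the dynamic loader: fills the slot once the real address
// of the variable is known.
void model_loader_resolve() { model_got[0] = &shared_counter; }

// Stand-in for compiled PIC code: loads the address from the table first,
// then reads the variable through it.
int read_counter_via_got() { return *static_cast<int*>(model_got[0]); }
```

The point of the extra hop is that only the table has to be patched at load time, so the code itself can stay byte-for-byte identical in every process that maps it.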
man gcc says:
-fpic
Generate position-independent code (PIC) suitable for use in a shared
library, if supported for the target machine. Such code accesses all
constant addresses through a global offset table (GOT). The dynamic
loader resolves the GOT entries when the program starts (the dynamic
loader is not part of GCC; it is part of the operating system). If
the GOT size for the linked executable exceeds a machine-specific
maximum size, you get an error message from the linker indicating
that -fpic does not work; in that case, recompile with -fPIC instead.
(These maximums are 8k on the SPARC and 32k on the m68k and RS/6000.
The 386 has no such limit.)
Position-independent code requires special support, and therefore
works only on certain machines. For the 386, GCC supports PIC for
System V but not for the Sun 386i. Code generated for the
IBM RS/6000 is always position-independent.
-fPIC
If supported for the target machine, emit position-independent code,
suitable for dynamic linking and avoiding any limit on the size of
the global offset table. This option makes a difference on the m68k
and the SPARC.
Position-independent code requires special support, and therefore
works only on certain machines.