Is -fPIC for shared libraries ONLY? - c++

I know -fPIC is necessary for shared libraries and know why.
However, I am not clear on the question:
Should -fPIC never be used during building an executable or a static library?

"Should -fPIC never be used during building an executable or a static library?"
Never is a strong word, and the statement above is false.
Code built with -fPIC is (slightly) less optimal, so why would you want to put it into anything other than a shared library?
Let's start with a static library, which has an easy answer.
Suppose you want to give your users a static library that can be linked into either an executable, or into a shared library of their own.
In that case you must either give them 3 separate archive libraries (one built with -fPIC for linking into shared libraries, one built with -fPIE for linking into PIE executables, and a "regular" one), or you can give them a single archive library (which must have code built with -fPIC).
Now, it could be argued that you should instead give them a shared library, but that forces your end users to distribute 2 binaries, and they may prefer to not do that.
But suppose you want to build a regular (non-PIE) executable. What could be the reason to link -fPIC code into such an executable?
Well, suppose you are in the development stage, and don't care that much about optimizing the code yet. Suppose further that you want to test your code as both a shared library, and as part of both a PIE and a non-PIE executable.
Under the above conditions, you could either compile your code 3 times (with and without -fPIC, and with -fPIE), or you could compile it once (with -fPIC) and link it into all 3: the shared library, the PIE executable, and the non-PIE executable. Doing this saves a lot of compilation time, and some build system complexity.
TL;DR: putting -fPIC objects into executables and static libraries has its place, and you should understand the reason why you are doing it (if you end up doing it).
Update:
"Code in an object file is always relocatable"
Correct.
"is it position-independent code?"
No: not all relocatable code is position-independent.
Position-independent code is a subset of relocatable code. Relocatable code can have relocations that apply to any section. Position-independent code must not require any run-time (dynamic) relocations against .text (and .rodata).
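One way to see the difference is to compile a small file both ways and inspect the result. The file below is a hypothetical example (pic_demo.c, counter and bump are made-up names), and the comments describe what is typically seen with GCC on x86-64 Linux; other targets may differ.

```c
/* pic_demo.c (hypothetical file name)
 *
 * Compile twice and compare the relocations, e.g.:
 *   gcc -c -fno-pic pic_demo.c -o nonpic.o
 *   gcc -c -fPIC    pic_demo.c -o pic.o
 *   readelf -r nonpic.o pic.o
 *
 * Both objects are relocatable, and both carry a relocation for the
 * reference to `counter`. What differs is the kind of reference:
 * the -fPIC object typically reaches `counter` through the GOT, so the
 * link editor can resolve it without leaving a run-time relocation in .text.
 */
int counter = 0;   /* global data whose final address is unknown at compile time */

int bump(void)
{
    return ++counter;   /* the access to `counter` is what gets relocated */
}
```

Calling bump() twice returns 1 and then 2 in either build; only the generated code and the relocation types change.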

Related

Misconception about static/implicit linking Vs dynamic/explicit linking

I've recently learnt that static linking and implicit linking are basically the same thing, just different nomenclature. My understanding is that when you statically (implicitly) link to a binary, you are by definition linking against a *.lib (Windows) or *.a (Linux) file, often using target_link_libraries in CMake. On the other hand, when you explicitly link (using LoadLibrary on Windows) you are by definition linking to a *.dll file (or *.so on Linux), and there is no corresponding CMake command because all the work is done inside the actual code.
However, in multiple places I've read people referring to statically/implicitly linking to a dll file, which has confused me. Clearly there is a hole in my knowledge somewhere and I was hoping somebody here could plug it.
Edit
It's been pointed out that this question refers mainly to Windows, which it does. However, I am currently trying to produce cross-platform code, so I am still interested in how (or if) these concepts generalise to other platforms.
There are actually 3 different kinds of linking, not 2.
For UNIX:
1. Link against an archive (aka static) library:
gcc main.o libfoo.a
2. Link against a dynamic (aka shared) library:
gcc main.o libfoo.so
3. Link against libdl, which allows you to dlopen arbitrary other shared libraries (which don't need to exist at the time of the link):
gcc main.o -ldl
Both 2 and 3 involve the dynamic linker (and use shared libraries), but to a different extent.
An equivalent exists on Windows: when you link against foo.lib, you are using either 1 or 2, depending on whether foo.lib contains actual code, or refers to foo.dll.
When you use LoadLibrary, you are in case 3.
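The third kind of linking can be sketched in C as follows. This is a minimal sketch, assuming a Linux/glibc system where the math library is installed as libm.so.6; the helper name cos_via_dlopen is hypothetical. dlopen, dlsym, dlerror and dlclose are the standard calls declared in <dlfcn.h>.

```c
#include <dlfcn.h>   /* dlopen, dlsym, dlerror, dlclose */
#include <math.h>    /* NAN macro only; libm itself is not linked in */
#include <stdio.h>

/* Hypothetical helper: load libm at run time and call its cos().
 * Returns NAN if the library or the symbol cannot be found. */
double cos_via_dlopen(double x)
{
    /* Assumption: the math library's file name is libm.so.6 (glibc on Linux). */
    void *handle = dlopen("libm.so.6", RTLD_NOW);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return NAN;
    }

    /* POSIX-recommended idiom for storing dlsym's void * in a function pointer. */
    double (*cos_fn)(double) = NULL;
    *(void **)(&cos_fn) = dlsym(handle, "cos");
    if (cos_fn == NULL) {
        fprintf(stderr, "dlsym failed: %s\n", dlerror());
        dlclose(handle);
        return NAN;
    }

    double result = cos_fn(x);
    dlclose(handle);
    return result;
}
```

The library is named only in the string passed to dlopen, never on the link line. On older glibc versions the program itself has to be linked with -ldl; from glibc 2.34 on, these functions live in libc.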

gcc -fPIC vs. -shared

When compiling a shared library with gcc / g++, why is -fPIC not implied by -shared option? Or, said differently, is the option -fPIC required at link time?
For short, should I write:
gcc -c -fPIC foo.c -o foo.o
gcc -shared -fPIC foo.o -o libfoo.so   # with -fPIC
or is the following sufficient:
gcc -c -fPIC foo.c -o foo.o
gcc -shared foo.o -o libfoo.so   # without -fPIC
Code that is built into shared libraries should normally, though not necessarily, be position-independent code, so that the shared library can readily be loaded at any address in memory. The -fPIC option ensures that GCC produces such code. Because it is not required, it is not implied by -shared, and GCC leaves the choice to you. Without this option the compiler can apply some optimizations that are only valid for position-dependent code.
Position-dependent code may cause an error if one process wants to load more than one shared library at the same virtual address. Since libraries cannot predict what other libraries could be loaded, this problem is unavoidable with the traditional shared library concept. Virtual address space doesn't help here. If your application does not use a lot of other shared libraries and they are loaded before yours, you can predict the loading address of your library and define it as the base address of your position-dependent library.
A shared library is supposed to be shared between processes, and it may not always be possible to load it at the same address in each of them. If the code were not position-independent, each process would require its own copy.

Can I bind shared libraries with "gcc -llibnamehere", in addition to static ones?

Two projects:
The loader, a standalone executable (only loads modules)
any module, a shared library (librainbowdash.so) (there can be many modules)
Now, the module is linked with -lpthreads, but I get some weird errors which make me think pthreads is bound as a shared object only, and that when the loader loads a module, pthreads is not being loaded. (Debugging with GDB is impossible with this kind of error.)
I thought the -l switch only allows static libraries? Does it? Doesn't it?
-l specifies library names. It is up to the linker to resolve the library names into static libraries or shared objects to be linked against as appropriate. And it is the loader's job to load any shared libraries used.
If you look in the ld and gcc manpages, it is possible to define 'option groups'. I might be a bit rusty, but it should go something like
gcc -o yourprog -Wl,-Bstatic yourprog.c -lstatic_lib -Wl,-Bdynamic -ldynamic_lib
The exact incantation is probably wrong.
From experience, passing the full path of the static library has proven to be much less of a headache than figuring out the exact form of the above mentioned incantation.
That being said, I doubt you would gain much benefit by statically linking pthreads.
I think you might also use
gcc -pthread ...
as well.
Using a simple -static will make the output and all its dependencies static. This is probably not what you want.
It might be that your shared library points to libpthread in the wrong location. Running ldd, e.g. ldd libfoo.so, is often a very effective way to find such linking problems.

Can I build a shared library by linking static libraries?

I have a bunch of static libraries (*.a), and I want to build a shared library (*.so) to link against those static libraries (*.a). How can I do so in gcc/g++?
You can (just extract all the .o files and link them with -shared to make a .so), but whether it works, and how well it works, depends on the platform and whether the static library was compiled as position-independent code (PIC). On some platforms (e.g. x86_64), non-PIC code is not valid in shared libraries and will not work (actually I think the linker will refuse to make the .so). On other platforms, non-PIC code will work in shared libraries, but the in-memory copy of the library is not sharable between different programs using it or even different instances of the same program, so it will result in HUGE memory bloat.
I can't see why you couldn't just build the files of your dynamic library to .o files and link with:
gcc -shared *.o -lstaticlib1 -lstaticlib2 -o mylib.so

Should I create .a or .so when packaging my code as a library?

I have a software library and I used to create .a files, so that people can install them and link against them: g++ foo.o -L/path/to -llibrary
But now I often encounter third-party libraries where only .so files are available (instead of .a), and you just link against them without the -l switch, e.g. g++ foo.o /path/to/liblibrary.so.
What are the differences between these solutions? Should I prefer creating .so files for the users of my library?
Typically, libfoo.a is a static library, and libfoo.so is a shared library. You can use the same -L/-l linker options against either a static or a shared library. Or you can give the full path to the library, whether static or shared. Libraries are often built both static and shared, to give application developers the choice of which they want.
All the code needed from a static lib is part of the final executable. This obviously makes it bigger, but it also means it's self-contained. Once it is compiled, you can run your app without the lib.
Code from a shared lib is not part of the executable. There are just some hooks in place to make the executable aware of the name of the lib it needs. In order to run your app, the shared lib has to be present in the lib search path (e.g. $LD_LIBRARY_PATH).
If you have two apps that share the same code, they can each link against a shared lib to keep the binary size down. If you want to upgrade parts of the app without rebuilding the whole thing, shared libs are good for that too.
A good overview of static, shared (dynamic), and loadable libraries is at
http://www.yolinux.com/TUTORIALS/LibraryArchives-StaticAndDynamic.html
Some features that aren't really called out from comments I've seen so far.
Static linkage (.a/.lib)
Sharing memory between these compilation units is generally OK because they should all be using the same runtime.
Static linkage means you avoid 'dll hell', but the cost is recompilation to make use of any change at all. Static linkage into shared libraries (.so) can lead to strange results if more than one such shared library is used by the final executable: global variables may exist multiple times, and which one is used and when they are initialised can cause an entirely different hell.
The library will be part of the shipped product but obfuscated and not directly usable.
Shared/Dynamic libraries (.so/.dll)
Sharing memory between these compilation units can be hazardous, as they may choose to use different runtimes. This can mean you provide different shared/dynamic libraries based on the debug/release or single/multi threaded or...
Shared libraries (.so) are less prone to 'dll hell' than dynamic libraries (.dll), as they include options for quite specific versioning.
Compiling against a .so will capture version information internal to the file (hard to fake), so that you get quite specific .so usage. Compiling against the .lib/.dll only gives a basic file name; any versioning is managed by the developer (using naming, or manually loading the library and checking version details by hand).
The library will have to ship with the final product (somebody else can pick it up and use it)
"But now I often encounter third-party libraries where only .so files are available [...] and you just link against them without the -l switch, e.g. g++ foo.o /path/to/liblibrary.so."
JFYI, if you link to a shared library which does not have a SONAME set (compare with readelf -a liblibrary.so), you will end up putting the specified path of liblibrary.so into your target object (executable or another shared library), which is usually undesired, since users have their own ideas of where to put a program and its associated files. The preferred way is to use -L/path/to -llibrary, perhaps together with -Wl,-rpath,/whatever/path/to if this is the final path (such pathing decisions are made by Linux distributions, for example).
"Should I prefer creating .so files for the users of my library?"
If you distribute source code, the user will make the particular choice.