C/C++ Static vs dynamic libraries example

C/C++ Static vs dynamic libraries example - c++

I'm learning about static and dynamic libraries. So far I understand why I would need a dynamic library. In case something is changing it's good to plug in a newer version and all the applications will update automatically without even noticing.
a) Great for plugins,
b) multiple apps using the same library and
c) maintenance when you need to correct errors.
However, why would anyone use a static library? I mean what's the advantage? Does sb have an example so I can understand it better? Is it to make a product proprietary?
EDIT: Due to the confusion in the comments. I understand what a static library is, and I also know the difference between a dynamic library. It was just beyond me why anyone would use a static library instead of just the source itself. I think I'm now starting to understand that a static library offers the following advantages:
a) better code maintenance
b) faster compiling times

There is another difference between static and dynamic libraries which may become important in some situations, I am surprised that nobody mentions that.
When static library is linked, the symbols (e.g. function names) are resolved during the linkage (compile) time, so a call to a library function is resolved to the direct call to an address in the final executable.
With dynamic library, this happens during the run-time, when the library is loaded into the process space (often during process start-up). The symbols must be mapped into the process's address space. Depending on the number of symbols, which can be surprisingly large, and number of libraries loaded at the start-up, the delay can be quite tangible.
There is this excellent in-depth guide on dynamic libraries on Linux - How To Write Shared Libraries. It is way too detailed for most of us, but even skimming through it gives you many surprising insights. For instance, it says that in release 1.0 of OpenOffice it had to do more than 1.5 million of string comparisons during the launch!
A way to get a feeling of that is to set LD_DEBUG to symbols, and LD_DEBUG_OUTPUT to some file, run a program and look at the file to see the activity that goes on on startup.

Compilers can do all sorts of additional optimizations with static libraries that they cannot do with dynamic libraries. For example, a compiler can delete unused functions from static libraries. It wouldn't know to do that in a dynamic library. But there are even more advanced optimizations. The compiler can pull code from a static library function into the main program, which will eliminate function calls. Very smart compilers can do even more. The sky is really the limit with static libraries, but dynamic libraries make much of this much harder or impossible.
Probably the more practical reason however, is that static linking is the default for most library compilers so many people just end up using it. To create a dynamic library, you normally have to create an additional file which exposes certain functions. Although the file tends to be relatively simple, if you don't take the time to do it, then your libraries all end up being static.
As mentioned in another post, managing dependencies with static libraries tends to be easier simply because you have everything under your control. You may not know what dll/so is installed on the user's system.

A static library is basically a ZIP of object files. The only advantage it has over just distributing the object files is that it's a single file that encompasses your whole library. Users can use your headers and the lib to build their applications.
Because it's just a ZIP of object files, anything the compiler does to object files also works with static libraries, for example, dead code elimination and whole program optimization (also called Link-Time Code Generation). The compiler won't include bits of the shared library in the final program that are unused, unlike dynamic libraries.
With some build systems, it makes link seams easier too. E.g. for MSVC++, I'll often have a "production" EXE project, a "testing" EXE project, and put the common stuff in a static library. That way, I don't have to rebuild all the common stuff when I do builds.

1.) Shared libraries require position independent code (-fpic or -fPIC for libraries) and position independent code requires setup. This makes for larger code; for example:
*Part of this is due to compiler inefficiencies as discussed here
long realfoo(long, long);
long foo(long x, long y){
return realfoo(x,y);
}
//static
foo:
jmp realfoo #
//shared (-fpic code)
foo:
pushl %ebx #
call __x86.get_pc_thunk.bx #
addl $_GLOBAL_OFFSET_TABLE_, %ebx # tmp87,
subl $16, %esp #,
pushl 28(%esp) # y
pushl 28(%esp) # x
call realfoo#PLT #
addl $24, %esp #,
popl %ebx #
ret
__x86.get_pc_thunk.bx:
movl (%esp), %ebx #,
ret
Using the previous example, the realfoo() could be considered for inlining in a static build if proper optimization is enabled and supported. This is because the compiler has direct access to the object in the archive file (libfoo.a). You can think of the .a file as a pseudo-directory containing the individual object files.
The "not having to recompile the whole binary for a bug in a library" cuts both ways. If your binary doesn't use the offending bug, it was probably compiled out of the static binary and replacing a shared library with code from the latest trunk with the fix may introduce (multiple) other as yet unreported bugs.
Initial startup time. Although many shared library proponents will suggest that using shared libraries will reduce startup times due to other programs already using them and (sometimes) smaller binary size. In practice, this is rarely true with the exception of some really basic X11 apps. Anecdotally, my startup time for X went down to about 1/10th of a second from over 5s by switching to a static build with musl-libc and tinyX11 from the stock glibc+X11 shared. In practice, most static binaries end up starting faster because they don't need to initialize all the (possibly unused) symbols in every dependent library. Furthermore, subsequent calls to the same binary get the same preloading benefits as the shared libraries.
Use a dynamic library when the library is bad for static builds. For example gnu-libc (aka glibc) is notoriously bad at static builds with a basic hello world weighing in at close to 1Mb while musl-libc, diet-libc and uclibc all build hello world at around 10kb. On top of that glibc will omit some key (mostly network related) functionality in static builds.
Use a shared library if the library relies on "plugins" for key functionality; gtk icon loading and a few other features for instance used to be buildable with the plugins builtin, but for the longest time that has been broken, so the only recourse is to use shared libraries. Some C libraries won't support loading arbitrary libraries (musl for example) unless they are built as a shared library, so if you need to load a library on the fly as a "plugin" or "module", you may at least need the C library to be shared. A workaround for static builds is to use function pointers instead and a better, more stable GUI toolkit for the specific case of GTK. If you are building an extendable programming language like perl or python, you're going to need shared library capabilities for your optimized plugins to be written in a compiled language.
Use a shared library if you absolutely need to use a library with a license that is incompatible with static building. (AGPL, GPL, LGPL without a static link clause) ... and of course if/when you don't have the source code.
Use static builds when you want more aggressive optimization. Many well-seasoned libraries like libX11 were written when cdecl was the primary calling convention. Consequently many X11 functions take a boatload of parameter in the same order as the struct that the function manipulates (rather than just a pointer to the struct)... which makes sense with the cdecl calling convention since theoretically you could just move the stack pointer and call the function. However this breaks down as soon as you use some number of registers for parameter passing. With static builds, some of these unintentional consequences can be mitigated via inlining and link time optimization. This is just one example. There are many many other optimizations that crop up as function boundaries are erased.
Memory security can go either way. With static binaries, you aren't susceptible to LD_PRELOAD attacks (a static binary may only have 1 symbol - the entry point), but shared (and static-pie if supported) can have address space randomization that is only available to static builds on the few architectures that support position independent executables (though most common architecture do support PIE now).
Shared libraries can (sometimes) produce smaller binaries, so if you are using shared libraries that will be on the system anyhow, you can reduce package size (and thus server loads) a bit. However if you are going to have to ship the shared libraries anyhow, the combined size of the libraries and the binary will always be larger than the static binary barring some crazy aggressive inlining. A good compromise is to use shared common system libraries (for instance libc, X11, zlib, png, glib and gtk/qt/other-default-toolkit) while using static libraries for uncommon dependencies or libraries specific to your package. The chrome browser does this to some extent. A good way to determine the shared/static threshold in your target Linux distro is to iterate over */bin and */sbin in the default install with ldd or objdump -x and parse the output through sort and uniq to determine a good cutoff. If you are distributing multiple binaries that use most of the same libraries a bunch of static builds will increase bloat, but you can consider using a multicall binary (such as in busybox, toybox, mupdf or netpbm)
Use static builds if you want to avoid "DLL hell" or for cross distro compatibility. A static binary built in one distro should work on any other distro that is reasonably up to date (mostly kernel versions to prevent trying to use unsupported syscalls).
For more info on the advantages of static linking see: stali (short for static linux)
For some shared library propaganda from the former maintainer of glibc, see Ulrich Drepper's "Static Linking Considered Harmful" Though roughly half of the problems with static linking that he mentions are glibc-specific problems (diet, musl and uclibc don't have the same problems).
If you think static libraries don't quite go far enough toward enabling full optimizations, check out Sean Barrett's list of single file libraries. These can often be configured to enable all functions to be static so that the binary can be built as a single compilation unit. This enables several optimizations that can't even be achieved with link time optimization, but you better have an IDE that supports code folding.

Static libraries good then you want tiny package without any problems with conflicting dll's.
Also time to load and initialize libraries reduced a lot with static linking.
But as disadvantage you can notice some size increase of binary.
Static librarys on WIKI

If you need a number of small programs, which use only a very small, but sometimes slightly different part of a huge library (it usually happens with large open-source libraries), it might be better to not build a large number of small dynamic libraries, as they will become hard to manage. In this case, it might be a good idea to statically link only the parts you need.

With dynamic libraries if all the libraries aren't present then the application doesn't run. So if you have a partition that holds your libraries and it becomes unavailable then so does the application. If an application has static libraries then the libraries are always present so there is nothing to prevent the application from working. This generally helps in case you are staring your system in maintenance mode.
On a Solaris system for example, commands that you may need to run in the event that some partitions may not be present are stored under /sbin. sbin is short for static binaries. If a partition is unavailable these apps will still work.

Related

Understanding static & dynamic library linking [duplicate]

Are there any compelling performance reasons to choose static linking over dynamic linking or vice versa in certain situations? I've heard or read the following, but I don't know enough on the subject to vouch for its veracity.
1) The difference in runtime performance between static linking and dynamic linking is usually negligible.
2) (1) is not true if using a profiling compiler that uses profile data to optimize program hotpaths because with static linking, the compiler can optimize both your code and the library code. With dynamic linking only your code can be optimized. If most of the time is spent running library code, this can make a big difference. Otherwise, (1) still applies.

Dynamic linking can reduce total resource consumption (if more than one process shares the same library (including the version in "the same", of course)). I believe this is the argument that drives its presence in most environments. Here "resources" include disk space, RAM, and cache space. Of course, if your dynamic linker is insufficiently flexible there is a risk of DLL hell.
Dynamic linking means that bug fixes and upgrades to libraries propagate to improve your product without requiring you to ship anything.
Plugins always call for dynamic linking.
Static linking, means that you can know the code will run in very limited environments (early in the boot process, or in rescue mode).
Static linking can make binaries easier to distribute to diverse user environments (at the cost of sending a larger and more resource-hungry program).
Static linking may allow slightly faster startup times, but this depends to some degree on both the size and complexity of your program and on the details of the OS's loading strategy.
Some edits to include the very relevant suggestions in the comments and in other answers. I'd like to note that the way you break on this depends a lot on what environment you plan to run in. Minimal embedded systems may not have enough resources to support dynamic linking. Slightly larger small systems may well support dynamic linking because their memory is small enough to make the RAM savings from dynamic linking very attractive. Full-blown consumer PCs have, as Mark notes, enormous resources, and you can probably let the convenience issues drive your thinking on this matter.
To address the performance and efficiency issues: it depends.
Classically, dynamic libraries require some kind of glue layer which often means double dispatch or an extra layer of indirection in function addressing and can cost a little speed (but is the function calling time actually a big part of your running time???).
However, if you are running multiple processes which all call the same library a lot, you can end up saving cache lines (and thus winning on running performance) when using dynamic linking relative to using static linking. (Unless modern OS's are smart enough to notice identical segments in statically linked binaries. Seems hard, does anyone know?)
Another issue: loading time. You pay loading costs at some point. When you pay this cost depends on how the OS works as well as what linking you use. Maybe you'd rather put off paying it until you know you need it.
Note that static-vs-dynamic linking is traditionally not an optimization issue, because they both involve separate compilation down to object files. However, this is not required: a compiler can in principle, "compile" "static libraries" to a digested AST form initially, and "link" them by adding those ASTs to the ones generated for the main code, thus empowering global optimization. None of the systems I use do this, so I can't comment on how well it works.
The way to answer performance questions is always by testing (and use a test environment as much like the deployment environment as possible).

1) is based on the fact that calling a DLL function is always using an extra indirect jump. Today, this is usually negligible. Inside the DLL there is some more overhead on i386 CPU's, because they can't generate position independent code. On amd64, jumps can be relative to the program counter, so this is a huge improvement.
2) This is correct. With optimizations guided by profiling you can usually win about 10-15 percent performance. Now that CPU speed has reached its limits it might be worth doing it.
I would add: (3) the linker can arrange functions in a more cache efficient grouping, so that expensive cache level misses are minimised. It also might especially effect the startup time of applications (based on results i have seen with the Sun C++ compiler)
And don't forget that with DLLs no dead code elimination can be performed. Depending on the language, the DLL code might not be optimal either. Virtual functions are always virtual because the compiler doesn't know whether a client is overwriting it.
For these reasons, in case there is no real need for DLLs, then just use static compilation.
EDIT (to answer the comment, by user underscore)
Here is a good resource about the position independent code problem http://eli.thegreenplace.net/2011/11/03/position-independent-code-pic-in-shared-libraries/
As explained x86 does not have them AFAIK for anything else then 15 bit jump ranges and not for unconditional jumps and calls. That's why functions (from generators) having more then 32K have always been a problem and needed embedded trampolines.
But on popular x86 OS like Linux you do not need to care if the .so/DLL file is not generated with the gcc switch -fpic (which enforces the use of the indirect jump tables). Because if you don't, the code is just fixed like a normal linker would relocate it. But while doing this it makes the code segment non shareable and it would need a full mapping of the code from disk into memory and touching it all before it can be used (emptying most of the caches, hitting TLBs) etc. There was a time when this was considered slow.
So you would not have any benefit anymore.
I do not recall what OS (Solaris or FreeBSD) gave me problems with my Unix build system because I just wasn't doing this and wondered why it crashed until I applied -fPIC to gcc.

Dynamic linking is the only practical way to meet some license requirements such as the LGPL.

I agree with the points dnmckee mentions, plus:
Statically linked applications might be easier to deploy, since there are fewer or no additional file dependencies (.dll / .so) that might cause problems when they're missing or installed in the wrong place.

One reason to do a statically linked build is to verify that you have full closure for the executable, i.e. that all symbol references are resolved correctly.
As a part of a large system that was being built and tested using continuous integration, the nightly regression tests were run using a statically linked version of the executables. Occasionally, we would see that a symbol would not resolve and the static link would fail even though the dynamically linked executable would link successfully.
This was usually occurring when symbols that were deep seated within the shared libs had a misspelt name and so would not statically link. The dynamic linker does not completely resolve all symbols, irrespective of using depth-first or breadth-first evaluation, so you can finish up with a dynamically linked executable that does not have full closure.

1/ I've been on projects where dynamic linking vs static linking was benchmarked and the difference wasn't determined small enough to switch to dynamic linking (I wasn't part of the test, I just know the conclusion)
2/ Dynamic linking is often associated with PIC (Position Independent Code, code which doesn't need to be modified depending on the address at which it is loaded). Depending on the architecture PIC may bring another slowdown but is needed in order to get benefit of sharing a dynamically linked library between two executable (and even two process of the same executable if the OS use randomization of load address as a security measure). I'm not sure that all OS allow to separate the two concepts, but Solaris and Linux do and ISTR that HP-UX does as well.
3/ I've been on other projects which used dynamic linking for the "easy patch" feature. But this "easy patch" makes the distribution of small fix a little easier and of complicated one a versioning nightmare. We often ended up by having to push everything plus having to track problems at customer site because the wrong version was token.
My conclusion is that I'd used static linking excepted:
for things like plugins which depend on dynamic linking
when sharing is important (big libraries used by multiple processes at the same time like C/C++ runtime, GUI libraries, ... which often are managed independently and for which the ABI is strictly defined)
If one want to use the "easy patch", I'd argue that the libraries have to be managed like the big libraries above: they must be nearly independent with a defined ABI that must not to be changed by fixes.

Static linking is a process in compile time when a linked content is copied into the primary binary and becomes a single binary.
Cons:
compile time is longer
output binary is bigger
Dynamic linking is a process in runtime when a linked content is loaded. This technic allows to:
upgrade linked binary without recompiling a primary one that increase an ABI stability[About]
has a single shared copy
Cons:
start time is slower(linked content should be copied)
linker errors are thrown in runtime
[iOS Static vs Dynamic framework]

It is pretty simple, really. When you make a change in your source code, do you want to wait 10 minutes for it to build or 20 seconds? Twenty seconds is all I can put up with. Beyond that, I either get out the sword or start thinking about how I can use separate compilation and linking to bring it back into the comfort zone.

Best example for dynamic linking is, when the library is dependent on the used hardware. In ancient times the C math library was decided to be dynamic, so that each platform can use all processor capabilities to optimize it.
An even better example might be OpenGL. OpenGl is an API that is implemented differently by AMD and NVidia. And you are not able to use an NVidia implementation on an AMD card, because the hardware is different. You cannot link OpenGL statically into your program, because of that. Dynamic linking is used here to let the API be optimized for all platforms.

Dynamic linking requires extra time for the OS to find the dynamic library and load it. With static linking, everything is together and it is a one-shot load into memory.
Also, see DLL Hell. This is the scenario where the DLL that the OS loads is not the one that came with your application, or the version that your application expects.

On Unix-like systems, dynamic linking can make life difficult for 'root' to use an application with the shared libraries installed in out-of-the-way locations. This is because the dynamic linker generally won't pay attention to LD_LIBRARY_PATH or its equivalent for processes with root privileges. Sometimes, then, static linking saves the day.
Alternatively, the installation process has to locate the libraries, but that can make it difficult for multiple versions of the software to coexist on the machine.

Another issue not yet discussed is fixing bugs in the library.
With static linking, you not only have to rebuild the library, but will have to relink and redestribute the executable. If the library is just used in one executable, this may not be an issue. But the more executables that need to be relinked and redistributed, the bigger the pain is.
With dynamic linking, you just rebuild and redistribute the dynamic library and you are done.

Static linking includes the files that the program needs in a single executable file.
Dynamic linking is what you would consider the usual, it makes an executable that still requires DLLs and such to be in the same directory (or the DLLs could be in the system folder).
(DLL = dynamic link library)
Dynamically linked executables are compiled faster and aren't as resource-heavy.

static linking gives you only a single exe, inorder to make a change you need to recompile your whole program. Whereas in dynamic linking you need to make change only to the dll and when you run your exe, the changes would be picked up at runtime.Its easier to provide updates and bug fixes by dynamic linking (eg: windows).

There are a vast and increasing number of systems where an extreme level of static linking can have an enormous positive impact on applications and system performance.
I refer to what are often called "embedded systems", many of which are now increasingly using general-purpose operating systems, and these systems are used for everything imaginable.
An extremely common example are devices using GNU/Linux systems using Busybox. I've taken this to the extreme with NetBSD by building a bootable i386 (32-bit) system image that includes both a kernel and its root filesystem, the latter which contains a single static-linked (by crunchgen) binary with hard-links to all programs that itself contains all (well at last count 274) of the standard full-feature system programs (most except the toolchain), and it is less than 20 megabytes in size (and probably runs very comfortably in a system with only 64MB of memory (even with the root filesystem uncompressed and entirely in RAM), though I've been unable to find one so small to test it on).
It has been mentioned in earlier posts that the start-up time of a static-linked binaries is faster (and it can be a lot faster), but that is only part of the picture, especially when all object code is linked into the same file, and even more especially when the operating system supports demand paging of code direct from the executable file. In this ideal scenario the startup time of programs is literally negligible since almost all pages of code will already be in memory and be in use by the shell (and and init any other background processes that might be running), even if the requested program has not ever been run since boot since perhaps only one page of memory need be loaded to fulfill the runtime requirements of the program.
However that's still not the whole story. I also usually build and use the NetBSD operating system installs for my full development systems by static-linking all binaries. Even though this takes a tremendous amount more disk space (~6.6GB total for x86_64 with everything, including toolchain and X11 static-linked) (especially if one keeps full debug symbol tables available for all programs another ~2.5GB), the result still runs faster overall, and for some tasks even uses less memory than a typical dynamic-linked system that purports to share library code pages. Disk is cheap (even fast disk), and memory to cache frequently used disk files is also relatively cheap, but CPU cycles really are not, and paying the ld.so startup cost for every process that starts every time it starts will take hours and hours of CPU cycles away from tasks which require starting many processes, especially when the same programs are used over and over, such as compilers on a development system. Static-linked toolchain programs can reduce whole-OS multi-architecture build times for my systems by hours. I have yet to build the toolchain into my single crunchgen'ed binary, but I suspect when I do there will be more hours of build time saved because of the win for the CPU cache.

Another consideration is the number of object files (translation units) that you actually consume in a library vs the total number available. If a library is built from many object files, but you only use symbols from a few of them, this might be an argument for favoring static linking, since you only link the objects that you use when you static link (typically) and don't normally carry the unused symbols. If you go with a shared lib, that lib contains all translation units and could be much larger than what you want or need.

Load-time dynamic link library dispatching

I'd like my Windows application to be able to reference an extensive set of classes and functions wrapped inside a DLL, but I need to be able to guide the application into choosing the correct version of this DLL before it's loaded. I'm familiar with using dllexport / dllimport and generating import libraries to accomplish load-time dynamic linking, but I cannot seem to find any information on the interwebs with regard to possibly finding some kind of entry point function into the import library itself, so I can, specifically, use CPUID to detect the host CPU configuration, and make a decision to load a paricular DLL based on that information. Even more specifically, I'd like to build 2 versions of a DLL, one that is built with /ARCH:AVX and takes full advantage of SSE - AVX instructions, and another that assumes nothing is available newer than SSE2.
One requirement: Either the DLL must be linked at load-time, or there needs to be a super easy way of manually binding the functions referenced from outside the DLL, and there are many, mostly wrapped inside classes.
Bonus question: Since my libraries will be cross-platform, is there an equivalent for Linux based shared objects?

I recommend that you avoid dynamic resolution of your DLL from your executable if at all possible, since it is just going to make your life hard, especially since you have a lot of exposed interfaces and they are not pure C.
Possible Workaround
Create a "chooser" process that presents the necessary UI for deciding which DLL you need, or maybe it can even determine it automatically. Let that process move whatever DLL has been decided on into the standard location (and name) that your main executable is expecting. Then have the chooser process launch your main executable; it will pick up its DLL from your standard location without having to know which version of the DLL is there. No delay loading, no wonkiness, no extra coding; very easy.
If this just isn't an option for you, then here are your starting points for delay loading DLLs. Its a much rockier road.
Windows
LoadLibrary() to get the DLL in memory: https://msdn.microsoft.com/en-us/library/windows/desktop/ms684175(v=vs.85).aspx
GetProcAddress() to get pointer to a function: https://msdn.microsoft.com/en-us/library/windows/desktop/ms683212(v=vs.85).aspx
OR possibly special delay-loaded DLL functionality using a custom helper function, although there are limitations and potential behavior changes.. never tried this myself: https://msdn.microsoft.com/en-us/library/151kt790.aspx (suggested by Igor Tandetnik and seems reasonable).
Linux
dlopen() to get the SO in memory: http://pubs.opengroup.org/onlinepubs/009695399/functions/dlopen.html
dladdr() to get pointer to a function: http://man7.org/linux/man-pages/man3/dladdr.3.html

To add to qexyn's answer, one can mimic delay loading on Linux by generating a small static stub library which would dlopen on first call to any of it's functions and then forward actual execution to shared library. Generation of such stub library can be automatically generated by custom project-specific script or Implib.so:
# Generate stub
$ implib-gen.py libxyz.so
# Link it instead of -lxyz
$ gcc myapp.c libxyz.tramp.S libxyz.init.c

What is the runtime performance cost of including a library? [duplicate]

This question already has answers here:
Will there be a performance hit on including unused header files in C/C++?
(4 answers)
Closed 6 years ago.
Is there any runtime performance difference between including an entire library (with probably hundreds of functions) and then using only a single function like:
#include<foo>
int main(int argc, char *argv[]) {
bar();//from library foo
return 0;
}
And between pasting the relevant code fragment from the library directly into the code, like:
void bar() {
...
}
int main(int argc, char *argv[]) {
bar();//defined just above
return 0;
}
What would prevent me from mindlessly including all of my favourite (and most frequently used) libraries in the beginning of my C files? This popular thread C/C++: Detecting superfluous #includes? suggests that the compilation time would increase. But would the compiled binary be any different? Would the second program actually outperform the first one?
Related: what does #include <stdio.h> really do in a c program
Edit: the question here is different from the related Will there be a performance hit on including unused header files in C/C++? question as here there is a single file included. I am asking here if including a single file is any different from copy-pasting the actually used code fragments into the source. I have slightly adjusted the title to reflect this difference.

There is no performance difference as far as the final program is concerned. The linker will only link functions that are actually used to your program. Unused functions present in the library will not get linked.
If you include a lot of libraries, it might take longer time to compile the program.
The main reason why you shouldn't include all your "favourite libraries" is program design. Your file shouldn't include anything except the resources it is using, to reduce dependencies between files. The less your file knows about the rest of the program, the better. It should be as autonomous as possible.

This is not such simply question and so does not deserve a simple answer. There are a number of things that you may need to consider when determining what is more performant.
Your Compiler And Linker: Different compilers will optimize in different ways. This is something that is easily overlooked, and can cause some issues when making generalisations. For the most part modern compilers and linkers will optimize the binary to only include what is 100% necessary for executions. However not all compilers will optimize your binary.
Dynamic Linking: There are two types of linking when using other libraries. They behave in similar ways however are fundamentally different. When you link against a dynamic library the library will remain separate from the program and only execute at runtime. Dynamic libraries are usually known as shared libraries and are therefore should be treated as if they are used by multiple binaries. Because these libraries are often shared, the linker will not remove any functionality from the library as the linker does not know what parts of that library will be needed by all binaries within that system or OS. Because of this a binary linked against a dynamic library will have a small performance hit, especially immediately after starting the program. This performance hit will increase with the number of dynamic linkages.
Static Linking: When you link a binary against a static library (with an optimizing linker) the linker will 'know' what functionality you will need from that particular library and will remove functionality that will not be used in your resulting binary. Because of this the binary will become more efficient and therefore more performant. This does however come at a cost.
e.g.
Say you have a an operating system that uses a library extensively throughout a large number of binaries throughout the entire system. If you were to build that library as a shared library, all binaries will share that library, whilst perhaps using different functionality. Now say you statically link every binary against a library. You would end up with and extensively large duplication of binary functionality, as each binary would have a copy of the functionality it needed from that library.
Conclusion: It is worth noting that before asking the question what will make my program more performant, you should probably ask yourself what is more performant in your case. Is your program intended to take up the majority of your CPU time, probably go for a statically linked library. If your program is only run occasionally, probably go for a dynamically linked library to reduce disk usage. It is also worth noting that using a header based library will only give you a very marginal (if at all) performance gain over a statically linked binary, and will greatly increase your compilation time.

It depends greatly on the libraries and how they are structured, and possible on the compiler implementation.
The linker (ld) will only assemble code from the library that is referenced by the code, so if you have two functions a and b in a library, but only have references to a then function b may not be in the final code at all.
Header files (include), if they only contain declarations, and if the declarations does not result in references to the library, then you should not see any difference between just typing out the parts you need (as per your example) and including the entire header file.
Historically, the linker ld would pull code by the files, so as along as every function a and b was in different files when the library was created there would be no implications at all.
However, if the library is not carefully constructed, or if the compiler implementation does pull in every single bit of code from the lib whether needed or not, then you could have performance implications, as your code will be bigger and may be harder to fit into CPU cache, and the CPU execution pipeline would have to occasional wait to fetch instructions from main memory rather than from cache.

It depends heavily on the libraries in question.
They might initialize global state which would slow down the startup and/or shutdown of the program. Or they might start threads that do something in parallel to your code. If you have multiple threads, this might impact performance, too.
Some libraries might even modify existing library functions. Maybe to collect statistics about memory or thread usage or for security auditing purposes.

Dynamically re-linking library optimized for target CPU

I have a library that I would like to compile for different CPU generations (all x86), say I want one to make use of all the instructions Skylake has to offer, a fallback version for Haswell, one for Sandy Bridge and so on. I know I can achieve this with -march=xyz, however, I also want to make it easy to use the library. Ideally the user would only have to link against a stub library which then dynamically loads in the correct library optimized for the targets CPU. I know I can achieve this by exporting function pointers in the loader library which are then populated correctly, however, that doesn't really lend itself to a C++ applications since I can't export class methods for example.
Is there any way to force the dynamic linker to re-do the relocation? Ideally the loader library would provide either the most fallback version of the code that runs on all CPUs supported by the library, but can then load a more specialized version, or it really is just a stub library that gets replaced at runtime. Either way, ideally I would like to make it so that the user only ever has to link against one dynamic library and the rest happens automatically at runtime.
Ideally I'd like an answer that works both for Linux and OS X, but I also take one that's only applicable for one of these.

why do game engines prefer static libraries over dynamic link libraries

I've been reading a few gaming books. And they always prefer to create the engine as a static library over dynamic link. I am new to c++ so I am not highly knowledge when it comes to static libraries and dynamic link libraries. All I know is static libraries increase the size of your program, where DLL link libraries are loaded as you need them within your program.
[edit]
I've played games where it almost seemed they used DLL's to load in sound, lighting, and what not all individually. as the level was loading up. cause you don't necessarily need that when your at the game menu.

Dynamic link libraries need to be position independent; this can cause performance inefficiencies on some processor architectures.
Static libraries can be optimized when included in your program, e.g., by stripping dead code. This can improve cache performance.

By position independent, he means that since the game engine and DLL are completely separated, the DLL is stand-alone and cannot be interwoven into the game engine code, whereas statically linking a library allows the compiler to optimize using both your game engine code AND the library code.
For example, say there's a small function that the compiler thinks should be inlined (copied directly in place of a function call). Then with a static library, the compiler would be able to inline this code, since it knows what the code is (you're linking at compile-time). However, with a dynamic library, the compiler would be unable to inline that code, since it does not know what the code is (since it will be linking at run-time).

Another often overlooked reason which deserves mention is that for many games you aren't going to be running lots of other stuff, and many libraries that are used for games aren't going to be used for the other things that you may be running at the same time as a game, so you don't have to worry about one of the major positives you get from using shared libraries, which is that only one copy of (most of) the library needs to loaded at one time while several things can make use of that one copy. When running a game you will probably only have one program that would want to use that library running anyway because you probably aren't going to be running many other programs (particularly other games or 3D programs) at the same time.
You also open up the possibility of global/link time optimization, which is much more difficult with shared libraries.

Another question covers the differences between static and dynamic libraries: When to use dynamic vs. static libraries
As for why they use static libraries, the extra speed may be worth it and you can avoid DLL hell (was a big problem in the past). It's also useful if you want to distribute your program and libraries together, ensuring the recipient has the correct dependencies, though there's nothing stopping you from distributing DLLs together with the executable.

When developing games for a console, often dynamic linking isn't an option. If you want to use the engine for both console and PC development, it would be best to avoid dynamic linking.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js