Why aren't all Boost libraries header-only?
To put it differently, what makes the use of a .lib/.dll mandatory?
Is it when a class can't be a template or has static fields?
Different points, I guess.
Binary size. Could header-only put a size burden on the client?
Compilation times. Could header-only mean a significant decrease in compilation performance?
Runtime Performance. Could header-only give superior performance?
Restrictions. Does the design require header-only?
About binary size (and a bit of security).
If there's a lot of reachable code in the Boost library, or code about which the compiler cannot decide whether the client reaches it, it has to be put into the final binary. (*)
On operating systems that have package management (e.g. RPM- or .deb-based), shared libraries can mean a big decrease in binary distribution size and have a security advantage: security fixes are distributed faster and are then automatically used by all .so/.DLL users. So you have one recompile and one redistribution, but N beneficiaries. With a header-only library, you have N recompiles and N redistributions for every fix, and some members of those N are already huge in themselves.
(*) reachable here means "potentially executed"
About compilation times.
Some Boost libraries are huge. If you #include all of it, then every time you change a bit of your source file, you have to recompile everything you #included.
This can be countered with cherry-picked headers, e.g.
#include <boost/huge-boost-library.hpp> // < BAD
#include <boost/huge-boost-library/just-a-part-of-it.hpp> // < BETTER
but sometimes the stuff you really need to include is already big enough to cripple your recompiles.
The countermeasure is to make it a static or shared library, which in turn means "compile it completely exactly once (until the next Boost update)".
About runtime performance.
We are still not in an age where global optimization solves all of our C++ performance problems. To make sure you give the compiler all the information it needs, you can make stuff header-only and let the compiler make inlining decisions.
In that respect, note that inlining does not always give superior performance, because of caching and speculation issues on the CPU.
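As a minimal sketch (the function names are invented for illustration), a definition that is visible in the header can be inlined at every call site, whereas a declaration-only header forces an out-of-line call unless link-time optimization steps in:

// clamp.hpp -- header-only: the definition travels with the declaration
inline int clamp_to_byte(int v) {
    return v < 0 ? 0 : (v > 255 ? 255 : v);  // body visible, so the compiler may inline it
}

// clamp_decl.hpp -- library flavour: only the declaration is visible
int clamp_to_byte_lib(int v);  // body lives in a .cpp compiled into the library; without
                               // LTO the compiler has to emit a real call at each use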
Note also that this argument mostly applies to Boost libraries that are used frequently enough, e.g. one could expect boost::shared_ptr<> to be used very often and thus be a relevant performance factor.
But consider the real and only relevant reason boost::shared_ptr<> is header-only ...
About restrictions.
Some stuff in C++ cannot be put into compiled libraries, namely templates and enumerations.
But note that this is only halfway true. You can write typesafe, templated interfaces to your real data structures and algorithms, which in turn have their runtime-generic implementation in a library.
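A rough sketch of that pattern (all names here are invented): the header exposes a small type-safe template, while the type-erased work is compiled once into the library:

// registry.hpp -- shipped as a header
void* registry_get(const char* key);             // implemented once in registry.cpp,
void  registry_put(const char* key, void* val);  // which is compiled into the library

template <typename T>
T* get_as(const char* key) {                     // thin, type-safe template wrapper
    return static_cast<T*>(registry_get(key));   // the real work happens in the library
}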
Likewise, some stuff in C++ should be put into source files, and in the case of Boost, into libraries. Basically, this is everything that would give "multiple definition" errors, like static member variables or global variables in general.
Some examples can also be found in the standard library: std::cout is defined in the standard as extern ostream cout;, and so cout basically requires the distribution of something (a library or source file) that defines it once and only once.
Related
There are several discussions on forums about shared vs. static libraries regarding performance. But how do those approaches compare to compiling the code altogether?
In my case, I have a class (the evaluation code) that contains a few methods that contain several for loops and that will be called several times by a method from another class (the evaluator code). I have not finished implementing and testing everything yet. But, for the sake of performance, I am wondering if I should compile all the files altogether (compiler optimization advantages?), or compile some files separately to generate static or shared libraries.
These approaches will depend on your compiler and options:
Not using libraries:
A good compiler and build system will cache results, and this should be just as fast as the other two. In practice, many code bases have less-than-optimal compartmentalization, leading to slow compile times; the classic approach is to break the thing apart into libraries.
Static:
Building might be slower than with dynamic linking, because static linking gives an opportunity to run link-time optimization (LTO), which can take a while.
Dynamic:
Might be slower at runtime when you only call a small number of functions, because of the specifics of how dynamic loading is implemented.
In conclusion, unless you're working on some monster project where you're worried about people mucking up the build system, keep it all in one project and avoid needlessly complicated debugging.
Related: Will there be a performance hit on including unused header files in C/C++?
Is there any runtime performance difference between including an entire library (with probably hundreds of functions) and then using only a single function like:
#include <foo>

int main(int argc, char *argv[]) {
    bar(); // from library foo
    return 0;
}
And between pasting the relevant code fragment from the library directly into the code, like:
void bar() {
    ...
}

int main(int argc, char *argv[]) {
    bar(); // defined just above
    return 0;
}
What would prevent me from mindlessly including all of my favourite (and most frequently used) libraries in the beginning of my C files? This popular thread C/C++: Detecting superfluous #includes? suggests that the compilation time would increase. But would the compiled binary be any different? Would the second program actually outperform the first one?
Related: what does #include <stdio.h> really do in a c program
Edit: the question here is different from the related Will there be a performance hit on including unused header files in C/C++? question as here there is a single file included. I am asking here if including a single file is any different from copy-pasting the actually used code fragments into the source. I have slightly adjusted the title to reflect this difference.
There is no performance difference as far as the final program is concerned. The linker will only link the functions that are actually used into your program. Unused functions present in the library will not get linked.
If you include a lot of libraries, it might take longer to compile the program.
The main reason why you shouldn't include all your "favourite libraries" is program design. Your file shouldn't include anything except the resources it is using, to reduce dependencies between files. The less your file knows about the rest of the program, the better. It should be as autonomous as possible.
This is not such a simple question, and so it does not deserve a simple answer. There are a number of things that you may need to consider when determining what is more performant.
Your Compiler And Linker: Different compilers will optimize in different ways. This is something that is easily overlooked and can cause some issues when making generalisations. For the most part, modern compilers and linkers will optimize the binary to include only what is strictly necessary for execution. However, not all compilers will optimize your binary.
Dynamic Linking: There are two types of linking when using other libraries. They behave in similar ways but are fundamentally different. When you link against a dynamic library, the library remains separate from the program and is only loaded at runtime. Dynamic libraries are usually known as shared libraries and should therefore be treated as if they are used by multiple binaries. Because these libraries are often shared, the linker will not remove any functionality from the library, as the linker does not know what parts of that library will be needed by all binaries within that system or OS. Because of this, a binary linked against a dynamic library will take a small performance hit, especially immediately after starting the program. This performance hit will increase with the number of dynamic linkages.
Static Linking: When you link a binary against a static library (with an optimizing linker) the linker will 'know' what functionality you will need from that particular library and will remove functionality that will not be used in your resulting binary. Because of this the binary will become more efficient and therefore more performant. This does however come at a cost.
e.g.
Say you have an operating system that uses a library extensively across a large number of binaries throughout the entire system. If you were to build that library as a shared library, all binaries share it, whilst perhaps using different parts of its functionality. Now say you instead statically link every binary against that library. You end up with extensive duplication, as each binary carries its own copy of the functionality it needs from that library.
Conclusion: Before asking what will make your program more performant, ask yourself what "performant" means in your case. If your program is intended to take up the majority of your CPU time, probably go for a statically linked library. If your program is only run occasionally, probably go for a dynamically linked library to reduce disk usage. It is also worth noting that using a header-based library will only give you a very marginal (if any) performance gain over a statically linked binary, and will greatly increase your compilation time.
It depends greatly on the libraries and how they are structured, and possibly on the compiler implementation.
The linker (ld) will only assemble code from the library that is referenced by your code, so if you have two functions a and b in a library but only have references to a, then function b may not be in the final code at all.
Header files (includes), if they only contain declarations, and if the declarations do not result in references to the library, should make no difference between just typing out the parts you need (as per your example) and including the entire header file.
Historically, the linker ld would pull in code file by file, so as long as functions a and b were in different files when the library was created, there would be no implications at all.
However, if the library is not carefully constructed, or if the compiler implementation does pull in every single bit of code from the lib whether needed or not, then you could have performance implications, as your code will be bigger and may be harder to fit into the CPU cache, and the CPU execution pipeline would have to occasionally wait to fetch instructions from main memory rather than from cache.
It depends heavily on the libraries in question.
They might initialize global state which would slow down the startup and/or shutdown of the program. Or they might start threads that do something in parallel to your code. If you have multiple threads, this might impact performance, too.
Some libraries might even modify existing library functions. Maybe to collect statistics about memory or thread usage or for security auditing purposes.
In C++ the standard library is wrapped in the std namespace and the programmer is not supposed to define anything inside that namespace. Of course the standard include files don't step on each other's names inside the standard library (so it's never a problem to include a standard header).
Then why isn't the whole standard library included by default instead of forcing programmers to write for example #include <vector> each time? This would also speed up compilation as the compilers could start with a pre-built symbol table for all the standard headers.
Pre-including everything would also solve some portability problems: for example, when you include <map>, it is defined which symbols are brought into the std namespace, but it is not guaranteed that other standard symbols are not loaded into it as well, and for example you could end up (in theory) with std::vector also becoming available.
It happens sometimes that a programmer forgets to include a standard header but the program compiles anyway because of an include dependency of the specific implementation. When moving the program to another environment (or just another version of the same compiler), the same source code could however stop compiling.
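For instance (whether this compiles is implementation-specific, which is exactly the problem):

#include <iostream>   // <string> is never included directly

int main() {
    std::string s = "hello";  // compiles only if this implementation's <iostream>
    std::cout << s << '\n';   // happens to pull in <string> transitively
    return 0;
}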
From a technical point of view I can imagine a compiler just preloading (with mmap) an optimal perfect-hash symbol table for the standard library.
This should be faster to do than loading and doing a C++ parse of even a single standard include file and should be able to provide faster lookup for std:: names. This data would also be read-only (thus probably allowing a more compact representation and also shareable between multiple instances of the compiler).
These are however just shoulds as I never implemented this.
The only downside I see is that we C++ programmers would lose compilation coffee breaks and Stack Overflow visits :-)
EDIT
Just to clarify: the main advantage I see is for programmers who today, despite the C++ standard library being a single monolithic namespace, are required to know which sub-part (include file) contains which function/class. To add insult to injury, when they make a mistake and forget an include file, the code may still compile or not depending on the implementation (thus leading to non-portable programs).
The short answer is that this is not the way the C++ language is supposed to be used.
There are good reasons for that:
namespace pollution - even if this could be mitigated, because the std namespace is supposed to be self-coherent and programmers are not forced to use using namespace std;. But including the whole library combined with using namespace std; would certainly lead to a big mess...
forcing the programmer to declare the modules he wants to use avoids inadvertently calling the wrong standard function, because the standard library is now huge and not all programmers know all of its modules
history: C++ still has a strong inheritance from C, where namespaces do not exist and where the standard library is supposed to be used like any other library.
In the direction you suggest, the Windows API is an example where you have only one big include (windows.h) that loads many other, smaller include files. And in fact, precompiled headers allow that to be fast enough.
So IMHO a new language deriving from C++ could decide to automatically declare the whole standard library. A new major release could also do it, but it could break code that makes intensive use of the using namespace directive or that has custom implementations using the same names as some standard modules.
But all common languages that I know (C#, Python, Java, Ruby) require the programmer to declare the parts of the standard library that he wants to use, so I suppose that systematically making every piece of the standard library available is still more awkward than really useful for the programmer, at least until someone finds a way to declare the parts that should not be loaded - that's why I spoke of a new derivative of C++.
Most of the C++ standard library is template-based, which means that the code it generates ultimately depends on how you use it. In other words, there is very little that could be compiled before you instantiate a template, as in std::vector<MyType> m_collection;.
Also, C++ is probably the slowest language to compile, and there is a lot of parsing work that compilers have to do when you #include a header file that also includes other headers.
Well, first things first: C++ tries to adhere to "you only pay for what you use".
The standard-library is sometimes not part of what you use at all, or even of what you could use if you wanted.
Also, you can replace it if there's a reason to do so: See libstdc++ and libc++.
That means just including it all without question isn't actually such a bright idea.
Anyway, the committee is slowly plugging away at creating a module system (it takes lots of time; hopefully it will be ready for C++1z: C++ Modules - why were they removed from C++0x? Will they be back later on?), and when that's done, most downsides to including more of the standard library than strictly necessary should disappear, and the individual modules should more cleanly exclude symbols they need not contain.
Also, as those modules are pre-parsed, they should give the compilation-speed improvement you want.
You offer two advantages of your scheme:
Compile-time performance. But nothing in the standard prevents an implementation from doing what you suggest[*] with a very slight modification: that the pre-compiled table is only mapped in when the translation unit includes at least one standard header. From the POV of the standard, it's unnecessary to impose potential implementation burden over a QoI issue.
Convenience to programmers: under your scheme we wouldn't have to specify which headers we need. We do this in order to support C++ implementations that have chosen not to implement your idea of making the standard headers monolithic (which currently is all of them), and so from the POV of the C++ standard it's a matter of "supporting existing practice and implementation freedom at a cost to programmers considered acceptable". Which is kind of the slogan of C++, isn't it?
Since no C++ implementation (that I know of) actually does this, my suspicion is that in point of fact it does not grant the performance improvement you think it does. Microsoft provides precompiled headers (via stdafx.h) for exactly this performance reason, and yet it still doesn't give you an option for "all the standard libraries"; instead it requires you to say which ones you want in. It would be dead easy for this or any other implementation to provide an implementation-specific header defined to have the same effect as including all standard headers. This suggests to me that, at least in Microsoft's opinion, there would be no great overall benefit to providing that.
If implementations were to start providing monolithic standard libraries with a demonstrable compile-time performance improvement, then we'd discuss whether or not it's a good idea for the C++ standard to continue permitting implementations that don't. As things stand, it has to.
[*] Except perhaps for the fact that <cassert> is defined to have different behaviour according to the definition of NDEBUG at the point it's included. But I think implementations could just preprocess the user's code as normal, and then map in one of two different tables according to whether it's defined.
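Concretely, <cassert> is the one standard header that is explicitly meant to be re-included with different effects:

#include <cassert>   // assert() is active here (NDEBUG not defined)

#define NDEBUG
#include <cassert>   // assert() now expands to ((void)0) for the rest of the file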
I think the answer comes down to C++'s philosophy of not making you pay for what you don't use. It also gives you more flexibility: you aren't forced to use parts of the standard library if you don't need them. And then there's the fact that some platforms might not support things like throwing exceptions or dynamically allocating memory (like the processors used in the Arduino, for example). And there's one other thing you said that is not quite right: as long as it's not a template class, you are allowed to add specializations (such as of std::swap) to the std namespace for your own classes.
First of all, I am afraid that having a prelude is a bit late to the game. Or rather, seeing as preludes are not easily extensible, we have to content ourselves with a very thin one (built-in types...).
As an example, let's say that I have a C++03 program:
#include <boost/unordered_map.hpp>
using namespace std;
using boost::unordered_map;
static unordered_map<int, string> const Symbols = ...;
It all works fine, but suddenly when I migrate to C++11:
error: ambiguous symbol "unordered_map", do you mean:
- std::unordered_map
- boost::unordered_map
Congratulations, you have invented the least backward compatible scheme for growing the standard library (just kidding, whoever uses using namespace std; is to blame...).
Alright, let's not pre-include them, but still bundle the perfect hash table anyway. The performance gain would be worth it, right?
Well, I seriously doubt it. First of all because the Standard Library is tiny compared to most other header files that you include (hint: compare it to Boost). Therefore the performance gain would be... smallish.
Oh, not all programs are big; but the small ones compile fast already (by virtue of being small) and the big ones include much more code than the Standard Library headers so you won't get much mileage out of it.
Note: and yes, I did benchmark the file look-up in a project with "only" a hundred -I directives; the conclusion was that pre-computing the "include path" to "file location" map and feeding it to gcc resulted in a 30% speed-up (after using ccache already). Generating it and keeping it up-to-date was complicated, so we never used it...
But could we at least include a provision that the compiler could do it in the Standard?
As far as I know, it is already included. I cannot remember if there is a specific blurb about it, but the Standard Library is really part of the "implementation" so resolving #include <vector> to an internal hash-map would fall under the as-if rule anyway.
But they could do it, still!
And lose any flexibility. For example, Clang can use either libstdc++ or libc++ on Linux, and I believe it to be compatible with the Dinkumware derivative that ships with VC++ (or if not completely, at least largely).
This is another point of customization: if the Standard library does not fit your needs or your platforms, then by virtue of being treated mostly like any other library, you can replace part or most of it with relative ease.
But! But!
#include <stdafx.h>
If you work on Windows, you will recognize it. This is called a pre-compiled header. It must be included first (or all benefits are lost), and in exchange, instead of parsing files you are pulling in an efficient binary representation of those parsed files (i.e., a serialized AST version, possibly with some type resolution already performed), which shaves off maybe 30% to 50% of the work. Yep, this is close to your proposal; this is Computer Science for you, there's always someone else who thought about it first...
Clang and gcc have a similar mechanism; though from what I've heard it can be so painful to use that people prefer the more transparent ccache in practice.
And all of these will come to naught with modules.
This is the true solution to this pre-processing/parsing/type-resolving madness. Because modules are truly isolated (ie, unlike headers, not subject to inclusion order), an efficient binary representation (like pre-compiled headers) can be pre-computed for each and every module you depend on.
This not only means the Standard Library, but all libraries.
Your solution, more flexible, and dressed to the nines!
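For reference, the header-unit syntax that eventually shipped with C++20 looks roughly like this (compiler and build-system support still varies, so treat it as a sketch):

import <vector>;   // the compiler loads a pre-built binary representation
import <string>;   // instead of re-tokenising and re-parsing the header text

int main() {
    std::vector<std::string> names{"alpha", "beta"};
    return static_cast<int>(names.size());
}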
One could use an alternative implementation of the C++ Standard Library to the one shipped with the compiler. Or wrap headers with one's definitions, to add, enable or disable features (see GNU wrapper headers). Plain text headers and the C inclusion model are a more powerful and flexible mechanism than a binary black box.
I have seen a couple of questions on how to detect unnecessary #include files in a C++ project. This question has often intrigued me, but I have never found a satisfactory answer.
If some header files are included but not used in a C++ project, is that an overhead? I understand that it means that before compilation the contents of all the header files will be copied into the including source files, and that will result in a lot of unnecessary compilation.
How far does this kind of overhead spread to the compiled object files and binaries?
Aren't compilers able to do some optimizations to make sure that this kind of overhead is not transferred to the resulting object files and binaries?
Considering the fact that I probably know nothing about compiler optimization, I still want to ask this, in case there is an answer.
As a programmer who uses a wide variety of C++ libraries for his work, what kind of programming practices should I follow to keep avoiding such overheads? Is making myself intimately familiar with each library's workings the only way out?
It does not affect the performance of the binary or even the contents of the binary file, for almost all headers. Declarations generate no code at all, inline/static/anonymous-namespace definitions are optimized away if they aren't used, and no header should include externally visible definitions (that breaks if the header is included by more than one translation unit).
As @T.C. points out, the exception is internally visible static objects with nontrivial constructors. iostream does this, for example. The program must behave as if the constructor is called, and the compiler usually doesn't have enough information to optimize the constructor away.
It does, however, affect how long compilation takes and how many files will be recompiled when a header is changed. For large projects, this is enough incentive to care about unnecessary includes.
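A minimal sketch of the exception mentioned above (the class name is made up; <iostream> does the same thing with its std::ios_base::Init object):

// tracing.hpp -- merely including this header adds work at program start-up
struct TraceInit {
    TraceInit();                 // nontrivial constructor, defined in the library's .cpp
};
static TraceInit g_trace_init;   // one internal copy per translation unit that includes
                                 // this header; each constructor runs before main()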
Besides the obviously longer compile times, there might be other issues. The most important one, IMHO, is dependencies on external libraries. You don't want your program to depend on more libraries than necessary.
You also then need to install those libraries on every system you want the program to build on. This can become a nightmare, especially when the next programmer needs to install some database client library although the program never uses a database.
Also, library headers in particular often tend to define macros. Sometimes those macros have very generic names, which will break your code or be incompatible with other library headers you might actually need.
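A classic example of such a collision is the min macro that <windows.h> defines unless NOMINMAX is set; the #define below just stands in for what such a header does:

#include <algorithm>

#define min(a, b) (((a) < (b)) ? (a) : (b))   // stand-in for the library header's macro

int smallest(int a, int b) {
    // std::min(a, b) would be rewritten by the preprocessor here and fail to compile;
    // parenthesising the name suppresses function-like macro expansion:
    return (std::min)(a, b);
}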
Of course any #include is an overhead. The compiler needs to parse that file.
So avoid them. Use forward declarations wherever possible.
It will speed up compilation. See Scott Meyers' book on the subject.
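A small sketch of what that looks like in practice (Widget and Engine are invented names):

// widget.hpp
class Engine;                     // forward declaration: enough for pointers and references

class Widget {
public:
    explicit Widget(Engine& engine);
private:
    Engine* engine_;              // no #include "engine.hpp" needed in this header
};
// engine.hpp is included only in widget.cpp, so changes to Engine no longer
// trigger a recompile of every file that includes widget.hpp.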
The simple answer is YES, it is an overhead as far as compilation is concerned, but at runtime it is hardly going to make any difference. The reason: let's say you add #include <iostream> (just for example) and assume that you are not using any of its functions; then g++ 4.5.2 has some additional 18,560 lines of code to process at compilation. But as far as runtime overhead is concerned, I hardly think it creates a performance issue.
You can also refer to Are unused includes harmful in C/C++?, where I really liked this point made by David Young:
Any singletons declared as external in a header and defined in a source file will be included in your program. This obviously increases memory usage and possibly contributes to a performance overhead by causing one to access their page file more often (not much of a problem now, as singletons are usually small-to-medium in size and because most people I know have 6+ GB of RAM).
I've noticed that when I use a boost feature the app size tends to increase by about .1 - .3 MB. This may not seem like much, but compared to using other external libraries it is (for me at least). Why is this?
Boost uses templates everywhere. These templates can be instantiated multiple times with the same parameters. A sufficiently smart linker will throw out all but one copy. However, not all linkers are sufficiently smart. Also, templates are instantiated implicitly sometimes and it's hard to even know how many times one has been instantiated.
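One way to keep a lid on this with C++11 is an explicit instantiation in a single translation unit plus extern template declarations everywhere else (sketched here with an invented type):

// common.hpp
#include <vector>
struct MyType { int id; };
extern template class std::vector<MyType>;   // tell other TUs not to instantiate this

// common.cpp -- compiled exactly once, e.g. into your library
#include "common.hpp"
template class std::vector<MyType>;          // the one explicit instantiation definition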
"so much" is a comparative term, and I'm afraid you're comparing apples to oranges. Just because other libraries are smaller doesn't imply you should assume Boost is as small.
Look at the sheer amount of work Boost does for you!
I doubt making a custom library with the same functionality would be considerably smaller. The only valid comparison to make is "Boost's library that does X" versus "another library that does X", not "Boost's library that does X" and "another library that does Y".
The filesystem library is very powerful, and this means lots of functions and lots of back-bone code to provide you and me with a simple interface. Also, as others mentioned, templates in general can increase code size, but it's not as if that is avoidable. Templated or hand-coded, either one will result in code of the same size. The only difference is that templates are much easier.
It all depends on how it is used. Since Boost is a bunch of templates, it causes a bunch of member functions to be compiled per type used. If you use Boost with n types, the member functions are instantiated (by the C++ template machinery) n times, once for each type.
Boost consists primarily of very generalized and sometimes quite complex templates, which means that types and functions are created by the compiler as required by usage, not simply by declaration. In other words, a small amount of source code can produce a significant quantity of object code to satisfy all variations of the templates declared or used. Boost also depends on the standard libraries, pulling in those dependencies as well. However, the most significant contribution is the fact that Boost's source code exists almost entirely in include files. Including standard C include files (outside of the STL) typically brings in very little source code, mostly prototypes, small macros, or type declarations without their implementations. Boost contains most of its implementations in its include files.