Are there any performance implications to including every header? - c++

Let's say I want to use the hex() function. I know it is defined in the <ios> header, and I also know that <iostream> pulls it in as well. The difference is that <iostream> brings in many more functions and other things I don't need.
From a performance standpoint, should I care about including/declaring fewer functions, classes, etc. rather than more?

There is no run time performance hit.
However, there could be excessive compile time hit if tons of unnecessary headers are included.
Also, unnecessary includes can create unnecessary recompiles: if a header is changed, every file that includes it must be recompiled, even files that never actually use it.
In small projects (with small headers included), this doesn't matter. As a project grows, it may.

If the standard says it is defined in header <ios> then include header <ios> because you can't guarantee it will be included in/through any other header.
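As a minimal sketch of that advice (hypothetical example), the following formats a number in hex while pulling in only headers guaranteed to declare what it uses:
// Only include what is used: <ios> for std::hex, <sstream> for std::ostringstream,
// <string> for std::string::c_str, <cstdio> for std::puts.
#include <ios>
#include <sstream>
#include <string>
#include <cstdio>

int main() {
    std::ostringstream out;
    out << std::hex << 255;           // formats as "ff"
    std::puts(out.str().c_str());     // print without dragging in <iostream>
    return 0;
}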

TL;DR: In general, it is better to only include what you need. Including more can have an adverse effect on binary size and startup (should be insignificant), but mostly hurts compilation-time without precompiled headers.
Well, naturally you have to include at least a set of headers that, together, is guaranteed to cover all your uses.
It might sometimes happen to "work" anyway, because the standard C++ headers are all allowed to include each other as the implementer sees fit, and the headers are allowed to declare additional symbols in the std namespace anyway (see Why is "using namespace std" considered bad practice?).
Next, sometimes including an additional header might lead to creation of additional objects (see std::ios_base::Init), though a well-designed library minimizes such (that is the only instance in the standard library, as far as I know).
But the big issue isn't actually the size and efficiency of the compiled (and optimized) binary (which should be unaffected, aside from the previous point, whose effect should be minuscule), but compilation time while actively developing (see also How does #include <bits/stdc++.h> work in C++?).
And the latter is adversely affected by adding superfluous headers, severely enough that the committee has been working on a modules proposal (see C++ Modules - why were they removed from C++0x? Will they be back later on?).
Unless, naturally, you are using precompiled headers (see Why use Precompiled Headers (C/C++)?): in that case, including more in the precompiled headers, and thus everywhere instead of only where needed, will actually reduce compile times most of the time, as long as those headers are not modified.
There is a clang-based tool for finding out the minimum headers, called include-what-you-use.
It analyzes the clang AST to decide that, which is both a strength and a weakness:
You don't need to teach it about all the symbols a header makes available, but it also doesn't know whether things just worked out that way in that revision, or whether they are contractual.
So you need to double-check its results.

Including unnecessary headers has the following downsides.
Longer compile time; the linker also has to discard all the unused symbols.
If you have added extra headers in a .cpp file, it only affects your own code.
But if you are distributing your code as a library and you have added unnecessary headers in your header files, client code will be burdened with locating the headers that you have used.
Do not rely on indirect inclusion; include the header in which the required function is actually declared.
Also, as a good programming practice, headers in a project should be included in order of decreasing dependency:
//local header -- most dependent on other headers
#include <project/impl.hpp>
//Third party library headers -- moderately dependent on other headers
#include <boost/optional.hpp>
//standard C++ header -- least dependent on other header
#include <string>
And what won't be affected is run time: the linker gets rid of unused symbols at link time.

Including unneeded header files has some value.
It takes less coding effort to cut and paste the usual set of needed includes. Of course, later coding is then encumbered by not knowing what was truly needed.
Especially in C, with its limited name space control, including unneeded headers promptly detects collisions. Say the code defined a global non-static variable or function that happened to match a standard name, like erfc(), to do some text processing. By including <math.h>, the collision with double erfc(double x) is detected at compile time, even though this .c file does no FP math (other .c files do).
#include <math.h>
char *erfc(char *a, char *b);  /* error: conflicting declaration - <math.h> declares double erfc(double) */
OTOH, had this .c file not included <math.h>, the collision would only be detected at link time. The impact of this delayed notice could be great if the code base did not need FP math for years and now does, only to discover that char *erfc(char *a, char *b) is used in many places.
IMO: Make a reasonable effort not to include unneeded header files, but do not worry about including a few extra, especially if they are common ones. If an automated method exists, use it to control header file inclusion.

Related

The use of double include guards in C++

So I recently had a discussion where I work, in which I was questioning the use of a double include guard over a single guard. What I mean by double guard is as follows:
Header file, "header_a.hpp":
#ifndef __HEADER_A_HPP__
#define __HEADER_A_HPP__
...
...
#endif
When including the header file anywhere, either in a header or source file:
#ifndef __HEADER_A_HPP__
#include "header_a.hpp"
#endif
Now I understand that the use of the guard in header files is to prevent multiple inclusion of an already defined header file, it's common and well documented. If the macro is already defined, the entire header file is seen as 'blank' by the compiler and the double inclusion is prevented. Simple enough.
The issue I don't understand is using #ifndef __HEADER_A_HPP__ and #endif around the #include "header_a.hpp". I'm told by the coworker that this adds a second layer of protection to inclusions but I fail to see how that second layer is even useful if the first layer absolutely does the job (or does it?).
The only benefit I can come up with is that it outright stops the linker from bothering to find the file. Is this meant to improve compilation time (which was not mentioned as a benefit), or is there something else at work here that I am not seeing?
I am pretty sure that it is a bad practice to add another include guard like:
#ifndef __HEADER_A_HPP__
#include "header_a.hpp"
#endif
Here are some reasons why:
To avoid double inclusion it is enough to add a usual include guard inside the header file itself. It does the job well. Another include guard at the place of inclusion just clutters the code and reduces readability.
It adds unnecessary dependencies. If you change the include guard inside the header file, you have to change it in every place where the header is included.
Opening an already-guarded header is definitely not the most expensive operation compared to the whole compilation/linking process, so skipping it can hardly reduce the total build time.
Any compiler worth anything already optimizes file-wide include-guards.
The reason for putting include guards in the header file is to prevent the contents of the header from being pulled into a translation unit more than once. That's normal, long-established practice.
The reason for putting redundant include guards in a source file is to avoid having to open the header file that's being included, and back in the olden days that could significantly speed up compilation. These days, opening a file is much faster than it used to be; further, compilers are pretty smart about remembering which files they've already seen, and they understand the include guard idiom, so can figure out on their own that they don't need to open the file again. That's a bit of hand-waving, but the bottom line is that this extra layer isn't needed any more.
EDIT: another factor here is that compiling C++ is far more complicated than compiling C, so it takes far longer, making the time spent opening include files a smaller, less significant part of the time it takes to compile a translation unit.
The only benefit I can come up with is that it outright stops the linker from bothering to find the file.
The linker will not be affected in any way.
It could prevent the pre-processor from bothering to find the file, but if the guard is defined, that means that it has already found the file. I suspect that if the preprocessing time is reduced at all, the effect would be quite minimal except in the most pathologically recursively included monstrosity.
It has a downside that if the guard is ever changed (for example due to conflict with another guard), all the conditionals before the include directives must be changed in order for them to work. And if something else uses the previous guard, then the conditionals must be changed for the include directive itself to work correctly.
P.S. __HEADER_A_HPP__ is a symbol that is reserved to the implementation, so it is not something that you may define. Use another name for the guard.
Older compilers on more traditional (mainframe) platforms (we're talking mid-2000s here) did not have the optimisation described in other answers, so re-reading header files that had already been included really did significantly slow down preprocessing (bearing in mind that in a big, monolithic, enterprise-y project you're going to be including a LOT of header files). As an example, I've seen data that indicates a 26-fold speedup for a file with 256 header files, each including the same 256 header files, on the VisualAge C++ 6 for AIX compiler (which dates from the mid-2000s). This is a rather extreme example, but this sort of speed-up does add up.
However, all modern compilers, even on mainframe platforms such as AIX and Solaris, perform enough optimisation for header inclusion that the difference these days really is negligible. Therefore there is no good reason to have these any more.
This does, however, explain why some companies still hang on to the practice, because relatively recently (at least in C/C++ codebase age terms) it was still worthwhile for very large monolithic projects.
Although there are people arguing against it, in practice '#pragma once' works perfectly and the main compilers (gcc/g++, vc++) support it.
So whatever puristic argumentation people are spreading, it works a lot better:
Fast
No maintenance, no trouble with mysterious non-inclusion because you copied an old flag
Single line with obvious meaning versus cryptic lines spread in file
So simply put:
#pragma once
at the start of the file, and that's it. Optimized, maintainable, and ready to go.

How does #include <bits/stdc++.h> work in C++? [duplicate]

This question already has answers here:
Why should I not #include <bits/stdc++.h>?
(9 answers)
Closed 4 years ago.
I have read from a codeforces blog that if we add #include <bits/stdc++.h> in a C++ program then there is no need to include any other header files. How does #include <bits/stdc++.h> work and is it ok to use it instead of including individual header files?
It is basically a header file that also includes every standard library and STL include file. The only purpose I can see for it would be for testing and education.
See e.g. the GCC 4.8.0 <bits/stdc++.h> source.
Using it would include a lot of unnecessary stuff and increases compilation time.
Edit: As Neil says, it's an implementation for precompiled headers. If you set it up for precompilation correctly it could, in fact, speed up compilation time depending on your project. (https://gcc.gnu.org/onlinedocs/gcc/Precompiled-Headers.html)
I would, however, suggest that you take the time to learn about each of the standard library (and STL) headers and include them separately instead, and not use "super headers" except for precompilation purposes.
#include <bits/stdc++.h> is an implementation file for a precompiled header.
From a software engineering perspective, it is a good idea to minimize includes. If you use <bits/stdc++.h> it actually includes a lot of files, which your program may not need, thus increasing both compile time and program size unnecessarily. [edit: as pointed out by @Swordfish in the comments, the output program size remains unaffected. But still, it's good practice to include only the libraries you actually need, unless it's a competitive programming contest]
But in contests, using this file is a good idea, when you want to reduce the time wasted in doing chores; especially when your rank is time-sensitive.
It works in most programming contest environments, including ACM-ICPC (Sub-Regionals, Regionals, and World Finals) and many online judges.
The disadvantages of it are that it:
increases the compilation time.
uses an internal non-standard header file of the GNU C++ library, and so will not compile in MSVC, XCode, and many other compilers
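That said, a typical contest submission built on this header looks something like the following minimal sketch (it only builds with GCC/libstdc++ or another implementation that actually ships <bits/stdc++.h>):
#include <bits/stdc++.h>   // non-standard: pulls in every standard header at once
using namespace std;

int main() {
    int n;
    cin >> n;                  // streams, containers and algorithms are all available
    cout << n * 2 << '\n';     // without naming any individual standard header
    return 0;
}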
That header file is not part of the C++ standard, is therefore non-portable, and should be avoided.
Moreover, even if there were some catch-all header in the standard, you would want to avoid it in lieu of specific headers, since the compiler has to actually read in and parse every included header (including recursively included headers) every single time that translation unit is compiled.
Unfortunately that approach is not portable C++ (so far).
All standard names are in namespace std, and moreover you cannot know which names are NOT declared by including a given header (in other words, it's perfectly legal for an implementation to declare the name std::string directly or indirectly when you use #include <vector>).
Despite this however you are required by the language to know and tell the compiler which standard header includes which part of the standard library. This is a source of portability bugs because if you forget for example #include <map> but use std::map it's possible that the program compiles anyway silently and without warnings on a specific version of a specific compiler, and you may get errors only later when porting to another compiler or version.
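A minimal sketch of that trap (hypothetical code): the program below names both std::vector and std::map, so both headers must be spelled out; dropping the #include <map> might still happen to compile on one standard library and fail on another.
#include <map>       // required: std::map is named below
#include <vector>    // required: std::vector is named below

int main() {
    std::vector<int> values{1, 2, 2, 3};
    std::map<int, int> histogram;        // latent portability bug if <map> were omitted
    for (int v : values) ++histogram[v];
    return histogram[2] == 2 ? 0 : 1;
}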
In my opinion there are no valid technical excuses that explain why this is necessary for the general user: the compiler binary could have all standard namespace built in and this could actually increase the performance even more than precompiled headers (e.g. using perfect hashing for lookups, removing standard headers parsing or loading/demarshalling and so on).
The use of standard headers simplifies the life of who builds compilers or standard libraries and that's all. It's not something to help users.
However, this is the way the language is defined, and you need to know which header declares which names, so plan for some extra neurons to be burnt on pointless configurations to remember that (or try to find an IDE that automatically adds the standard headers you use and removes the ones you don't... a reasonable alternative).

Include directives in header file? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
where should “include” be put in C++
Obviously, there are two "schools of thought" as to whether to put #include directives into C++ header files (or, as an alternative, put #include only into cpp files). Some people say it's ok, others say it only causes problems. Does anybody know whether this discussion has reached a conclusion what is to be preferred?
I am not aware of any schools of thoughts concerning this. Put them in the header when they are needed there, otherwise forward declare and put them in the .cpp files that require them. There is no benefit in including headers where they are not needed.
What I found effective is following a few simple rules:
Headers shall be self-sufficient, i.e., they shall declare classes they need names for and include headers for any definition they use.
Headers should minimize dependencies as much as possible without violating the previous point.
Getting the first point right is fairly easy: include the header first thing from the source file implementing what it declares. Getting the second point exactly right isn't trivial, though, and I think it requires tool support. However, a few unnecessary dependencies generally aren't that bad.
As a rule of thumb, you don't include other headers in a header unless a full definition is necessary there. Most of the time you only deal with pointers to classes in a header file, so it's just fine to forward declare them there.
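A minimal sketch of that rule of thumb, with hypothetical names: the header only stores a pointer to Gadget, so a forward declaration is enough, and gadget.hpp is included only in the .cpp file (which also includes its own header first, keeping it self-contained):
// widget.hpp
#ifndef WIDGET_HPP
#define WIDGET_HPP

class Gadget;                    // forward declaration: a pointer only needs the name

class Widget {
public:
    void attach(Gadget* g);
private:
    Gadget* gadget_ = nullptr;
};

#endif // WIDGET_HPP

// widget.cpp
#include "widget.hpp"            // own header first: proves it is self-contained
#include "gadget.hpp"            // full definition needed only here

void Widget::attach(Gadget* g) { gadget_ = g; }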
I think the issue was settled a long time ago: headers should be self-contained (that is, they should not depend on the user having included other headers before -- that aspect has been settled for so long that some aren't even aware there was a debate on it, but your "put includes only in .cpp files" alternative seems to hint at it) but minimal (i.e. they should not include definitions when a declaration would be enough for self-containment).
The reason for self-containment is maintenance: should a header be modified and now depend on something new, you'd have to track down all the places it is used to include the new dependency. BTW, the standard trick to ensure self-containment is to include the header providing the declarations for things defined in a .cpp first in that .cpp.
These are not schools of thought so much as religions. In reality, both approaches have their advantages and disadvantages, and there are certain practices to be followed for either approach to be successful. But only one of these approaches will "scale" to large projects.
The advantage of not including headers inside headers is faster compilation. However, this advantage does not come from headers being read only once, because even if you include headers inside headers, smart compilers can work that out. The speed advantage comes from the fact that you include only those headers which are strictly necessary for a given source file. Another advantage is that if we look at a source file, we can see exactly what its dependencies are: the flat list of header files gives that to us plainly.
However, this practice is hard to maintain, especially in large projects with many programmers. It's quite an inconvenience when you want to use module foo, but you cannot just #include "foo.h": you need to include 35 other headers.
What ends up happening is this: programmers are not going to waste their time discovering the exact, minimal set of headers that they need just to add module foo. To save time, they will go to some example source file similar to the one they are working on, and cut and paste all of the #include directives. Then they will try compiling it, and if it doesn't build, then they will cut and paste more #include directives from yet elsewhere, and repeat that until it works.
The net result is that, little by little, you lose the advantage of faster compiling, because your files are now including unnecessary headers. Moreover, the list of #include directives no longer shows the true dependencies. Moreover, when you do incremental compiles now, you compile more than is necessary due to these false dependencies.
Once every source file includes nearly every header, you might as well have a big everything.h which includes all the headers, and then #include "everything.h" in every source file.
So this practice of including just specific headers is best left to small projects that are carefully maintained by a handful of developers who have plenty of time to maintain the ethic of minimal include dependencies by hand, or write tools to hunt down unnecessary #include directives.

Are unused includes harmful in C/C++?

What are the negative consequences of unused includes?
I'm aware they result in increased binary size (or do they?), anything else?
Increases compilation time (potentially serious issue)
Pollutes global namespace.
Potential clash of preprocessor names.
If unused headers are included from third-party libraries, it may force those libraries to be kept around as dependencies unnecessarily.
They don't necessarily increase binary size, but will increase compile time.
The main problem is clutter. These are the three main aspects in which the clutter manifests:
Visual pollution: noise to wade through while you are trying to figure out which other includes you actually need.
Logical pollution: it is more likely to have collisions of functions, and compilation takes more time (the cost might be really small for a couple of includes, but if not cleaning up unneeded includes becomes "policy", it can become a significant hurdle).
Dependency opacity: since there are more headers to analyze, it is harder to determine the dependency cycles in your code. Knowing what the dependencies in your code are is crucial when your codebase grows beyond the hobbyist level.
Generally speaking, yes, it does cause some problems. Logically speaking, if you don't need it then don't include it.
Any singletons declared as external in a header and defined in a source file will be included in your program. This obviously increases memory usage and possibly contributes to a performance overhead by causing one to access their page file more often (not much of a problem now, as singletons are usually small-to-medium in size and because most people I know have 6+ GB of RAM).
Compilation time is increased, and for large commercial projects where one compiles often, this can cause a loss of money. It might only add a few seconds on to your total time, but multiply that by the several hundred compiles or so you might need to test and debug and you've got a huge waste of time which thus translates into a loss in profit.
The more headers you have, the higher the chance that you may have a preprocessor collision with a macro you defined in your program or another header. This can be avoided via correct use of namespaces, but it's still such a hassle to find. Again, lost profit.
Contributes to code bloat (longer files and thus more to read) and can majorly increase the number of results you find in your IDE's auto complete tool (some people are religiously against these tools, but they do increase productivity to an extent).
You can accidentally link other external libraries into your program without even knowing it.
You may inadvertently cause the end of the world by doing this.
I'll assume the headers can all be considered as "sincere", that is, are not precisely written with the aim of sabotaging your code.
It will usually slow the compilation (pre-compiled headers will mitigate this point)
it implies dependencies where none really exist (this is a semantic error, not an actual compile error)
macros will pollute your code (mitigated by the prefixing of macros with namespace-like names, as in BOOST_FOREACH instead of FOREACH)
a header could imply a link to another library. In some cases, an unused header could ask the linker to link your code with an external library (see MSVC's #pragma comment(lib, "")). I believe a good linker would not keep the library's reference if it's not used (IIRC, MSVC's linker will not keep the reference to an unused library).
a removed header is one less source of unexpected bugs. If you don't trust the header (some coders are better than others...), then removing it removes a risk (you won't like including a header that changes the struct alignment of everything after it: the generated bugs are... illuminating...).
a header's static variable definitions will pollute your code. Each static variable defined in a header will result in a separate copy of that variable in every compiled source that includes it.
C symbol names will pollute your code. The declarations in the header will pollute your global or struct namespace (and more probably, both, as structs are usually typedef-ed to bring their type into the global namespace). This is mitigated by libraries prefixing their symbols with some kind of "namespace name", like SDL_CreateMutex for SDL.
non-namespaced C++ symbol names will pollute your code. For the same reasons above. The same goes for headers making wrong use of the using namespace statement. Now, correct C++ code will namespace its symbols. Yes, this means that you should usually not trust a C++ header declaring its symbols in the global namespace...
Whether or not they increase the binary size really depends on what's in them.
The main side-effect is probably the negative impact on compilation speed. Again, how big an impact depends on what's in them, how much and whether they include other headers.
Well, for one, leaving them there only prolongs the compile time and adds unnecessary compilation dependencies.
They represent clumsy design.
If you are not sure what to include and what not to include, it shows the developer had no clear idea of what he was doing.
Include files are meant to be included only when they are needed. It may not be that much of an issue these days, as computer memory and speed are growing by leaps and bounds, but it once was.
If an include is not strictly needed but included anyhow, I would recommend putting a comment next to it saying why you included it. A new developer who gets onto your code will have much appreciation for you if you have done it the right way.
An include means you are adding more declarations. So when you are writing your own global function, you need to be careful whether that function is already declared in an included header.
Ex. if you write your own class auto_ptr {} without including <memory>, it will work fine; but once you include <memory> (and have the std names in scope, e.g. via using namespace std), the compiler gives an error because auto_ptr has already been declared in the <memory> header.
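A minimal sketch of that collision, with the caveat that in C++ the standard names live in namespace std, so the clash only materializes with a using-directive (and, for auto_ptr specifically, a standard library mode that still declares std::auto_ptr, i.e. pre-C++17); the last line is expected to fail to compile:
#include <memory>        // declares std::auto_ptr in pre-C++17 modes
using namespace std;     // pulls the std names into unqualified lookup

template <typename T>
class auto_ptr { };      // our own template with the same name

int main() {
    auto_ptr<int> p;     // error: reference to 'auto_ptr' is ambiguous
    return 0;
}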
Yes, they can increase binary size because of extern unused variables.
//---- in unused includes ----
extern int /* or a big class */ unused_var;
//---- in third party library ----
int unused_var = 13;

Multiple inclusion of header files leads to longer compile time?

Does including the same header files multiple times increase the compilation time?
For example, suppose every file in my project uses <iostream> <string> <vector> and <algorithm>. And if I include a lot of files in my source code, then does that increase the compile time?
I always thought that header guards served the important purpose of avoiding double definitions, but as a by-product also eliminate duplicated code.
Actually, someone I know proposed some ideas to remove such multiple inclusions. However, I consider them to be completely against good design practice in C++. But I was still wondering what his reasons for suggesting the changes might be?
Most of these answers are wrong... For modern compilers, there is zero overhead for including the same file multiple times, assuming the header uses the usual "include guard" idiom.
The GCC preprocessor, for example, has special code to recognize the include guard idiom. It will not even open the header file (never mind reading it) for the second and subsequent #include directives.
I am not sure about other compilers, but I would be very surprised if most of them did not implement the same optimization.
Another technique besides precompiled headers is the compiler firewall idiom, explained here:
http://www.gotw.ca/publications/mill04.htm
http://www.gotw.ca/publications/mill05.htm
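A minimal sketch of that idiom with hypothetical names: the header exposes only an opaque pointer, so implementation headers stay out of it and clients do not recompile when the implementation changes.
// parser.hpp
#ifndef PARSER_HPP
#define PARSER_HPP
#include <memory>

class Parser {
public:
    Parser();
    ~Parser();                    // defined in parser.cpp, where Impl is complete
    void parse(const char* text);
private:
    struct Impl;                  // opaque: heavy includes live only in parser.cpp
    std::unique_ptr<Impl> impl_;
};

#endif // PARSER_HPP

// parser.cpp
#include "parser.hpp"
#include <string>                 // implementation-only dependency

struct Parser::Impl { std::string last_input; };
Parser::Parser() : impl_(std::make_unique<Impl>()) {}
Parser::~Parser() = default;
void Parser::parse(const char* text) { impl_->last_input = text; }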
Every time #include <something.h> occurs in your source file, something.h has to be found along the include path and read. But because of the #ifndef _SOMETHING_H_ check, the contents of something.h will not be compiled again.
Thus there is some overhead, but it is really small.
If compile times were an issue, people used to use the optimisation recommended by Praetorian, originally recommended in Large Scale C++ Software Design. However, most modern compilers automatically optimise for this case. For example, see the help from gcc.
The best is to use precompiled headers. I do not know which compiler you are using, but most of them have this feature. I suggest you refer to your compiler's manual on how to achieve this.
It basically collects a set of header files and compiles them once into a precompiled form that the compiler can then reuse for every translation unit. That speeds up compiling very much.
Minor Drawback:
You need to have one "uberheader" which is included in every compilation unit (.cpp).
In that uberheader, only include stable headers from libraries, not your own. Then the compiler does not need to recompile it very often.
It helps especially when using header-only libraries such as Boost, glm, Eigen, etc.
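A minimal sketch of such an uberheader, assuming GCC (the file name and contents are hypothetical; other compilers have their own precompilation mechanisms):
// pch.hpp -- only stable standard/third-party headers, never your own code
#include <vector>
#include <string>
#include <map>
#include <algorithm>

// Precompile it once:   g++ -std=c++17 -x c++-header pch.hpp
// This writes pch.hpp.gch next to it; afterwards every translation unit that
// does `#include "pch.hpp"` picks up the precompiled version automatically
// (as long as it is built with the same flags).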
HTH
Yes, including the same header multiple times means that the file needs to be opened before the preprocessor guards kick in and prevent multiple definitions. The Mozilla source code uses the following trick to prevent this:
Foo.h
#ifndef FOO_H
#define FOO_H
// whatever
#endif /* FOO_H */
In all files that need to include foo.h
#ifndef FOO_H
#include "foo.h"
#endif
This prevents foo.h from having to be opened multiple times. Of course, this depends on everyone following a particular naming convention for their preprocessor guards.
You can't do this with standard headers, since there is no common naming convention for their preprocessor guards.
EDIT:
After reading your question again, I think you're asking about the same header being included in different source files. What I talked about above does not help with that. Each header file will still have to be opened and included at least once in every translation unit. The only way I know of to prevent this is to use precompiled headers, as @scorcher24 mentioned in his answer. But I'd stay away from this solution unless the compile times are absolutely prohibitive, because there is no standard way of generating precompiled headers across compilers.
Some compilers, most notably Microsoft's, have a #pragma once directive that you can use to automatically skip an include file once it's already been included. This removes any performance penalty.
http://en.wikipedia.org/wiki/Pragma_once
It can be an issue. As others have said, most modern compilers handle the case intelligently, and will only re-open the file in degenerate cases. Most is not all, however, and one of the major exceptions is Microsoft, which a lot of people do have to support. The surest solution (if this is really a problem in your environment) is to use the Lakos convention, putting the include guards around the #include as well as in the header. This means, of course, a standard convention for generating the guard names. (For external includes, wrap them in your own header, which respects your local convention.)
Alternatively, you can use both the guards and #pragma once. The guards will always work, and most compilers will avoid the extra opens, and #pragma once will usually avoid the extra opens with Microsoft. (#pragma once cannot be implemented reliably in complex networked situations, but as long as all of your files are on your local drive, it's quite reliable.)
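A minimal sketch of that belt-and-braces combination (hypothetical names): classic guards so the header always works everywhere, plus #pragma once so compilers that key on it can skip reopening the file.
// project/foo.hpp
#ifndef PROJECT_FOO_HPP
#define PROJECT_FOO_HPP
#pragma once

// ... declarations ...

#endif // PROJECT_FOO_HPP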