Are unused includes harmful in C/C++? - c++

What are the negative consequences of unused includes?
I'm aware they result in increased binary size (or do they?), anything else?

Increases compilation time (potentially serious issue)
Pollutes global namespace.
Potential clash of preprocessor names.
If unused headers are included from third-party libraries, those libraries have to be kept around and maintained as dependencies for no reason.

They don't necessarily increase binary size, but will increase compile time.

The main problem is clutter. These are the three main aspects in which the clutter manifests:
Visual pollution: extra noise while you are trying to figure out which includes you actually need.
Logical pollution: collisions between functions become more likely, and compilation takes longer (the cost of a couple of includes is tiny, but if not cleaning up unneeded includes becomes "policy", it can grow into a significant hurdle).
Dependency opacity: with more headers to analyze, it is harder to spot the dependency cycles in your code. Knowing your code's dependencies is crucial once your codebase grows beyond hobbyist scale.

Generally speaking, yes, it does cause some problems. Logically speaking, if you don't need it then don't include it.
Any singletons declared as external in a header and defined in a source file will be included in your program. This obviously increases memory usage and possibly contributes to a performance overhead by causing one to access their page file more often (not much of a problem now, as singletons are usually small-to-medium in size and because most people I know have 6+ GB of RAM).
Compilation time is increased, and for large commercial projects where one compiles often, this can cause a loss of money. It might only add a few seconds on to your total time, but multiply that by the several hundred compiles or so you might need to test and debug and you've got a huge waste of time which thus translates into a loss in profit.
The more headers you have, the higher the chance of a preprocessor collision with a macro you defined in your program or in another header (see the sketch after this list). This can be avoided via correct use of namespaces, but it's still such a hassle to find. Again, lost profit.
Contributes to code bloat (longer files and thus more to read) and can majorly increase the number of results you find in your IDE's auto complete tool (some people are religiously against these tools, but they do increase productivity to an extent).
You can accidentally link other external libraries into your program without even knowing it.
You may inadvertently cause the end of the world by doing this.
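A hypothetical sketch of such a preprocessor collision (all names invented here): an unused header's macro silently rewrites an unrelated identifier in your code.
// ---- legacy_config.hpp, included but never needed ----
#define BUFFER_SIZE 512
// ---- your code ----
const int BUFFER_SIZE = 1024; // error: expands to 'const int 512 = 1024;'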

I'll assume the headers can all be considered as "sincere", that is, are not precisely written with the aim of sabotaging your code.
It will usually slow the compilation (pre-compiled headers will mitigate this point)
It implies dependencies where none really exist (a semantic error, not a compiler-diagnosed one).
Macros will pollute your code (mitigated by prefixing macros with namespace-like names, as in BOOST_FOREACH instead of FOREACH).
A header could imply a link to another library. In some cases, an unused header can ask the linker to link your code with an external library (see MSVC's #pragma comment(lib, "")). I believe a good linker would not keep the library's reference if it's not used (IIRC, MSVC's linker will not keep the reference to an unused library).
A removed header is one less source of unexpected bugs. If you don't trust the header (some coders are better than others...), then removing it removes a risk (you won't like including a header that changes the struct alignment of everything after it: the resulting bugs are... illuminating...).
A header's static variable declarations will pollute your code. Each static variable declared in a header results in a separate global variable in every compiled source that includes it.
C symbol names will pollute your code. The declarations in the header will pollute your global or struct namespace (and more probably both, as structs are usually typedef-ed to bring their type into the global namespace). This is mitigated by libraries prefixing their symbols with some kind of "namespace name", like SDL_CreateMutex for SDL (see the sketch after this list).
Non-namespaced C++ symbol names will pollute your code, for the same reasons as above. The same goes for headers making wrong use of the using namespace statement. Now, correct C++ code will namespace its symbols. Yes, this means that you should usually not trust a C++ header declaring its symbols in the global namespace...
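A minimal sketch of the C prefixing convention mentioned above, with invented names (SDL really does prefix everything with SDL_):
/* mylib.h -- C has no namespaces, so every symbol carries a prefix */
typedef struct mylib_context mylib_context; /* struct tag prefixed too */
mylib_context *mylib_create(void);
void mylib_destroy(mylib_context *ctx);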

Whether or not they increase the binary size really depends on what's in them.
The main side-effect is probably the negative impact on compilation speed. Again, how big an impact depends on what's in them, how much and whether they include other headers.

Well, for one, leaving them there only prolongs the compile time and adds unnecessary compilation dependencies.

They represent clumsy design.
Not knowing what to include and what not to include shows the developer had no clear idea what he was doing.
Include files are meant to be included only when they are needed. It may not be much of an issue now that computer memory and speed are growing by leaps and bounds, but it once was.
If an include is not strictly needed but included anyhow, I would recommend putting a comment next to it saying why. A new developer coming onto your code will appreciate that you did it the right way.

An include means you are adding more declarations. So when you are writing your own global function, you need to be careful about whether that function is already declared in an included header.
For example, if you write your own class auto_ptr{} without including <memory>, it will work fine; but once you include <memory> (and bring std into scope), the compiler reports an error, because auto_ptr is already declared in that header.
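A minimal sketch of that collision, assuming a pre-C++17 compiler (std::auto_ptr was removed in C++17) and a using directive that pulls std into the global scope:
#include <memory>    // declares std::auto_ptr (until C++17)
using namespace std; // makes unqualified auto_ptr visible

template <class T>
struct auto_ptr { T *p; }; // our own ::auto_ptr, fine on its own

int main() {
    auto_ptr<int> x; // error: reference to 'auto_ptr' is ambiguous
    (void)x;
    return 0;
}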

Yes, they can increase binary size because of unused extern variables.
//---- in an unused header ----
extern int /* or a big class */ unused_var;
//---- in the third-party library that defines it ----
int unused_var = 13;

Related

The use of double include guards in C++

So I recently had a discussion where I work, in which I was questioning the use of a double include guard over a single guard. What I mean by double guard is as follows:
Header file, "header_a.hpp":
#ifndef __HEADER_A_HPP__
#define __HEADER_A_HPP__
...
...
#endif
When including the header file anywhere, either in a header or source file:
#ifndef __HEADER_A_HPP__
#include "header_a.hpp"
#endif
Now I understand that the use of the guard in header files is to prevent multiple inclusion of an already defined header file, it's common and well documented. If the macro is already defined, the entire header file is seen as 'blank' by the compiler and the double inclusion is prevented. Simple enough.
The issue I don't understand is using #ifndef __HEADER_A_HPP__ and #endif around the #include "header_a.hpp". I'm told by the coworker that this adds a second layer of protection to inclusions but I fail to see how that second layer is even useful if the first layer absolutely does the job (or does it?).
The only benefit I can come up with is that it outright stops the linker from bothering to find the file. Is this meant to improve compilation time (which was not mentioned as a benefit), or is there something else at work here that I am not seeing?
I am pretty sure that it is a bad practice to add another include guard like:
#ifndef __HEADER_A_HPP__
#include "header_a.hpp"
#endif
Here are some reasons why:
To avoid double inclusion it is enough to add a usual include guard inside the header file itself. It does the job well. Another include guard at the place of inclusion just messes up the code and reduces readability.
It adds unnecessary dependencies. If you change the include guard inside the header file, you have to change it in all places where the header is included.
Opening an already-guarded header is definitely not the most expensive operation in the whole compilation/linkage process, so the redundant guard can hardly reduce the total build time.
Any compiler worth anything already optimizes file-wide include-guards.
The reason for putting include guards in the header file is to prevent the contents of the header from being pulled into a translation unit more than once. That's normal, long-established practice.
The reason for putting redundant include guards in a source file is to avoid having to open the header file that's being included, and back in the olden days that could significantly speed up compilation. These days, opening a file is much faster than it used to be; further, compilers are pretty smart about remembering which files they've already seen, and they understand the include guard idiom, so can figure out on their own that they don't need to open the file again. That's a bit of hand-waving, but the bottom line is that this extra layer isn't needed any more.
EDIT: another factor here is that compiling C++ is far more complicated than compiling C, so it takes far longer, making the time spent opening include files a smaller, less significant part of the time it takes to compile a translation unit.
The only benefit I can come up with is that it outright stops the linker from bothering to find the file.
The linker will not be affected in any way.
It could prevent the pre-processor from bothering to find the file, but if the guard is defined, that means that it has already found the file. I suspect that if the preprocessing time is reduced at all, the effect would be quite minimal except in the most pathologically recursively included monstrosity.
It has a downside that if the guard is ever changed (for example due to conflict with another guard), all the conditionals before the include directives must be changed in order for them to work. And if something else uses the previous guard, then the conditionals must be changed for the include directive itself to work correctly.
P.S. __HEADER_A_HPP__ is a symbol that is reserved to the implementation, so it is not something that you may define. Use another name for the guard.
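For example, a conforming variant of the same guard (no leading underscore, no adjacent double underscores):
#ifndef HEADER_A_HPP
#define HEADER_A_HPP
...
#endif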
Older compilers on more traditional (mainframe) platforms (we're talking mid-2000s here) lacked the optimisation described in other answers, so having to re-read header files that had already been included really did significantly slow down preprocessing (bearing in mind that in a big, monolithic, enterprise-y project you're going to be including a LOT of header files). As an example, I've seen data indicating a 26-fold speedup for a file with 256 header files, each including the same 256 header files, on the VisualAge C++ 6 for AIX compiler (which dates from the mid-2000s). This is a rather extreme example, but this sort of speed-up adds up.
However, all modern compilers, even on mainframe platforms such as AIX and Solaris, perform enough optimisation for header inclusion that the difference these days really is negligible. Therefore there is no good reason to have these any more.
This does, however, explain why some companies still hang on to the practice, because relatively recently (at least in C/C++ codebase age terms) it was still worthwhile for very large monolithic projects.
Although there are people arguing against it, in practice '#pragma once' works perfectly and the main compilers (gcc/g++, vc++) support it.
So whatever purist argumentation people are spreading, it works a lot better:
Fast
No maintenance, no trouble with mysterious non-inclusion because you copied an old flag
Single line with obvious meaning versus cryptic lines spread in file
So simply put:
#pragma once
at the start of the file, and that's it. Optimized, maintainable, and ready to go.

Does storing code in header files cause memory management issues in C++?

Not so long ago one man told me that storing all the code in .h files produces some memory management issues, because I get too many duplicates of one class, and that I'll avoid this issue if I store the code in .h/.cpp files. Is that true?
I've already googled some info on this subject and I read that it's all just a matter of habit. So what about memory problems in the case of storing all the code in .h files?
... one man told me that storing all the code in .h files produces some memory management issues. [...] And I'll avoid this issue if I store the code in .h/.cpp files. Is that true?
Memory management usually refers to handling of dynamic memory at runtime. For clarity: writing all your code in headers has nothing to do with that. However, doing so may indeed increase the amount of memory that the compiler uses. So, if that's what the one man meant, then yes, it could potentially be true - but it's not the biggest problem with the approach.
Because I get too many duplicates of one class
This is a silly argument. In fact, using only header files for definitions means that there is exactly one translation unit where the class definition is included exactly once.
The potential problem is that your single translation unit includes all definitions in all header files. Since everything is processed in one go, the maximum memory required by the compiler is potentially higher. It's probably insignificant for most projects, but for something like LibreOffice, it could become a problem.
A bigger problem with defining all functions in headers - i.e. using a single translation unit - is that any change to any header, however small, will cause the single massive translation unit to change, and you'll be required to compile it again. With multiple, smaller, translation units, only ones that are affected by the change will need recompilation.
So, the problem with your approach will be that every recompilation is as slow as compiling from scratch. Sure, if the compilation takes a minute, it doesn't matter, but for projects that take hours to compile from scratch, this is essential.
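A minimal sketch of the conventional split, with invented names; editing widget.cpp recompiles one translation unit and leaves every includer of widget.hpp alone:
// widget.hpp -- interface only
#ifndef WIDGET_HPP
#define WIDGET_HPP
class Widget {
public:
    int value() const; // declared here...
};
#endif

// widget.cpp -- its own translation unit
#include "widget.hpp"
int Widget::value() const { return 42; } // ...defined here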
Your "one man" doesn't know what he's talking about.
Use of header files can only (potentially) cause the compiler (or programs that implement phases of compilation) to consume more memory during compilation. That depends on how the compiler is implemented, though, and a quality compiler will deal with it. If some compilation unit has enough code to cause a compiler to run out of memory, then that's another concern... and one largely unrelated to the use of header files.
If you write code that causes memory management concerns in your program, putting that code into header files makes no difference - bad code will cause a program to have memory management concerns (leaks, using dangling pointers, etc) regardless of whether parts of that code is in header files.
If header files are used carelessly, there are some situations where multiple object files can each contain a definition of some functions. Technically, this can increase memory usage of a program, but there are also coding techniques to mitigate that - albeit some details of the techniques may differ a little if using header files. In other words, it is writing poor code that makes the difference, not (on its own) putting code into header files.
There are other concerns with use of header files (or the preprocessor more generally). But there are benefits to using header files, and techniques to manage such concerns. And those concerns are practically unrelated to memory management issues anyway.
Practically, it is usually better to split code sensibly between header files and compilation units (aka .h and .cpp files). But there are also cases where putting a library's function definitions in header files is beneficial (e.g. template libraries).
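For instance, a minimal sketch of the header-only template case (names invented): the definition must be visible to every translation unit that instantiates it, so it lives in the header:
#ifndef CLAMP_HPP
#define CLAMP_HPP
// defined in the header on purpose: each user instantiates it as needed
template <typename T>
T clamp_to(T value, T lo, T hi) {
    return value < lo ? lo : (hi < value ? hi : value);
}
#endif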

Are there any performance implications to including every header?

Let's say I want to use the hex() function. I know it is defined in the <ios> header and I also know that it is included in the <iostream> header. The difference is that <iostream> contains many more functions and other stuff I don't need.
From a performance stand-point, should I care about including/defining less functions, classes etc. than more?
There is no run time performance hit.
However, there could be excessive compile time hit if tons of unnecessary headers are included.
Also, when this is done, you can create unnecessary recompiles if, for instance, a header is changed but a file that doesn't use it includes it.
In small projects (with small headers included), this doesn't matter. As a project grows, it may.
If the standard says it is defined in header <ios> then include header <ios> because you can't guarantee it will be included in/through any other header.
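Following that advice for the example in the question (a sketch; <iostream> is still needed here, but for std::cout itself):
#include <ios>      // the standard's guaranteed home of std::hex
#include <iostream> // for std::cout

int main() {
    std::cout << std::hex << 255 << '\n'; // prints "ff"
}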
TL;DR: In general, it is better to only include what you need. Including more can have an adverse effect on binary size and startup (should be insignificant), but mostly hurts compilation-time without precompiled headers.
Well, naturally you have to include at least those headers together guaranteed to cover all your uses.
It might sometimes happen to "work" anyway, because the standard C++ headers are all allowed to include each other as the implementer wants, and headers are allowed to introduce additional symbols into the std namespace anyway (see Why is "using namespace std" considered bad practice?).
Next, sometimes including an additional header might lead to creation of additional objects (see std::ios_base::Init), though a well-designed library minimizes such (that is the only instance in the standard library, as far as I know).
But the big issue isn't actually the size and efficiency of the compiled (and optimized) binary (which should be unaffected, aside from the previous point, whose effect should be minuscule), but compilation time while actively developing (see also How does #include <bits/stdc++.h> work in C++?).
And the latter is adversely affected by adding superfluous headers (severely, so much that the committee is working on a modules proposal; see C++ Modules - why were they removed from C++0x? Will they be back later on?).
Unless, naturally, you are using precompiled-headers (see Why use Precompiled Headers (C/C++)?), in which case including more in the precompiled headers and thus everywhere instead of only where needed, as long as those headers are not modified, will actually reduce compile-times most of the time.
There is a clang-based tool for finding out the minimum headers, called include-what-you-use.
It analyzes the clang AST to decide that, which is both a strength and a weakness:
You don't need to teach it about all the symbols a header makes available, but it also doesn't know whether things just worked out that way in that revision, or whether they are contractual.
So you need to double-check its results.
Including unnecessary headers has the following downsides.
Longer compile time, and the linker has to weed out all the unused symbols.
If you have added extra headers in a .cpp file, they affect only your code.
But if you are distributing your code as a library and have added unnecessary headers in your header files, client code will be burdened with locating the headers that you used.
Do not trust indirect inclusion; include the header in which the required function is actually declared.
Also, as good programming practice in a project, headers should be included in order of decreasing dependency:
//local header -- most dependent on other headers
#include <project/impl.hpp>
//Third party library headers -- moderately dependent on other headers
#include <boost/optional.hpp>
//standard C++ header -- least dependent on other header
#include <string>
And the thing that won't be affected is run time: the linker gets rid of unused symbols at link time.
Including unneeded header files has some value.
It takes less coding effort to paste in the usual set of includes. Of course, later coding is now encumbered with not knowing what was truly needed.
Especially in C, with its limited name space control, including unneeded headers promptly detects collisions. Say code defined a global non-static variable or function that happened to match the standard, like erfc() to do some text processing. By including <math.h>, the collision is detected with double erfc(double x), even though this .c file does no FP math yet other .c files do.
#include <math.h>
char *erfc(char *a, char *b);
OTOH, had this .c file not included <math.h>, the collision would only be detected at link time. The impact of this delayed notice could be great if the code base for years did not need FP math and now does, only to detect char *erfc(char *a, char *b) used in many places.
IMO: Make a reasonable effort to not include unneeded header files, but do not worry about including a few extra, especially if they are common ones. If an automated method exist, use it to control header file inclusion.

Are unnecessary include files an overhead?

I have seen a couple of questions on how to detect unnecessary #include files in a C++ project. This question has often intrigued me, but I have never found a satisfactory answer.
If some header files are included but not used in a C++ project, is that an overhead? I understand that it means that before compilation the contents of all the header files are copied into the including source files, which results in a lot of unnecessary compilation.
How far does this kind of overhead spread to the compiled object files and binaries?
Aren't compilers able to do some optimizations to make sure that this kind of overhead is not transferred to the resulting object files and binaries?
Considering the fact, that I probably know nothing about compiler optimization, I still want to ask this, in case there is an answer.
As a programmer who uses a wide variety of C++ libraries for his work, what kind of programming practices should I follow to keep avoiding such overheads? Is making myself intimately familiar with each library's workings the only way out?
It does not affect the performance of the binary or even the contents of the binary file, for almost all headers. Declarations generate no code at all, inline/static/anonymous-namespace definitions are optimized away if they aren't used, and no header should include externally visible definitions (that breaks if the header is included by more than one translation unit).
As #T.C. points out, the exception are internally visible static objects with nontrivial constructors. iostream does this, for example. The program must behave as if the constructor is called, and the compiler usually doesn't have enough information to optimize the constructor away.
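A sketch of that pattern with invented names; <iostream> does essentially this with std::ios_base::Init:
// tracker.hpp -- every including translation unit gains one of these
struct TrackerInit {
    TrackerInit();  // nontrivial: the compiler must assume side effects
    ~TrackerInit();
};
static TrackerInit tracker_init; // constructed even if never referenced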
It does, however, affect how long compilation takes and how many files will be recompiled when a header is changed. For large projects, this is enough incentive to care about unnecessary includes.
Besides the obviously longer compile times, there might be other issues. The most important one, IMHO, is dependencies on external libraries. You don't want your program to depend on more libraries than necessary.
You also then need to install those libraries on every system you want the program to build on. This can become a nightmare, especially when the next programmer needs to install some database client library although the program never uses a database.
Also, library headers especially often tend to define macros. Sometimes those macros have very generic names which will break your code or be incompatible with other library headers you might actually need.
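A classic real-world instance of that macro problem, assuming a Windows target: <windows.h> defines min and max as function-like macros unless NOMINMAX is defined first, which breaks perfectly valid standard calls.
#include <algorithm>
#include <windows.h> // without #define NOMINMAX beforehand

int smaller = std::min(1, 2); // fails to compile: 'min(1, 2)' is macro-expanded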
Of course any #include is an overhead. The compiler needs to parse that file.
So avoid them. Use forward declarations wherever possible.
It will speed up compilation. See Scott Meyers' book on the subject.
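A minimal sketch of the forward-declaration technique (names invented): the declaration replaces the include as long as only pointers, references, or function signatures are involved.
// window.hpp
class Widget; // forward declaration -- no #include "widget.hpp" needed

class Window {
    Widget *child;         // pointer to an incomplete type is fine
    void attach(Widget &); // so is a reference parameter
};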
The simple answer is YES, it's an overhead as far as compilation is concerned, but at runtime it is hardly going to make any difference. For example, say you add #include <iostream> and assume you don't use any of its functions; g++ 4.5.2 then has some additional 18,560 lines of code to process during compilation. But as far as runtime overhead is concerned, I hardly think it creates a performance issue.
You can also refer to Are unused includes harmful in C/C++?, where I really liked this point made by David Young:
Any singletons declared as external in a header and defined in a source file will be included in your program. This obviously increases memory usage and possibly contributes to a performance overhead by causing one to access their page file more often (not much of a problem now, as singletons are usually small-to-medium in size and because most people I know have 6+ GB of RAM).

Does a file structure that is mostly Header files slow down anything besides compilation?

Does a file structure that is mostly header files (90% of your code being header-only) slow down anything besides compilation?
Some people argue that it could cause most code to be inlined when optimizing for speed, and so the processor would gather wrong statistics about instruction calls, or something like that. Has it been shown anywhere that this or something similar happens and slows the application down?
This is possibly a duplicate of Benefits of inline functions in C++?
The practical performance implication depends on many factors. I would not concern myself with it until you actually have a performance problem, in which case I'm sure bigger gains can be obtained by optimizing other things.
Don't keep all your code in headers - if you continue with this trend you will hate yourself later because you will be waiting for your compiler most of the time. LTO is a better approach if you are looking for similar optimizations, and has less of an impact on compile time.
Linking is a concern.
If your libraries are header dominant, then larger intermediate object files may need to be written and then read back. The linker will then have more symbols to analyze and deduplicate, and some symbols will remain as legal duplicates. This increases your I/O, bloats your binary size, and throws a lot more work at the linker.
One benefit of header dominance is that there tends to be fewer sources to compile and consequently fewer images/objects to link. So header only also has the potential to be faster in this regard (if used correctly).
If your library is going to be visible to many translations, then size and impact on linking should also be an important consideration.
Not performance but a potential bug concern:
From usage guidelines: in C++, member functions defined in the class definition body are implicitly inline. If such a member function has static local variables, inlining gone wrong can leave each inlined instance with its own copy of the static variable. This would lead to bugs.
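A sketch of the pattern that guideline warns about (names invented); the standard requires a single shared instance of the static local, but ODR violations, e.g. mismatched definitions across shared-library boundaries, can silently give each module its own copy:
struct Counter {
    // defined in the class body, hence implicitly inline
    int next() {
        static int n = 0; // intended to be one shared counter
        return ++n;
    }
};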