Include directives in header file? [duplicate] - c++

Possible Duplicate: where should “include” be put in C++
Obviously, there are two "schools of thought" as to whether to put #include directives into C++ header files (or, as an alternative, to put #include only into .cpp files). Some people say it's fine, others say it only causes problems. Has this discussion reached a conclusion about which is to be preferred?

I am not aware of any schools of thought concerning this. Put includes in the header when they are needed there; otherwise, forward declare and put them in the .cpp files that require them. There is no benefit in including headers where they are not needed.

What I found effective is following a few simple rules:
Headers shall be self-sufficient, i.e., they shall forward declare the classes they need only names for and include headers for any definition they use.
Headers should minimize dependencies as much as possible without violating the previous point.
Getting the first point right is fairly easy: include the header as the first thing from the source file implementing what it declares. Getting the second point exactly right isn't trivial, though, and I think it requires tool support to get exactly right. However, a few unnecessary dependencies generally aren't that bad.
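A minimal sketch of what these rules produce (file and class names are illustrative): the header includes <string> because a std::string member needs the full definition, but only forward declares Renderer because it is used solely through a pointer.
// widget.h -- self-sufficient, but minimal
#ifndef WIDGET_H
#define WIDGET_H

#include <string>  // needed: std::string is used by value as a member

class Renderer;    // forward declaration suffices: only used via pointer

class Widget {
public:
    explicit Widget(std::string name);
    void draw(Renderer* r) const;  // pointer parameter, no definition needed
private:
    std::string name_;
};

#endif // WIDGET_H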

As a rule of thumb, you don't include a header from another header unless a full definition is necessary there. Most of the time you only deal with pointers (or references) to classes in a header file, so it's just fine to forward declare them there.

I think the issue was settled a long time ago: headers should be self-contained (that is, they should not depend on the user having included other headers beforehand; this aspect has been settled for so long that some aren't even aware there was ever a debate on it, but your "put includes only in .cpp" alternative seems to hint at it) but minimal (i.e., they should not include definitions when a declaration would be enough for self-containment).
The reason for self-containment is maintenance: should a header be modified to depend on something new, you'd otherwise have to track down every place it is used and add the new dependency there. BTW, the standard trick to ensure self-containment is to have the .cpp include first the header that declares the things the .cpp defines.
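For example, the implementing source file for the widget header sketched above would begin like this (again illustrative; the point is only the include order):
// widget.cpp
#include "widget.h"    // own header first: if widget.h is not self-contained,
                       // this file fails to compile and exposes the problem
#include "renderer.h"  // full definition needed to call Renderer members

void Widget::draw(Renderer* r) const {
    r->drawLabel(name_);  // drawLabel is a hypothetical Renderer member
}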

These are not schools of thought so much as religions. In reality, both approaches have their advantages and disadvantages, and there are certain practices to be followed for either approach to be successful. But only one of these approaches will "scale" to large projects.
The advantage of not including headers inside headers is faster compilation. However, this advantage does not come from headers being read only once, because even if you include headers inside headers, smart compilers can work that out. The speed advantage comes from the fact that you include only those headers which are strictly necessary for a given source file. Another advantage is that if we look at a source file, we can see exactly what its dependencies are: the flat list of header files gives that to us plainly.
However, this practice is hard to maintain, especially in large projects with many programmers. It's quite an inconvenience when you want to use module foo, but you cannot just #include "foo.h": you need to include 35 other headers.
What ends up happening is this: programmers are not going to waste their time discovering the exact, minimal set of headers that they need just to add module foo. To save time, they will go to some example source file similar to the one they are working on, and cut and paste all of the #include directives. Then they will try compiling it, and if it doesn't build, then they will cut and paste more #include directives from yet elsewhere, and repeat that until it works.
The net result is that, little by little, you lose the advantage of faster compiling, because your files now include unnecessary headers. Moreover, the list of #include directives no longer shows the true dependencies. And when you do incremental compiles, you compile more than is necessary because of these false dependencies.
Once every source file includes nearly every header, you might as well have a big everything.h which includes all the headers, and then #include "everything.h" in every source file.
So this practice of including just specific headers is best left to small projects that are carefully maintained by a handful of developers who have plenty of time to maintain the ethic of minimal include dependencies by hand, or write tools to hunt down unnecessary #include directives.

Related

The use of double include guards in C++

So I recently had a discussion where I work, in which I was questioning the use of a double include guard over a single guard. What I mean by double guard is as follows:
Header file, "header_a.hpp":
#ifndef __HEADER_A_HPP__
#define __HEADER_A_HPP__
...
...
#endif
When including the header file anywhere, either in a header or source file:
#ifndef __HEADER_A_HPP__
#include "header_a.hpp"
#endif
Now I understand that the use of the guard in header files is to prevent multiple inclusion of an already included header file; it's common and well documented. If the macro is already defined, the entire header file is seen as 'blank' by the compiler, and double inclusion is prevented. Simple enough.
The issue I don't understand is using #ifndef __HEADER_A_HPP__ and #endif around the #include "header_a.hpp". I'm told by the coworker that this adds a second layer of protection to inclusions but I fail to see how that second layer is even useful if the first layer absolutely does the job (or does it?).
The only benefit I can come up with is that it outright stops the linker from bothering to find the file. Is this meant to improve compilation time (which was not mentioned as a benefit), or is there something else at work here that I am not seeing?
I am pretty sure that it is a bad practice to add another include guard like:
#ifndef __HEADER_A_HPP__
#include "header_a.hpp"
#endif
Here are some reasons why:
To avoid double inclusion, it is enough to add a usual include guard inside the header file itself. It does the job well. Another include guard at the place of inclusion just clutters the code and reduces readability.
It adds unnecessary dependencies. If you change the include guard inside the header file, you have to change it in every place where the header is included.
Opening a header is nowhere near the most expensive operation in the whole compilation/linking process, so the redundant guard can hardly reduce the total build time.
Any compiler worth anything already optimizes file-wide include-guards.
The reason for putting include guards in the header file is to prevent the contents of the header from being pulled into a translation unit more than once. That's normal, long-established practice.
The reason for putting redundant include guards in a source file is to avoid having to open the header file that's being included, and back in the olden days that could significantly speed up compilation. These days, opening a file is much faster than it used to be; further, compilers are pretty smart about remembering which files they've already seen, and they understand the include guard idiom, so can figure out on their own that they don't need to open the file again. That's a bit of hand-waving, but the bottom line is that this extra layer isn't needed any more.
EDIT: another factor here is that compiling C++ is far more complicated than compiling C, so it takes far longer, making the time spent opening include files a smaller, less significant part of the time it takes to compile a translation unit.
The only benefit I can come up with is that it outright stops the linker from bothering to find the file.
The linker will not be affected in any way.
It could prevent the pre-processor from bothering to find the file, but if the guard is defined, that means that it has already found the file. I suspect that if the pre-process time is reduced at all, the effect would be quite minimal except in the most pathologically recursively included monstrosity.
It has a downside that if the guard is ever changed (for example due to conflict with another guard), all the conditionals before the include directives must be changed in order for them to work. And if something else uses the previous guard, then the conditionals must be changed for the include directive itself to work correctly.
P.S. __HEADER_A_HPP__ is a symbol that is reserved to the implementation, so it is not something that you may define. Use another name for the guard.
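For instance, a guard without leading or doubled underscores stays out of the reserved namespace:
// header_a.hpp
#ifndef HEADER_A_HPP   // fine: no leading underscore, no double underscore
#define HEADER_A_HPP

// ... declarations ...

#endif // HEADER_A_HPP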
Older compilers on more traditional (mainframe) platforms (we're talking mid-2000s here) did not have the optimisation described in other answers, so re-reading header files that had already been included really did slow preprocessing down significantly (bearing in mind that in a big, monolithic, enterprise-y project you're going to be including a LOT of header files). As an example, I've seen data that indicates a 26-fold speedup for a file with 256 header files, each including the same 256 header files, on the VisualAge C++ 6 for AIX compiler (which dates from the mid-2000s). This is a rather extreme example, but this sort of speed-up does add up.
However, all modern compilers, even on mainframe platforms such as AIX and Solaris, perform enough optimisation for header inclusion that the difference these days really is negligible. Therefore there is no good reason to have these any more.
This does, however, explain why some companies still hang on to the practice, because relatively recently (at least in C/C++ codebase age terms) it was still worthwhile for very large monolithic projects.
Although there are people who argue against it, in practice #pragma once works perfectly, and the main compilers (gcc/g++, VC++) support it.
So whatever purist arguments people are spreading, it works a lot better:
Fast
No maintenance, no trouble with mysterious non-inclusion because you copied an old flag
Single line with obvious meaning versus cryptic lines spread in file
So simply put:
#pragma once
at the start of the file, and that's it. Optimized, maintainable, and ready to go.

Is there a tool to help reduce #include statements [duplicate]

Are there any tools that help organizing the #includes that belong at the top of a .c or .h file?
I was just wondering because I am reorganizing my code, moving various small function definitions/declarations from one long file into different smaller files. Now each of the smaller files needs a subset of the #includes that were at the top of the long file.
It's just annoying and error-prone to figure out all #includes by hand. Often the code compiles even though not all #includes are there. Example: File A uses std::vector extensively but doesn't include vector; but it currently includes some obscure other header which happens to include vector (maybe through some recursive includes).
VisualAssistX can help you jump to the definition of a type. E.g. if you use a class MyClass in your source, you can click it, choose Goto Definition, and VisualAssistX opens the include file that contains the definition of this class (possibly Visual Studio can also do this, but at this point I'm so used to VisualAssistX that I attribute every wonderful feature to it :-)). You can use this to find the include file necessary for your source code.
PC-Lint can do exactly the opposite. If you have an include file in your source that is not being used, PC-Lint can warn you about it, so that you know that the include file can be removed from the source (which will have a positive impact on your compilation time).
makedepend and gccmakedep
I have been dealing with this problem lately.
In our project we use C++ and have one X.h and one X.cpp file for each class X.
My strategy is the following:
(1) If A.h, where class A is declared, refers to a class B, then I have to include the header B.h. If the declaration of class A uses only the type B*, then I only need a forward declaration (class B;) in A.h, and I might need to include B.h in A.cpp instead.
(2) Using the above procedure, I move as many includes as possible from A.h to A.cpp. Then I try removing one include at a time and see if the .cpp file still compiles. This should minimize the set of included files in the .cpp file, though I am not 100% sure the result is minimal. I also think one could write a tool to do this automatically; I had started writing one for Visual Studio. I would be glad to know that such tools exist.
Note: maybe adding a proper module construct with well-defined import / export relations could be a desired addition for C++0x. It would make the task of reorganizing imports much easier and speed up compilation a lot.
Since this question was asked, a new tool has been created: include-what-you-use. It is based on Clang, provides mappings to handle symbols whose canonical header differs from the one that actually defines them (e.g. unique_ptr belongs to <memory> but is defined in bits/unique_ptr.h in some standard library implementations), and lets you provide your own mappings.
It provides nice diagnostics and automatic rewriting.
Now each of the smaller files needs a subset of the #includes that were at the top of the long file.
I do this using VisualAssistX. First, compile the file to see what is missing. Then you can use VisualAssistX's Add Include function: right-click on the functions or classes that you know need an include and hit Add Include. Recompile a couple of times to flush out newly missing includes, and it is done. I hope I wrote that understandably :)
Not perfect, but faster than doing it by hand.
I use the doxygen/dot-graphviz generated graphs to see how files are linked. Very handy, but not automatic: you have to check the graphs visually, then edit your code to remove unnecessary #include lines. Of course, it's not really suited to very large projects (say, > 100 files), where the graphs become unusable.

Include everything, Separate with "using"

I'm developing a C++ library. It got me thinking about the way Java and C# handle including different components of their libraries. For example, Java uses import to allow use of classes from other packages, while C# simply uses using to import entire modules.
My question is, would it be a good idea to #include everything in the library in one massive include and then just use the using directive to import specific classes and modules? Or would that just be downright crazy?
EDIT:
Good responses so far, here are a few mitigating factors which I feel add to this idea:
1) Internal #includes are kept as normal (short and to the point)
2) The file which includes everything is optionally supplied with the library to those who wish to use it
3) You could optionally make the big include file part of the precompiled header
You're confusing the purpose of #include directives in C++. They do not behave like import statements in Java or using statements in C#. #include does what it says: it loads and parses the entire indicated file as part of the current translation unit. The reason for separate includes is to avoid spending compilation time parsing the entire standard library in every file. By contrast, the statements you're trying to make #include behave like exist merely for programmer organization.
#include is for management of the compilation process, not for separating uses. (In fact, you cannot use separate headers to enforce separate uses, because doing so would violate the one definition rule.)
tl;dr -> No, you shouldn't do that. #include as little as possible. When your project becomes large, you'll thank yourself when you're not waiting many hours to compile your project.
I would personally recommend only including headers when you need them, to explicitly show which functionality your file requires. At the same time, doing so keeps you from gaining access to functionality you don't necessarily want, e.g. functions unrelated to the goal of the file. Sure, that's no big deal, but I think code is easier to maintain and change when you don't have access to unnecessary functions/classes; it just makes things more straightforward.
I might be downvoted for this, but I think you bring up an interesting idea. It would probably slow down compilation a bit, but I think the concept is neat.
As long as you used using sparingly — only for the namespaces you need — other developers would be able to get an idea of what classes were used in a file by glancing at the top. It wouldn't be as granular as seeing a list of #included files, but is seeing a list of included header files really very useful? I don't think so.
Just make sure that all of the header files all use inclusion guards, of course. :)
As said by #Billy ONeal, the main thing is that #include is a preprocessor directive that causes a "^C, ^V" (copy-paste) of code that leads to a compile time increase.
The best-regarded policy in C++ is to forward declare all possible classes in the .h files and include their headers only in the .cpp files. This isolates dependencies, since a C/C++ project will be rebuilt in cascade if a dependent include file is changed.
Of course, M$ compilers and their precompiled headers tend to push in the opposite direction, toward what you suggest. But anyone who has tried to port code across those compilers is well aware of how badly that can go.
Some libraries like Qt make extensive use of forward declarations. Take a look on it to see if you like its taste.
I think it will be confusing. When you write C++ you should avoid making it look like Java or C# (or C :-). I for one would really wonder why you did that.
Supplying an include-all file isn't really that helpful either, as a user could easily create one herself, with the parts of the library actually used. Could then be added to a precompiled header, if one is used.

C++ header policy on large projects (redux)

I've read everything I could find on this topic, including a couple of very helpful discussions on this site, the NASA coding guidelines and Google C++ guidelines. I even bought the "physical C++ design" book recommended on here (sorry, forgot the name) and got some useful ideas from that. Most sources seem to agree - header files should be self-contained, i.e. they include what they need so that a cpp file could include the header without including any others and it would compile. I also get the point about forward declaring rather than including whenever possible.
That said, how about if foo.cpp includes bar.h and qux.h, but it turns out that bar.h itself includes qux.h? Should foo.cpp then avoid including qux.h? Pro: cleans up foo.cpp (less "noise"). Con: if someone changes bar.h to no longer include qux.h, foo.cpp mysteriously starts failing to compile. Also causes the dependency between foo.cpp and qux.h not to be obvious.
If your answer is "a cpp file should #include every header it needs", taken to its logical conclusion that would mean that pretty much every cpp file has to #include <string>, <cstddef> etc. since most code will end up using those, and if you're not supposed to rely on some other header including them, your cpp needs to include them explicitly. That seems like a lot of "noise" in the cpp files.
Thoughts?
Previous discussions:
What are some techniques for limiting compilation dependencies in C++ projects?
Your preferred C/C++ header policy for big projects?
How do I automate finding unused #include directives?
ETA: Inspired by previous discussions on here, I've written a Perl script to successively comment out each 'include' and 'using', then attempt to recompile the source file, to figure out what's not needed. I've also figured out how to integrate it with VS 2005, so you can double-click to go to "unused" includes. If anyone wants it let me know...very much experimental right now though.
If your answer is "a cpp file should #include every header it needs", taken to its logical conclusion that would mean that pretty much every cpp file has to #include <string>, <cstddef> etc. since most code will end up using those, and if you're not supposed to rely on some other header including them, your cpp needs to include them explicitly.
Yup. That's the way I prefer it.
If the 'noise' is too much to bear, it's OK to have a 'global' include file that contains the usual, common set of includes (like stdafx.h in a lot of Windows programs), and include that at the start of each .cpp file (that helps with precompiled headers, too).
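A sketch of such a global include file, assuming a project where these standard headers are used nearly everywhere (the name and the exact set are project-specific):
// common.h -- the usual, common set of includes, in the spirit of stdafx.h
#ifndef COMMON_H
#define COMMON_H

#include <cstddef>
#include <string>
#include <vector>

#endif // COMMON_H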
I think that you should still include both files. This helps with maintaining the code.
// Foo.cpp
#include <Library1>
#include <Library2>
I can read that and easily see which libraries it uses. If Library2 used Library1, and it was transformed to this:
// Foo.cpp
#include <Library2>
But I still saw Library1 code, I might be slightly confused. It's not hard to guess that some other library must be including it, but it's still a thought process that has to occur.
Being explicit means I don't have to guess, even for an extra micro second of compilation.
I think you should include both until it becomes unbearable. Then refactor your classes and source files to reduce coupling, on grounds that if just listing all your dependencies is onerous, then you probably have too many dependencies...
A compromise might be to say that if something in bar.h contains a class definition (or function declaration) which necessarily requires another class definition from qux.h, then bar.h can be assumed/required to include qux.h, and the user of the API in bar.h needn't bother including both.
So, suppose the client wants to think of itself as "really" only being interested in the bar API, but has to use qux too, because it calls a bar function that takes or returns a qux object by value. Then it can just forget that qux has its own header and imagine that it's one big bar API defined in bar.h, with qux just being a part of that API.
This means you can't count on, for example, searching cpp files for mentions of qux.h to find all clients of qux. But I'd never rely on that anyway, since in general it's too easy to accidentally miss explicitly including a dependency that's already indirectly included, and never realise you haven't listed all the relevant headers. So you probably shouldn't in any case assume header lists are complete, but instead use doxygen (or gcc -M, or whatever) to get a complete list of dependencies, and search that.
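A sketch of that situation using the question's names (the function is hypothetical): because the bar API passes a qux object by value, bar.h must include qux.h itself, and a client including only bar.h still gets everything it needs.
// qux.h
#ifndef QUX_H
#define QUX_H
class Qux { /* ... */ };
#endif

// bar.h -- its API takes and returns Qux by value, so it includes qux.h
#ifndef BAR_H
#define BAR_H
#include "qux.h"
Qux transform(Qux input);   // hypothetical bar function, for illustration
#endif

// client.cpp -- includes only bar.h, yet can create and pass Qux values
#include "bar.h"
Qux q = transform(Qux{});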
how about if foo.cpp includes bar.h and qux.h, but it turns out that bar.h itself includes qux.h? Should foo.cpp then avoid including qux.h?
If foo.cpp uses anything from qux.h directly, then it should include that header itself. Otherwise, since bar.h needs qux.h, I would rely on bar.h including everything it needs.
Each header file provides definitions needed to use a service of some type (a class definition, some macros, whatever). If you are directly using the services exposed through a header file, include that header file even if another header file also includes it. On the other hand, if you are using those services only indirectly, only in the context of the other header file's services, don't include the header file directly.
In my opinion this isn't a big deal either way. The second inclusion of a header file doesn't get fully parsed; it's basically commented out for all but the first time it's included. As things change and you find out you were using a header file that you weren't directly including, it's not a big deal to add it.
One way to think about whether foo.cpp directly uses qux.h is to consider what would happen if foo.cpp no longer needed to include bar.h. If foo.cpp would still need qux.h, then qux.h should be listed explicitly. If it would no longer need anything from qux.h, it is probably not necessary to include it explicitly.
The "never optimize without profiling" rule applies here, I think.
(This answer is biased toward "performance" over "clutter"; the clutter can generally be cleaned up at will, admittedly it gets harder later, but the performance impacts you on a regular basis.)
Try it both ways, see if there's any significant speed improvement for your compiler, and only then consider removing "duplicate" headers -- because as other answers point out, there's a long-term maintainability penalty you take by removing the duplicates.
Consider getting something like Sysinternals FileMon to see if you're actually generating filesystem hits at all for those duplicate includes (most compilers won't, with correct header guards).
Our experience here indicates that actively seeking out headers that are completely unused (or could be, with proper forward declarations) and removing them is far more worthy of time and effort than figuring out the include chains for possible duplicates. And a good lint (splint, PC-Lint, etc) can help you with this determination.
Even more worthy of our time and effort was figuring out how to get our compiler to handle multiple compilation units per execution (almost a linear speedup for us, up to a point -- the compilation was utterly dominated by compiler startup).
After that, you may want to consider the "single big CPP" madness. It can be quite impressive.

What is the best header structure to use in a library?

Concerning headers in a library, I see two options, and I'm not sure if the choice really matters. Say I created a library; let's call it foobar. Please help me choose the most appropriate option:
Have one include at the very root of the library project, let's call it foobar.h, which includes all of the headers in the library, such as "src/some_namespace/SomeClass.h" and so on. Then from outside the library, in the file where I want to use anything to do with the foobar library, just #include <foobar.h>.
Don't have a main include, and instead include only the headers we need in the places that I am to use them, so I may have a whole bunch of includes in a source file. Since I'm using namespaces sometimes as deep as 3, including the headers seems like a bit of a chore.
I've opted for option 1 because of how easy it is to implement. OpenGL and many other libraries seem to do this, so it seemed sensible. However, the standard C++ library can require me to include several headers in any given file; why didn't they just have one header file? Unless it's me being an idiot, and they're separate libraries...
Update:
Further to answers, I think it makes sense to provide both options, correct? I'd be pretty annoyed if I wanted to use a std::string but had to include a mass of header files; that would be silly. On the other hand, I'd be irritated if I had to type a mass of #include lines when I wanted to use most of a library anyway.
Forward headers:
Thanks to all that advised me of forward headers, this has helped me make the header jungle less complicated! :)
The STL, Boost, and others that have a lot of header files provide you with independent tools, and you can use them independently.
So if your library is a set of loosely coupled tools, you should give users the choice of including them as separate parts as well as including the whole library as one file.
Think a bit about how your library will be used, and organize it that way. If someone is unlikely to use one small part without using the whole thing, structure it as one big include. If a small part is independent and useful on its own, make sure that just that part can be included. If there's some logical grouping that makes sense, create include files for each group.
As with most programming questions, there's no one-size-fits-all answer.
All #included headers have to be processed. This isn't as bad as it could be, since modern compilers provide some sort of option for not processing them repeatedly (perhaps with something like #pragma once, or an ifndef guard). Still, every #included header has to be processed once for each translation unit, and that can add up fast.
The usual practice is for header files to #include only those header files they need, and to use forward declarations (class foo;) as much as possible. That way, you don't get the overhead.
If you want to #include everything and its brother, you can provide your own header file that #includes everything. You don't have to explicitly write everything out in every header and source file. That option is something you can provide, but if everything in std came as one monolithic header, you wouldn't have an option.
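For the foobar library from the question, such an optional convenience header might look like this (paths are illustrative):
// foobar.h -- optional "include everything" header for the foobar library.
// Users who care about build times can include the individual parts instead.
#ifndef FOOBAR_H
#define FOOBAR_H

#include "foobar/some_namespace/SomeClass.h"
#include "foobar/some_namespace/AnotherClass.h"  // illustrative
#include "foobar/other_namespace/Helpers.h"      // illustrative

#endif // FOOBAR_H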
Every time you #include a header file you make the compiler do some pretty hard work. The fewer headers you #include, the less work it has to do and the faster your compilations will be.
Every include file should make sense on its own. And you should choose the header structure from the library users' position: how will users use my library? What structure will be best for them?
Examples:
if your library provides string algorithms, it is better to make one header with them all: string_algorithms.h;
if your library provides a single facade object, it is better to use one header file (with maybe a few other files for extensions or helpers);
if you provide a complex of objects which will be used independently, make different header files (container libraries provide different containers);
Forward declare instead of including all those header files at once, then include them as and when you need them.
However you decide on the header file(s) that you make available (one, several or some combination thereof) for the library's public API, it's always a good idea to have at least one separate header for the private API. (No need to expose the prototypes of the non-exported functions and classes or the definitions that are only intended to be used internally.)
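As a sketch of that split (all names here are illustrative): the public header declares only the exported API, while the internal header holds helpers that only the library's own .cpp files include.
// foobar_api.h -- public header, shipped to users
#ifndef FOOBAR_API_H
#define FOOBAR_API_H
namespace foobar {
    void do_work();              // exported entry point
}
#endif

// foobar_internal.h -- private header, never installed
#ifndef FOOBAR_INTERNAL_H
#define FOOBAR_INTERNAL_H
namespace foobar { namespace detail {
    int helper_step(int x);      // implementation detail, not part of the API
}}
#endif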