Include File Ordering Strategy - C++

I've seen fairly consistent advice that an implementation file (.cc / .cpp) should include its corresponding class definition file first, before including other header files. But when the topic shifts to header files themselves, and the order of includes they contain, the advice seems to vary.
Google coding standards suggest:
dir2/foo2.h (preferred location — see details below).
C system files.
C++ system files.
Other libraries' .h files.
Your project's .h files.
It is unclear what the difference is between entries 1 and 5 above, and why one or the other location would be chosen. That said, another online guide suggests this order (found in the "Class Layout" section of that doc):
system includes
project includes
local includes
Once again there is an ambiguity, this time between items 2 and 3. What is the distinction? Do those represent inter-project and intra-project includes?
But more to the point, it looks as if both proposed coding standards are suggesting "your" header files are included last. Such advice, being backwards from what is recommended for include-ordering in implementation files, is not intuitive. Would it not make sense to have "your" header files consistently listed first - ahead of system and 3rd party headers?

The order you list your includes shouldn't matter from a technical point of view. If you designed it right, you should be able to put them in any order you want and it will still work. For example, if your foo.h needs <string>, it should be included inside your foo.h so you don't have to remember that dependency everywhere you use foo.
That being said, if you do have order dependencies, most of the time putting your definition file last will fix it. That's because foo.h depends on <string>, but not the other way around.
You might think that makes a good case for putting your definition file last, but it's actually quite the opposite. If your coding standards require the definition first, your compiler is more likely to catch incorrect order dependencies when they are first written.
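A minimal sketch of both points, reusing the foo.h example from above; the Foo class and its member are invented purely for illustration:
// foo.h -- self-contained: it includes <string> itself, so users of Foo
// never have to remember that dependency
#pragma once
#include <string>

struct Foo {
    std::string name;
};

// foo.cpp -- its own header comes first, so if foo.h ever dropped the
// <string> include above, this file would stop compiling right away
#include "foo.h"
#include <string>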

I'm not aware of any universal standard, but as a general rule of thumb, include as few headers as possible, especially within other header files, to reduce compile times, conflicts, and dependencies. I'm a fan of forward-declaring classes in header files and only including the header with the full definition on the .cpp side whenever I can afford to do so (a sketch of this follows below).
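A minimal sketch of that forward-declaration pattern; the Widget and Panel class names are invented for illustration:
// panel.h -- only a pointer to Widget is needed here, so a forward
// declaration is enough and widget.h does not have to be included
#pragma once
class Widget;   // forward declaration

class Panel {
public:
    void attach(Widget* w);
private:
    Widget* widget_ = nullptr;
};

// panel.cpp -- the full definition is only needed in the implementation
#include "panel.h"
#include "widget.h"

void Panel::attach(Widget* w) { widget_ = w; }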
That said my personal preference is below:
For Headers:
C++ headers
3rd party headers
other project headers
this project's headers
For Source:
precompiled header file
this source file's header
C++ headers
3rd party headers
other project headers
this project's headers
Suggestions like these are usually about avoiding conflicts and circular references; otherwise it's all personal preference, or whatever policy you agree to adhere to for collaborative projects.

Regarding Google's style:
There is no ambiguity, at all.
The first header included should be the header related to this source file, thus in position 1. This way you make sure that it includes anything it needs and that there is no "hidden" dependency: if there is, it'll be exposed right away and prevent compilation.
The other headers are ordered from those you are the least likely to be able to change if an issue occurs to those you are the most likely to. An issue could be an identifier clash, a leaking macro, etc.
By definition the C and C++ system headers are very rarely altered, simply because so many people use them, so they come second.
3rd party code can be changed, but it's generally cumbersome and takes time, so those headers come third.
The "project includes" refer to project-wide includes, generally home-grown libraries (middleware) that are used by several projects. They can be changed, but this would impact the other projects as well, so they come fourth.
And finally the "local includes", that is, those files that are specific to this project and can be changed without affecting anyone else. In case of an issue, those are the prime candidates, so they come last.
Note that you can in fact have many more layers (especially in a software shop), the key idea is to order the dependencies starting from the bottom layer (system libs) to the top layer.
Within a given layer, I tend to organize them in alphabetical order, because that makes them easier to check.
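As a concrete illustration of that layering (the file names, the third-party dependency, and the project headers below are all invented), the top of a source file following this scheme might look like:
// foo_parser.cc
#include "project/foo_parser.h"    // 1. the header this file implements

#include <stdint.h>                // 2. C system headers
#include <sys/types.h>

#include <string>                  // 3. C++ standard library headers
#include <vector>

#include "absl/strings/str_cat.h"  // 4. other libraries' headers (hypothetical dependency)

#include "project/tokenizer.h"     // 5. this project's headers, alphabetically
#include "project/util/logging.h"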

For Headers:
this project's headers
other project headers
3rd party headers
C++ headers
For Source:
this source file's header
this project's headers
other project headers
3rd party headers
C++ headers
This order minimizes the chance of missing a required header inside an .hpp file. It also minimizes unintended interactions between 3rd party headers, and every .hpp module compiles with the minimum required dependencies.
For example:
// test.hpp -- missing #include <string>
void test(std::string& s);

// test.cpp -- hides the bug, because <string> happens to be included before test.hpp
#include <string>
#include "test.hpp"

// test2.cpp -- compilation error exposes the missing header
#include "test.hpp"
#include <string>

Related

Are there any performance implications to including every header?

Let's say I want to use the hex() function. I know it is defined in the <ios> header, and I also know that it is included in the <iostream> header. The difference is that <iostream> contains many more functions and other stuff I don't need.
From a performance standpoint, should I care about including/defining fewer functions, classes, etc. rather than more?
There is no run time performance hit.
However, there can be an excessive compile-time hit if tons of unnecessary headers are included.
Also, when this is done, you can create unnecessary recompiles if, for instance, a header is changed but a file that doesn't use it includes it.
In small projects (with small headers included), this doesn't matter. As a project grows, it may.
If the standard says it is defined in header <ios> then include header <ios> because you can't guarantee it will be included in/through any other header.
TL;DR: In general, it is better to only include what you need. Including more can have an adverse effect on binary size and startup (should be insignificant), but mostly hurts compilation-time without precompiled headers.
Well, naturally you have to include at least a set of headers that is guaranteed to cover all your uses.
It might sometimes happen to "work" anyway, because the standard C++ headers are allowed to include each other as the implementer wants, and the headers are allowed to declare additional symbols in the std namespace anyway (see Why is "using namespace std" considered bad practice?).
Next, sometimes including an additional header might lead to creation of additional objects (see std::ios_base::Init), though a well-designed library minimizes such (that is the only instance in the standard library, as far as I know).
But the big issue isn't actually the size and efficiency of the compiled (and optimized) binary (which should be unaffected, aside from the previous point, whose effect should be minuscule), but compilation time while actively developing (see also How does #include <bits/stdc++.h> work in C++?).
And the latter is adversely affected by adding superfluous headers (severely so, to the point that the committee worked on a modules proposal; see C++ Modules - why were they removed from C++0x? Will they be back later on?).
Unless, naturally, you are using precompiled headers (see Why use Precompiled Headers (C/C++)?). In that case, including more in the precompiled header, and thus everywhere instead of only where needed, will actually reduce compile times most of the time, as long as those headers are not modified.
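As a rough sketch, a precompiled header typically just gathers the heavy, rarely-changing includes; the file name pch.h and its contents below are only an example, and the exact mechanism depends on the compiler and build system:
// pch.h -- candidate contents for a precompiled header: large, stable
// headers that almost every translation unit in the project uses
#pragma once
#include <algorithm>
#include <map>
#include <memory>
#include <string>
#include <vector>
// plus stable third-party headers (e.g. Boost) if they are used everywhere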
There is a clang-based tool for finding out the minimum headers, called include-what-you-use.
It analyzes the clang AST to decide that, which is both a strength and a weakness:
You don't need to teach it about all the symbols a header makes available, but it also doesn't know whether things just worked out that way in that revision, or whether they are contractual.
So you need to double-check its results.
Including unnecessary headers has the following downsides.
Longer compile times; the linker also has to discard all the unused symbols.
If you have added extra headers in a .cpp file, it only affects your own code.
But if you are distributing your code as a library and have added unnecessary headers in your header files, client code will be burdened with locating the headers that you have used.
Do not trust indirect inclusion; use the header in which the required function is actually declared.
Also, as a good programming practice in a project, headers should be included in order of decreasing dependency:
//local header -- most dependent on other headers
#include <project/impl.hpp>
//Third party library headers -- moderately dependent on other headers
#include <boost/optional.hpp>
//standard C++ header -- least dependent on other header
#include <string>
What won't be affected is run time; the linker will get rid of unused symbols when the binary is linked.
Including unneeded header files has some value.
It takes less coding effort to cut and paste the usual set of includes. Of course, later coding is then encumbered by not knowing what was truly needed.
Especially in C, with its limited name space control, including unneeded headers promptly detects collisions. Say the code defined a global non-static variable or function that happened to match a standard name, like erfc(), to do some text processing. By including <math.h>, the collision with double erfc(double x) is detected, even though this .c file does no FP math yet other .c files do.
#include <math.h>
char *erfc(char *a, char *b);   // error: conflicting types with double erfc(double) from <math.h>
OTOH, had this .c file not included <math.h>, the collision would only be detected at link time. The impact of this delayed notice could be great if the code base did not need FP math for years and now does, only to discover that char *erfc(char *a, char *b) is used in many places.
IMO: Make a reasonable effort to not include unneeded header files, but do not worry about including a few extra, especially if they are common ones. If an automated method exist, use it to control header file inclusion.

Should I include every header?

Should I include every header even if it was included before? Or maybe I should avoid it when I can?
For example, if I use std::string and std::vector in some file, and <string> happens to include <vector>, should I include only <string>, or both <string> and <vector>?
TLDR
If you use it, include it.
The longer version...
If you use a header related entity (e.g. some type) in a file, you should include the related header for it. Don't rely on headers to include each other. If you use it, include it.
The C++ standard library doesn't mandate inclusion of <string> in <vector> nor vice-versa. Trying to make use of functionality like this would limit the code to a specific implementation. In general, the standard library headers may or may not include other headers (or their own internal headers) in an unspecified order or manner. One notable exception is <initializer_list> which is required to be included in a few of the other standard headers. Changes to this unspecified order or manner can also happen, thus breaking previously compiling code with the updated compiler or an updated standard library implementation (this has been known to happen).
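For the <string>/<vector> case from the question, that means including both, even if one implementation happens to pull one in through the other; the Report struct below is just an illustration:
// report.h -- includes <string> AND <vector>, regardless of whether a
// particular standard library implementation includes one via the other
#pragma once
#include <string>
#include <vector>

struct Report {
    std::string title;
    std::vector<std::string> lines;
};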
Also consider that if the header file is the definition for the class, then it should include what is required for the definition of that class. The associated .cpp should include its associated .h and the remaining files required to implement the class. Don't need it, don't include it; don't include more than needed (llvm style guide). One exception here is templates (that don't have an associated .cpp); this exception would apply to other header only implementations.
Note that maintaining "include what you use" can be difficult in the long run; it therefore makes sense to include exactly what the interface requires at the beginning of the coding cycle, and then to re-check the includes with any significant change made to the code.
There seems to be some progress with tools in this regard, such as the iwyu project, which uses the clang tool chain and seems to support MSVC as well.
One counterexample would be a header whose very purpose is to include other headers; then maybe, but even then I would be very careful and make sure it is clearly defined what it includes. An example of this could be a precompiled header.
Generally, you should treat header dependencies as part of implementation, not as part of interface.
You should not rely on headers including other headers. If your class needs to use a std::vector, include <vector>; if you need std::string, include <string>. Otherwise you set yourself up for unexpected breakdowns when a header that used to include a file suddenly stops including it because it no longer needs it.

Include directives in header file? [duplicate]

Possible Duplicate:
where should “include” be put in C++
Obviously, there are two "schools of thought" as to whether to put #include directives into C++ header files (or, as an alternative, put #include only into cpp files). Some people say it's ok, others say it only causes problems. Does anybody know whether this discussion has reached a conclusion as to what is to be preferred?
I am not aware of any schools of thought concerning this. Put them in the header when they are needed there, otherwise forward declare and put them in the .cpp files that require them. There is no benefit in including headers where they are not needed.
What I found effective is following a few simple rules:
Headers shall be self-sufficient, i.e., they shall declare classes they need names for and include headers for any definition they use.
Headers should minimize dependencies as much as possible without violating the previous point.
Getting the first point right is fairly easy: include the header first thing from the source file implementing what it declares. Getting the second point exactly right isn't trivial, though, and I think it requires tool support. However, a few unnecessary dependencies generally aren't that bad.
As a rule of thumb, you don't include other headers in a header unless a full definition from them is necessary there. Most of the time you only work with pointers or references to classes in a header file, so it's just fine to forward-declare them there.
I think the issue was settled a long time ago: headers should be self-contained (that is, they should not depend on the user having included other headers before them; this aspect has been settled for so long that some aren't even aware there was a debate on it, but your "put includes only into cpp files" alternative seems to hint at it) but minimal (i.e., they should not include definitions when a declaration would be enough for self-containment).
The reason for self-containment is maintenance: should a header be modified and now depend on something new, you'd have to track down all the places it is used to include the new dependency. BTW, the standard trick to ensure self-containment is to include the header providing the declarations for things defined in a .cpp first in that .cpp.
These are not schools of thought so much as religions. In reality, both approaches have their advantages and disadvantages, and there are certain practices to be followed for either approach to be successful. But only one of these approaches will "scale" to large projects.
The advantage of not including headers inside headers is faster compilation. However, this advantage does not come from headers being read only once, because even if you include headers inside headers, smart compilers can work that out. The speed advantage comes from the fact that you include only those headers which are strictly necessary for a given source file. Another advantage is that if we look at a source file, we can see exactly what its dependencies are: the flat list of header files gives that to us plainly.
However, this practice is hard to maintain, especially in large projects with many programmers. It's quite an inconvenience when you want to use module foo, but you cannot just #include "foo.h": you need to include 35 other headers.
What ends up happening is this: programmers are not going to waste their time discovering the exact, minimal set of headers that they need just to add module foo. To save time, they will go to some example source file similar to the one they are working on, and cut and paste all of the #include directives. Then they will try compiling it, and if it doesn't build, then they will cut and paste more #include directives from yet elsewhere, and repeat that until it works.
The net result is that, little by little, you lose the advantage of faster compiling, because your files are now including unnecessary headers. Moreover, the list of #include directives no longer shows the true dependencies. Moreover, when you do incremental compiles now, you compile more than is necessary due to these false dependencies.
Once every source file includes nearly every header, you might as well have a big everything.h which includes all the headers, and then #include "everything.h" in every source file.
So this practice of including just specific headers is best left to small projects that are carefully maintained by a handful of developers who have plenty of time to maintain the ethic of minimal include dependencies by hand, or write tools to hunt down unnecessary #include directives.

What is the best header structure to use in a library?

Concerning headers in a library, I see two options, and I'm not sure if the choice really matters. Say I created a library, let's call it foobar. Please help me choose the most appropriate option:
Have one include in the very root of the library project, let's call it foobar.h, which includes all of the headers in the library, such as "src/some_namespace/SomeClass.h" and so on. Then from outside the library, in the file where I want to use anything to do with the foobar library, just #include <foobar.h>.
Don't have a main include, and instead include only the headers I need in the places where I use them, so I may have a whole bunch of includes in a source file. Since I'm sometimes using namespaces as deep as 3, including the headers seems like a bit of a chore.
I've opted for option 1 because of how easy it is to implement. OpenGL and many other libraries seem to do this, so it seemed sensible. However, the standard C++ library can require me to include several headers in any given file, so why didn't they just have one header file? Unless it's me being an idiot, and they're separate libraries...
Update:
Further to answers, I think it makes sense to provide both options, correct? I'd be pretty annoyed if I wanted to use a std::string but had to include a mass of header files; that would be silly. On the other hand, I'd be irritated if I had to type a mass of #include lines when I wanted to use most of a library anyway.
Forward headers:
Thanks to all that advised me of forward headers, this has helped me make the header jungle less complicated! :)
The STL, Boost, and others that have a lot of header files provide you with independent tools, and you can use them independently.
So if your library is a set of decoupled tools, you should give users the choice of including them as separate parts as well as including the whole library as one file.
Think a bit about how your library will be used, and organize it that way. If someone is unlikely to use one small part without using the whole thing, structure it as one big include. If a small part is independent and useful on its own, make sure you can include just enough for that part. If there's some logical grouping that makes sense, create include files for each group.
As with most programming questions, there's no one-size-fits-all answer.
All #included headers have to be processed. This isn't as bad as it could be, since modern compilers provide some sort of option for not processing them repeatedly (perhaps with something like #pragma once, or an ifndef guard). Still, every #included header has to be processed once for each translation unit, and that can add up fast.
The usual practice is for header files to #include only those header files they need, and to use forward declarations (class foo;) as much as possible. That way, you don't get the overhead.
If you want to #include everything and its brother, you can provide your own header file that #includes everything. You don't have to explicitly write everything out in every header and source file. That option is something you can provide, but if everything in std came as one monolithic header, you wouldn't have an option.
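A sketch of such a convenience header for the foobar library from the question; the component header names are invented for illustration:
// foobar.h -- optional "include everything" header for the foobar library;
// users who only need one component can include that part directly instead
#pragma once
#include "foobar/parser.h"
#include "foobar/renderer.h"
#include "foobar/utils.h"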
Every time you #include a header file you make the compiler do some pretty hard work. The fewer headers you #include, the less work it has to do and the faster your compilations will be.
Every include file should make sense on its own. And you should choose the header structure from the library user's position: how will users use my library? What structure will be best for them?
examples:
if your library provides string algorithms, it is better to make one header with all of them: string_algorithms.h;
if your library provides a single facade object, it is better to use one header file (maybe a few other files with extensions or helpers);
if you provide a collection of objects that will be used independently, make separate header files (a containers library provides different containers);
Forward-declare instead of including all those header files at once, then include headers as and when you need them.
However you decide on the header file(s) that you make available (one, several or some combination thereof) for the library's public API, it's always a good idea to have at least one separate header for the private API. (No need to expose the prototypes of the non-exported functions and classes or the definitions that are only intended to be used internally.)

C++ class header files organization

What are the C++ coding and file organization guidelines you suggest for people who have to deal with lots of interdependent classes spread over several source and header files?
I have this situation in my project and solving class definition related errors crossing over several header files has become quite a headache.
Some general guidelines:
Pair up your interfaces with implementations. If you have foo.cxx, everything defined in there had better be declared in foo.h.
Ensure that every header file #includes all other necessary headers or forward-declarations necessary for independent compilation.
Resist the temptation to create an "everything" header. They're always trouble down the road.
Put a set of related (and interdependent) functionality into a single file. Java and other environments encourage one-class-per-file. With C++, you often want one set of classes per file. It depends on the structure of your code.
Prefer forward declaration over #includes whenever possible. This allows you to break the cyclic header dependencies. Essentially, for cyclical dependencies across separate files, you want a file-dependency graph that looks something like this:
A.cxx requires A.h and B.h
B.cxx requires A.h and B.h
A.h requires B.h
B.h is independent (and forward-declares classes defined in A.h)
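A minimal sketch of that graph; the class names A and B are just illustrative:
// B.h -- independent: forward-declares A instead of including A.h
#pragma once
class A;   // forward declaration breaks the cycle

class B {
public:
    void setPeer(A* a);
private:
    A* peer_ = nullptr;
};

// A.h -- may include B.h freely
#pragma once
#include "B.h"

class A {
public:
    B& buddy() { return buddy_; }
private:
    B buddy_;
};

// A.cxx and B.cxx each include both A.h and B.h and see the full definitions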
If your code is intended to be a library consumed by other developers, there are some additional steps that are important to take:
If necessary, use the concept of "private headers". That is, header files that are required by several source files, but never required by the public interface. This could be a file with common inline functions, macros, or internal constants.
Separate your public interface from your private implementation at the filesystem level. I tend to use include/ and src/ subdirectories in my C or C++ projects, where include/ has all of my public headers, and src/ has all of my sources and private headers.
I'd recommend finding a copy of John Lakos' book Large-Scale C++ Software Design. It's a pretty hefty book, but if you just skim through some of his discussions on physical architecture, you'll learn a lot.
Check out the C and C++ coding standards at the NASA Goddard Space Flight Center. The one rule that I specially noted in the C standard and have adopted in my own code is the one that enforces the 'standalone' nature of header files. In the implementation file xxx.cpp for the header xxx.h, ensure that xxx.h is the first header included. If the header is not self-contained at any time, then compilation will fail. It is a beautifully simple and effective rule.
The only time it fails you is if you port between machines, and the xxx.h header includes, say, <pqr.h>, but <pqr.h> requires facilities that happen to be made available by a header <abc.h> on the original platform (so <pqr.h> includes <abc.h>), but the facilities are not made available by <abc.h> on the other platform (they are in def.h instead, but <pqr.h> does not include <def.h>). This isn't a fault of the rule, and the problem is more easily diagnosed and fixed if you follow the rule.
Check the header file section in Google style guide
Tom's answer is an excellent one!
Only thing I'd add is to never have "using declarations" in header files. They should only be allowed in implementation files, e.g. foo.cpp.
The logic for this is well described in the excellent book "Accelerated C++".
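A short illustration of the kind of breakage this prevents; the header and variable names below are invented:
// bad_header.h
#pragma once
#include <algorithm>
using namespace std;   // leaks into every file that includes this header

// client.cpp
#include "bad_header.h"

int count = 0;   // the client's own global variable

int increment() { return count + 1; }   // error: 'count' is ambiguous
                                         // (::count vs std::count from <algorithm>)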
One more point in addition to the others here:
Don't include any private definitions in an include file. E.g. any definition that is only used in xxx.cpp should be in xxx.cpp, not xxx.h.
Seems obvious, but I see it frequently.
I'd like to add one very good practice (both in C and C++) which is often forsaken:
foo.c
#include "foo.h" // always the first directive
Any other needed headers should follow, then the code. The point is that you almost always need that header for this compilation unit anyway, and including it as the first directive ensures the header remains self-sufficient (if it is not, there will be errors). This is especially true for public headers.
If at any point you need to put something before this header inclusion (except comments, of course), then it is likely you're doing something wrong, unless you really know what you are doing... which leads to another, more crucial rule: comment your hacks!