How to use #include directive correctly? - c++

Is there any material about how to use #include correctly?
I didn't find any C/C++ text book that explains this usage in detail.
In formal project, I always get confused in dealing with it.

The big one that always tripped me up was this:
This searches in the header path:
#include <stdio.h>
This searches in your local directory:
#include "myfile.h"
Second thing you should do with EVERY header is this:
myfilename.h:
#ifndef MYFILENAME_H
#define MYFILENAME_H
//put code here
#endif
This pattern means the header's contents are processed only once per translation unit, so you cannot fall over on redefinitions when the same header gets included more than once (cheers to orsogufo for pointing out to me that this is called an "include guard"). Do some reading on how the C compiler actually compiles files (before linking), because that will make the world of #define and #include make a whole lot more sense to you. The stage that handles these directives isn't very intelligent: it is essentially text processing. (The C compiler itself, however, is another matter.)
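To see the guard at work without creating separate files, here is a minimal sketch that pastes the same guarded "header" twice into one file, which is roughly what the preprocessor produces when a header is included twice. POINT_H, Point and manhattan are hypothetical names made up for the illustration.

```cpp
// Simulates including the same guarded header twice in one translation unit.
// POINT_H, Point and manhattan are hypothetical names for this sketch.

// ---- first "inclusion" of point.h ----
#ifndef POINT_H
#define POINT_H
struct Point {
    int x;
    int y;
};
#endif

// ---- second "inclusion" of point.h: skipped because POINT_H is already defined ----
#ifndef POINT_H
#define POINT_H
struct Point {   // without the guard this would be a redefinition error
    int x;
    int y;
};
#endif

// Small helper so the struct is actually used.
inline int manhattan(const Point& p) {
    return (p.x < 0 ? -p.x : p.x) + (p.y < 0 ? -p.y : p.y);
}
```

Removing the two #ifndef/#define/#endif triples should make the compiler complain about Point being defined twice, which is the failure the guard prevents.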

Check Large-Scale C++ Software Design by John Lakos if you have the money.
The Google C++ coding guidelines also have some OK stuff.
Check Herb Sutter's materials online (blog) as well.
Basically you need to understand where including a header is NOT required, e.g. where a forward declaration is enough. Also try to make sure that each include file compiles on its own, and only put #includes in .h files when it's a must (e.g. templates).

So your compiler may support 2 unique search paths for include files:
Informally we could call the system include path and the user include path.
The #include <XX> searches the system include path.
The #include "XX" searches the user include path then the system include path.
Checking the draft standard n2521:
Section 16.2:
2 A preprocessing directive of the form
# include < h-char-sequence> new-line
searches a sequence of implementation-defined places for a header identified
uniquely by the specified sequence between the < and > delimiters, and
causes the replacement of that directive by the entire contents of the
header. How the places are specified or the header identified is
implementation-defined.
3 A preprocessing directive of the form
# include " q-char-sequence" new-line
causes the replacement of that directive by the entire contents of the
source file identified by the specified sequence between the " " delimiters.
The named source file is searched for in an implementation-defined manner.
If this search is not supported, or if the search fails, the directive is
reprocessed as if it read
# include < h-char-sequence> new-line
with the identical contained sequence (including > characters, if any)
from the original directive.
An example of this would be gcc:
-isystem <dir> Add <dir> to the start of the system include path
-idirafter <dir> Add <dir> to the end of the system include path
-iwithprefix <dir> Add <dir> to the end of the system include path
-iquote <dir> Add <dir> to the end of the quote include path
-iwithprefixbefore <dir> Add <dir> to the end of the main include path
-I <dir> Add <dir> to the end of the main include path
To see where your gcc is searching do this:
g++ -v -E -xc++ /dev/null -I LOOK_IN_HERE
#include "..." search starts here:
#include <...> search starts here:
LOOK_IN_HERE
/usr/include/c++/4.0.0
/usr/include/c++/4.0.0/i686-apple-darwin9
/usr/include/c++/4.0.0/backward
/usr/local/include
/usr/lib/gcc/i686-apple-darwin9/4.0.1/include
/usr/include
/System/Library/Frameworks (framework directory)
/Library/Frameworks (framework directory)
End of search list.
So how do you use this knowledge?
There are several schools of thought, but I always list my headers from most specific to most general.
Example
File: plop.cpp
#include "plop.h"
#include "plop-used-class.h"
/// C Header Files
#include <stdio.h> // I know bad example but I drew a blank
/// C++ Header files
#include <vector>
#include <memory>
This way, if the header file "plop-used-class.h" should have included <vector>, this will be caught by the compiler. If I had put <vector> at the top, this error would have been hidden from the compiler.

In addition to the other comments, remember that you don't need to #include a header in another header if you only have a pointer or reference. E.g.:
Header required:
#include "Y.h"
class X
{
Y y; // need header for Y
};
Header not required:
class Y;
class X
{
Y* y; // don't need header for Y
};
//#include "Y.h" in .cpp file
The second example compiles faster and has fewer dependencies. This can be important in large code bases.
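A single-file sketch of the same idea, assuming hypothetical classes Car and Engine: the class holding the pointer only needs a forward declaration, and the full definition is needed only where members are actually accessed (which in a real project would be the .cpp file).

```cpp
// Hypothetical names (Car, Engine) chosen for illustration only.
class Engine;                 // forward declaration: enough for pointers and references

class Car {
public:
    explicit Car(Engine* e) : engine_(e) {}
    int horsepower() const;   // defined below, once Engine is a complete type
private:
    Engine* engine_;          // pointer member: no need for Engine's definition here
};

// In a real project this definition would live in engine.h,
// and only car.cpp would #include it.
class Engine {
public:
    explicit Engine(int hp) : hp_(hp) {}
    int hp() const { return hp_; }
private:
    int hp_;
};

// Accessing a member requires the complete type, so this comes after the definition.
int Car::horsepower() const { return engine_->hp(); }
```

If Car held an Engine by value instead of a pointer, the forward declaration would no longer be enough and the definition would have to be visible before Car.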

Just an addendum to Andy Brice's answer, you can also make do with forward declarations for function return values:
class Question;
class Answer;
class UniversityChallenge
{
...
Answer AskQuestion( Question* );
...
};
Here's a link to a question I asked a while back with some good answers http://bytes.com/groups/c/606466-forward-declaration-allowed.
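As a rough single-file sketch of that point (Quiz and the member names are made up for the example): a function declaration may name an incomplete type as its return or parameter type, and the complete types are only needed where the function is defined or called.

```cpp
// Hypothetical names throughout; only the declaration/definition split matters.
class Question;
class Answer;

class Quiz {
public:
    // Declaring a function that returns Answer by value is fine
    // even though Answer is still incomplete at this point.
    Answer ask(const Question& q) const;
};

// Complete types: in a real project these would come from their own headers,
// included by the .cpp file that defines or calls Quiz::ask.
class Question {
public:
    int number;
};

class Answer {
public:
    int value;
};

// The definition needs both types to be complete.
Answer Quiz::ask(const Question& q) const {
    return Answer{q.number * 2};
}
```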

Header files are C's way of separating interface and implementation. They are divided into two types: standard and user-defined header files.
A standard header file, such as string.h, allows us access to the functionality of an underlying C library. You should #include it in every .c file which uses the relevant functionality. Usually this uses angle brackets, as in #include <string.h>
A user-defined header file exposes your implementation of functions to other programmers or other parts of your C code. If you have implemented a module called rational.c for calculations with rational numbers, it should have a corresponding rational.h file for its public interface. Every file which uses the functionality should #include rational.h, and also rational.c should #include it. Usually this is done using #include "rational.h"
The part of compilation which does the #includes is called the C preprocessor. It mostly does text substitutions and pastes text.
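A small sketch of what "text substitution" means in practice, using two hypothetical macros. The preprocessor pastes the argument text in without understanding expressions, which is why the unparenthesized version gives a surprising result.

```cpp
// Hypothetical macros for illustration; the preprocessor substitutes text, not values.
#define SQUARE_UNSAFE(x) x * x
#define SQUARE(x) ((x) * (x))

// Expands to: a + b * a + b  (operator precedence applies after substitution)
inline int unsafe_square_of_sum(int a, int b) {
    return SQUARE_UNSAFE(a + b);
}

// Expands to: ((a + b) * (a + b))
inline int square_of_sum(int a, int b) {
    return SQUARE(a + b);
}
```

With a = 1 and b = 2 the first function computes 1 + 2 * 1 + 2, not 3 * 3. #include works at the same level: the included file's text is simply pasted in place of the directive.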
Spence is correct in his pattern for preventing duplicate #includes, which mess up the namespace. These are the basics of inclusion; GNU Make gives you lots more power, and lots more trouble as well.

Check out the discussion on using #include<filename.h>
and #include<filename> for C++ includes of C libraries.

Edit: Andy Brice also made this point in a briefer way.
Following up on null's answer, the most important thing to think about is where you put your #includes.
When you write a #include, the preprocessor literally includes the contents of the file you list in the current file, including any #includes in those files as well. This can obviously lead to very large files at compile time (code bloat), so you need to consider carefully whether an #include is needed.
In a standard code file layout where you have a .h file for a class with the class and function declarations, and then a .cpp implementation file, you should be careful about the number of #includes that go in the header file. This is because every time you make a change to the header file, any files that also include it (that is, that use your class) will also have to be recompiled; if the header itself has lots of includes, then every file that uses the class gets bloated significantly at compile time.
It is better to use forward declarations where possible, so that you can write the method signatures, etc. and then #include the relevant files in the .cpp file so that you can actually use the classes and structures that your code depends on.
//In myclass.h
class UtilClass; //Forward declaration of UtilClass - avoids having to #include utilclass.h here
class MyClass
{
public:
MyClass();
~MyClass();
void DoSomethingWithUtils(UtilClass *util); //This will compile due to forward declaration above
};
//Then in the .cpp
#include "myclass.h"
#include "utilclass.h"
void MyClass::DoSomethingWithUtils(UtilClass *util)
{
util->DoSomething(); //This will compile, because the class definition is included locally in this .cpp file.
}

Is there any material about how to use #include correctly?
I'd strongly recommend the section "SF: Source files" of the C++ Core Guidelines as a good starting point.
I didn't find any C/C++ text book that explains this usage in detail.
Much conventional wisdom on the topic of physical composition of C++ projects can likely be found in "Large-Scale C++ Software Design" by John Lakos.
In formal project, I always get confused in dealing with it.
You are in good company. Prior to C++20 modules, #include was the only practical way to compose C++ translation units from multiple files. It is a simple, limited facility through which the preprocessor essentially copy/pastes entire files into other files. The resultant compiler input is often huge, and work is commonly repeated from one translation unit to the next.

You use #include "yourfile.h" if yourfile.h is in the same directory as the file doing the including (most compilers search there first)
and #include <yourfile.h> if the path to yourfile.h was added to the compiler's include directories (somewhere in the configuration; for example, for c:\mylib\yourfile.h the path c:\mylib\ has to be specified as an include directory).
You can also include .cpp and .hpp (h plus plus) files.
There is a particular set of files, the standard library headers, that are written without an extension, like #include <iostream>. Their names live in namespace std, so you either qualify them (std::cout) or write using namespace std; to use them unqualified.
There is a very nice tool that integrates with Microsoft's Visual C++ and shows the include paths: http://www.profactor.co.uk/includemanager.php
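A minimal sketch of using <iostream> facilities with explicit std:: qualification, so no using namespace std; is needed. The function names write_greeting and greeting are made up for the example.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Writes a greeting to any output stream, e.g. std::cout.
// Names from the standard headers are qualified with std:: explicitly.
inline void write_greeting(std::ostream& out, const std::string& name) {
    out << "Hello, " << name << '\n';
}

// Convenience wrapper that returns the text instead of printing it,
// which makes the result easy to check.
inline std::string greeting(const std::string& name) {
    std::ostringstream buffer;
    write_greeting(buffer, name);
    return buffer.str();
}
```

Calling write_greeting(std::cout, "world") would print the same text to standard output.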

Related

System headers before other headers according to Google Style guide? [duplicate]

What order should include files be specified, i.e. what are the reasons for including one header before another?
For example, do the system files, STL, and Boost go before or after the local include files?
I don't think there's a recommended order, as long as it compiles! What's annoying is when some headers require other headers to be included first... That's a problem with the headers themselves, not with the order of includes.
My personal preference is to go from local to global, each subsection in alphabetical order, i.e.:
1. h file corresponding to this cpp file (if applicable),
2. headers from the same component,
3. headers from other components,
4. system headers.
My rationale for 1. is that it should prove that each header (for which there is a cpp) can be #included without prerequisites (terminus technicus: header is "self-contained"). And the rest just seems to flow logically from there.
The big thing to keep in mind is that your headers should not be dependent upon other headers being included first. One way to ensure this is to include your headers before any other headers.
"Thinking in C++" in particular mentions this, referencing Lakos' "Large Scale C++ Software Design":
Latent usage errors can be avoided by ensuring that the .h file of a component parses by itself – without externally-provided declarations or definitions... Including the .h file as the very first line of the .c file ensures that no critical piece of information intrinsic to the physical interface of the component is missing from the .h file (or, if there is, that you will find out about it as soon as you try to compile the .c file).
That is to say, include in the following order:
1. The prototype/interface header for this implementation (i.e., the .h/.hh file that corresponds to this .cpp/.cc file).
2. Other headers from the same project, as needed.
3. Headers from other non-standard, non-system libraries (for example, Qt, Eigen, etc).
4. Headers from other "almost-standard" libraries (for example, Boost)
5. Standard C++ headers (for example, iostream, functional, etc.)
6. Standard C headers (for example, cstdint, dirent.h, etc.)
If any of the headers have an issue with being included in this order, either fix them (if yours) or don't use them. Boycott libraries that don't write clean headers.
Google's C++ style guide argues almost the reverse, with really no justification at all; I personally tend to favor the Lakos approach.
I follow two simple rules that avoid the vast majority of problems:
1. All headers (and indeed any source files) should include what they need. They should not rely on their users including things.
2. As an adjunct, all headers should have include guards so that they don't get included multiple times by over-ambitious application of rule 1 above.
I also follow the guidelines of:
1. Include system headers first (stdio.h, etc) with a dividing line.
2. Group them logically.
In other words:
#include <stdio.h>
#include <string.h>

#include "btree.h"
#include "collect_hash.h"
#include "collect_arraylist.h"
#include "globals.h"
Although, being guidelines, that's a subjective thing. The rules on the other hand, I enforce rigidly, even to the point of providing 'wrapper' header files with include guards and grouped includes if some obnoxious third-party developer doesn't subscribe to my vision :-)
To add my own brick to the wall.
1. Each header needs to be self-sufficient, which can only be tested if it's included first at least once.
2. One should not mistakenly modify the meaning of a third-party header by introducing symbols (macros, types, etc.).
So I usually go like this:
// myproject/src/example.cpp
#include "myproject/example.h"

#include <algorithm>
#include <set>
#include <vector>

#include <3rdparty/foo.h>
#include <3rdparty/bar.h>

#include "myproject/another.h"
#include "myproject/specific/bla.h"

#include "detail/impl.h"
Each group separated by a blank line from the next one:
1. Header corresponding to this cpp file first (sanity check)
2. System headers
3. Third-party headers, organized by dependency order
4. Project headers
5. Project private headers
Also note that, apart from system headers, each file is in a folder with the name of its namespace, just because it's easier to track them down this way.
I recommend:
1. The header for the .cc module you're building. (Helps ensure each header in your project doesn't have implicit dependencies on other headers in your project.)
2. C system files.
3. C++ system files.
4. Platform / OS / other header files (e.g. win32, gtk, openGL).
5. Other header files from your project.
And of course, alphabetical order within each section, where possible.
Always use forward declarations to avoid unnecessary #includes in your header files.
I'm pretty sure this isn't a recommended practice anywhere in the sane world, but I like to line system includes up by filename length, sorted lexically within the same length. Like so:
#include <set>
#include <vector>
#include <algorithm>
#include <functional>
I think it's a good idea to include your own headers before other peoples, to avoid the shame of include-order dependency.
This is not subjective. Make sure your headers don't rely on being #included in specific order. You can be sure it doesn't matter what order you include STL or Boost headers.
First include the header corresponding to the .cpp... in other words, source1.cpp should include source1.h before including anything else. The only exception I can think of is when using MSVC with pre-compiled headers in which case, you are forced to include stdafx.h before anything else.
Reasoning: Including source1.h before any other file ensures that it can stand alone without its dependencies. If source1.h takes on a dependency at a later date, the compiler will immediately alert you to add the required #include or forward declaration to source1.h. This in turn ensures that headers can be included in any order by their dependants.
Example:
source1.h
class Class1 {
Class2 c2; // a dependency which has been neither included nor forward declared
};
source1.cpp
#include "source1.h" // now the compiler will alert you that Class2 is undefined
// so you can #include the header that defines Class2 in source1.h (a by-value member
// needs the full definition; a forward declaration suffices only for pointers and references)
...
MSVC users: I strongly recommend using pre-compiled headers. So, move all #include directives for standard headers (and other headers which are never going to change) to stdafx.h.
Include from the most specific to the least specific, starting with the corresponding .hpp for the .cpp, if one such exists. That way, any hidden dependencies in header files that are not self-sufficient will be revealed.
This is complicated by the use of pre-compiled headers. One way around this, without making your project compiler-specific, is to use one of the project headers as the precompiled header include file.
Several separate considerations are conflated when deciding on a particular include order. Let me try to untangle them.
1. Check for self-containedness
Many answers suggest that the include order should act as a check that your headers are self-contained. That mixes up the considerations of testing and compilation.
You can check separately whether your headers are self-contained. That "static analysis" is independent of any compilation process. For example, run
gcc headerfile.h -fsyntax-only
Testing whether your header files are self-contained can easily be scripted/automated. Even your makefile can do that.
No offense but Lakos' book is from 1996 and putting those different concerns together sounds like 90s-style programming to me. That being said, there are ecosystems (Windows today or in the 90s?) which lack the tools for scripted/automated tests.
2. Readability
Another consideration is readability. When you look up your source file, you just want to easily see what stuff has been included. For that your personal tastes and preferences matter most, though typically you either order them from most specific to least specific or the other way around (I prefer the latter).
Within each group, I usually just include them alphabetically.
3. Does the include order matter?
If your header files are self-contained, then the include order technically shouldn't matter at all for the compilation result.
That is, unless you have (questionable?) specific design choices for your code, such as necessary macro definitions that are not automatically included. In that case, you should reconsider your program design, though it might work perfectly well for you of course.
It is a hard question in the C/C++ world, with so many elements beyond the standard.
I think header file order is not a serious problem as long as it compiles, like squelart said.
My idea is: if there is no conflict of symbols among all those headers, any order is OK, and the header dependency issue can be fixed later by adding #include lines to the flawed .h.
The real hassle arises when some header changes its action (by checking #if conditions) according to what headers are above.
For example, in stddef.h in VS2005, there is:
#ifdef _WIN64
#define offsetof(s,m) (size_t)( (ptrdiff_t)&(((s *)0)->m) )
#else
#define offsetof(s,m) (size_t)&(((s *)0)->m)
#endif
Now the problem: If I have a custom header ("custom.h") that needs to be used with many compilers, including some older ones that don't provide offsetof in their system headers, I should write in my header:
#ifndef offsetof
#define offsetof(s,m) (size_t)&(((s *)0)->m)
#endif
And be sure to tell the user to #include "custom.h" after all system headers; otherwise, the offsetof definition in stddef.h will trigger a macro redefinition error.
We pray not to meet any more of such cases in our career.
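A sketch of the ordering that works, assuming a hypothetical struct Packet: the system header comes first, so the #ifndef fallback only kicks in on a toolchain that really lacks offsetof.

```cpp
#include <cstddef>   // system header first: on conforming toolchains this defines offsetof

// Fallback in the style of the custom header above; skipped when offsetof already exists.
#ifndef offsetof
#define offsetof(s, m) ((std::size_t)&(((s*)0)->m))
#endif

// Hypothetical struct used only to exercise the macro.
struct Packet {
    char tag;
    int length;
};

// Byte offset of Packet::length from the start of the struct.
inline std::size_t length_offset() {
    return offsetof(Packet, length);
}
```

With the two blocks in the opposite order, the fallback would be defined first and the system header's own #define of offsetof could then clash with it, which is the redefinition problem described above.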

Are header file names such as bits/vector.tcc standard compliant?

In my code base, I 'hide' implementation details of heavily templated code in .tcc files inside a bits sub-directory, i.e.
// file inc/foo.h:
#ifndef my_foo_h // include guard
#define my_foo_h
namespace my {
/* ... */ // templated code available for user
}
#include "bits/foo.tcc" // includes implementation details
namespace my {
/* ... */ // more templated code using details from foo.tcc
}
#endif
// file inc/bits/foo.tcc:
#ifndef my_foo_tcc // include guard
#define my_foo_tcc
#ifndef my_foo_h
# error foo.tcc must be #included from foo.h
#endif
namespace my { namespace details {
/* ... */ // details needed in foo.h
} }
#endif
Of course, there must only be one file bits/foo.tcc in the include path. Otherwise, there will be a clash and (hopefully) a compilation error. This just happened to me with bits/vector.tcc, which is included from gcc's (4.8) vector but also from my own header (using #include "bits/vector.tcc" and not #include <bits/vector.tcc>).
My question: is this formally a bug of gcc (since it uses a name bits/vector.tcc which is not protected by the standard) or correct, i.e. even formally my fault? If the latter, what names for header files are guaranteed to be okay to use?
(Note: I don't want to hear obvious advice on how to avoid this.)
Edit The problem is that the header file vector provided by the standard library (shipped by the compiler) has a preprocessor directive #include <bits/vector.tcc> which causes the preprocessor to load my file rather than that provided with the standard library.
Here's what the C++11 standard [cpp.include] has to say about this:
1 A #include directive shall identify a header or source file that can be processed by the implementation.
2 A preprocessing directive of the form
# include < h-char-sequence> new-line
searches a sequence of implementation-defined places for a header identified uniquely by the specified sequence
between the < and > delimiters, and causes the replacement of that directive by the entire contents
of the header. How the places are specified or the header identified is implementation-defined.
3 A preprocessing directive of the form
# include " q-char-sequence" new-line
causes the replacement of that directive by the entire contents of the source file identified by the specified
sequence between the " delimiters. The named source file is searched for in an implementation-defined
manner. If this search is not supported, or if the search fails, the directive is reprocessed as if it read
# include < h-char-sequence> new-line
with the identical contained sequence (including > characters, if any) from the original directive.
In other words, #include < > is intended for searching for headers only. A header is one of the things provided by the standard library. I say "things" because the standard doesn't specify what it is - it doesn't have to be a file at all (although all compilers I know implement headers as files).
#include " " is intended for "everything else" - in terms of the standard, they're all "source files," although in general speech we usually refer to files intended for being #included as "header files." Also note that if no such source file is found, a (standard library) header will be searched for instead.
So, in your case:
The standard doesn't say anything about files like bits/vector.tcc; in fact, it doesn't say anything about any files. All of this falls under the "implementation-defined" heading and is thus up to your compiler and its documentation.
At the same time (thanks to @JamesKanze for pointing this out in the comments), the standard clearly specifies what #include <vector> should do, and never mentions that it could depend on a file's presence or absence. So in this regard, gcc loading your bits/vector.tcc instead of its own is a gcc bug. If gcc loaded its own bits/vector.tcc instead of yours, it would be within its "implementation-defined" scope.
#include "vector" is primarily intended to include a source file named vector. However, if no such file is found, the effect is the same as including the standard header <vector> (which causes class template std::vector to be considered defined).
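A tiny sketch of that fallback, assuming there is no file named vector anywhere on the quote search path (if there were, that file would be picked up instead). count_items is a made-up name.

```cpp
// Quote form: the implementation first looks for a source file named `vector`;
// assuming none exists, the directive is reprocessed as #include <vector>.
#include "vector"
#include <cstddef>

// Uses std::vector to show that the standard header was the one found.
inline std::size_t count_items() {
    const std::vector<int> items{1, 2, 3};
    return items.size();
}
```

This is only meant to illustrate the rule; in real code the angle-bracket form is the clearer way to ask for a standard header.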
The standard is pretty open, but... including <vector> should
work; I don't see anything that authorizes it not to (provided
you've done #include <vector>, and not #include "vector"),
regardless of the names of your personal includes.
More generally, the more or less universal algorithm for
searching for a header is to first search in the directory which
contains the file which does the include. This is done
precisely to avoid the type of problems you have encountered.
Not doing this (or not using some other mechanism to ensure that
includes from standard headers find the file they're supposed
to) is an error in the compiler. A serious one, IMHO. (Of
course, the compiler may document that certain options introduce
certain restrictions, or that you need to use certain options
for it to behave in a standard manner. I don't think that g++
documents -I as being incompatible with the standard headers,
but it does say that if you use -iquote, it shouldn't
influence anything included using <...>.)
EDIT:
The second paragraph above really only applies to the "..."
form of the include. #include <vector> should find the
standard header, even if you have a file vector in the same
directory as the file you are compiling.
In the absence of -I options, this works. Universally,
however, the -I option adds the directory in the search lists
for both types of include. The reason for this is that you,
as a developer, will probably want to treat various third party
libraries (e.g. X-Windows) as if they were part of the system as
well. (I think Linux does put X-Windows as part of the system,
putting its headers in /usr/include, but this wasn't the usual
case in other Unices in the past.) So you use -I to specify
them, as well as your other include directories. And if you
have a file vector in one of your other directories, it will
"override" the system one.
This is clearly a flaw: if I recall correctly (but it's been
some time), g++ at one time did have additional options to put
a directory in the list for only one type of include. And in
modern gcc/g++, there's -iquote (and -I-, which specifies
that all of the earlier -I options are for the "..."
includes only). These features are little used, however,
because gcc/g++ is the only compiler which supported them.
Given all this, the gcc/g++ handling is probably the best you
can hope for. And the error isn't in the compiler itself, but
the library headers, which use <bits/vector.tcc> when it
absolutely wants the include file from the same directory as the
file doing the including. (Another way of saying this is that
bits/vector.tcc isn't a system header, in any sense of the
word, but an implementation header of system library.)
None of which helps the original poster much, unless he feels
like modifying the library headers for g++. (If portability
isn't any issue, and he's not considering his headers as part of
the system, he could change the -I to -iquote.)

Get all include files of a cpp file considering preprocessor defines (fast)

I need a tool (command line, script or source code) that extracts all include files that are included by a source file (recursively) with given preprocessor defines and include paths. I want to know the ones that could be found and the ones that couldn't. The include files that could be found shall be parsed recursively.
I know about the
gcc -M /-MM
cl /P
solution, but this does not work for me. The preprocessor stops as soon as it cannot open a file. But at this time I don't have the correct path for those files, and I just want the preprocessor to skip such a file and tell me that it could not include it.
Also the cinclude2dot.pl from here is not useful, because it seems not to consider given preprocessor defines.
Very useful is the include file hierarchy finder from CodeProject. It considers the preprocessor flags and shows me all include files, even the includes that couldn't be opened. But it is written in MFC, and I would have to reimplement it for gcc, which is not so simple because a lot of WinAPI stuff is used even inside the parser.
Thus, maybe someone knows another solution.
A simple example:
main.cpp
#include <iostream>
#include <string>
#include <boost/foreach.hpp>
#include <SharedClass.h>
#include "MyClass.h"
#ifdef FOO
#include <OptClass.h>
#endif
int main() {}
Right now I start the include extraction like (simplified):
.getAllIncludes main.cpp -I includepath1;includepath2 -PD FOO
and obtain:
cannot open //-> I don't care right now, it's a default header
cannot open // -> here I can extract the info that I need boost
SharedClass.h
SharedDependenyClass.h //a header that is included by SharedClass...
MyClass.h
TestClass.h //a header that is included by the MyClass header...
TestClass2.h //a header that is included by the TestClass header...
OptClass.h
and for
.getAllIncludes main.cpp -I includepath1;includepath2
I'll obtain:
cannot open //-> I don't care right now, it's a default header
cannot open // -> here I can extract the info that I need boost
SharedClass.h
SharedDependenyClass.h //a header that is included by SharedClass...
MyClass.h
TestClass.h //a header that is included by the MyClass header...
TestClass2.h //a header that is included by the TestClass header...
I know that the default headers may also define some values. But in my case I don't need that information, because the project source code doesn't depend on any of those defines. If it did, I would feed my tool with those preprocessor defines...
In the end the tool works quite well. It runs recursively over ALL necessary files, and in the end I have all needed files for the project. Of course there are some small restrictions; I don't want to name them all (e.g. the header of every source file has the same name as the source file, ...).
Using gcc -M <source_file>, the code is not compiled; it is only processed by the preprocessor. And any solution you may find needs to run the source through the preprocessor to be correct. Imagine that the source, somewhere, has the following snippet:
#ifdef USE_BOOST_SUPERLIB
# include <boost/superlib.hpp>
#endif
then without preprocessing you cannot know if <boost/superlib.hpp> is included.
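To make the point concrete, here is a deliberately naive sketch of such a scanner. It is not any existing tool: list_includes and ltrim are hypothetical helpers, and it only understands #ifdef, #ifndef, #else and #endif against a fixed set of defines. It ignores #if expressions, #define/#undef inside the source, comments, directives written without a space (#include<x>), and it does not open the included files to recurse.

```cpp
#include <cstddef>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Strip leading spaces and tabs.
inline std::string ltrim(const std::string& s) {
    const std::size_t pos = s.find_first_not_of(" \t");
    return pos == std::string::npos ? std::string() : s.substr(pos);
}

// Hypothetical helper: returns the names in the active #include directives of
// `source`, treating only the macros listed in `defines` as defined.
inline std::vector<std::string> list_includes(const std::string& source,
                                              const std::set<std::string>& defines) {
    std::vector<std::string> found;
    std::vector<bool> active;              // one entry per open #ifdef/#ifndef
    std::istringstream in(source);
    std::string line;
    while (std::getline(in, line)) {
        std::string text = ltrim(line);
        if (text.empty() || text[0] != '#') continue;
        text = ltrim(text.substr(1));      // allow "#  include"
        std::istringstream words(text);
        std::string directive, arg;
        words >> directive >> arg;

        // A region is live only if every enclosing condition is true.
        bool live = true;
        for (bool a : active) live = live && a;

        if (directive == "ifdef") {
            active.push_back(defines.count(arg) > 0);
        } else if (directive == "ifndef") {
            active.push_back(defines.count(arg) == 0);
        } else if (directive == "else" && !active.empty()) {
            const bool top = active.back();
            active.back() = !top;          // switch to the other branch
        } else if (directive == "endif" && !active.empty()) {
            active.pop_back();
        } else if (directive == "include" && live && arg.size() > 2) {
            found.push_back(arg.substr(1, arg.size() - 2));  // drop <> or ""
        }
    }
    return found;
}
```

Even this toy version has to track conditional state to get the list right, and anything beyond it (macro expansion, #if arithmetic, computed includes) is what a real preprocessor does, which is why answers tend to point back to gcc -M.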

C/C++ include header file order

What order should include files be specified, i.e. what are the reasons for including one header before another?
For example, do the system files, STL, and Boost go before or after the local include files?
I don't think there's a recommended order, as long as it compiles! What's annoying is when some headers require other headers to be included first... That's a problem with the headers themselves, not with the order of includes.
My personal preference is to go from local to global, each subsection in alphabetical order, i.e.:
h file corresponding to this cpp file (if applicable)
headers from the same component,
headers from other components,
system headers.
My rationale for 1. is that it should prove that each header (for which there is a cpp) can be #included without prerequisites (terminus technicus: header is "self-contained"). And the rest just seems to flow logically from there.
The big thing to keep in mind is that your headers should not be dependent upon other headers being included first. One way to insure this is to include your headers before any other headers.
"Thinking in C++" in particular mentions this, referencing Lakos' "Large Scale C++ Software Design":
Latent usage errors can be avoided by ensuring that the .h file of a component parses by itself – without externally-provided declarations or definitions... Including the .h file as the very first line of the .c file ensures that no critical piece of information intrinsic to the physical interface of the component is missing from the .h file (or, if there is, that you will find out about it as soon as you try to compile the .c file).
That is to say, include in the following order:
The prototype/interface header for this implementation (ie, the .h/.hh file that corresponds to this .cpp/.cc file).
Other headers from the same project, as needed.
Headers from other non-standard, non-system libraries (for example, Qt, Eigen, etc).
Headers from other "almost-standard" libraries (for example, Boost)
Standard C++ headers (for example, iostream, functional, etc.)
Standard C headers (for example, cstdint, dirent.h, etc.)
If any of the headers have an issue with being included in this order, either fix them (if yours) or don't use them. Boycott libraries that don't write clean headers.
Google's C++ style guide argues almost the reverse, with really no justification at all; I personally tend to favor the Lakos approach.
I follow two simple rules that avoid the vast majority of problems:
All headers (and indeed any source files) should include what they need. They should not rely on their users including things.
As an adjunct, all headers should have include guards so that they don't get included multiple times by over-ambitious application of rule 1 above.
I also follow the guidelines of:
Include system headers first (stdio.h, etc) with a dividing line.
Group them logically.
In other words:
#include <stdio.h>
#include <string.h>
#include "btree.h"
#include "collect_hash.h"
#include "collect_arraylist.h"
#include "globals.h"
Although, being guidelines, that's a subjective thing. The rules on the other hand, I enforce rigidly, even to the point of providing 'wrapper' header files with include guards and grouped includes if some obnoxious third-party developer doesn't subscribe to my vision :-)
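To make the include-guard rule above concrete, here is a minimal single-file sketch of what the preprocessor effectively sees when a guarded header is pulled in twice (say, once directly and once through another header). The name point.h and the struct Point are hypothetical, chosen only for illustration.

```cpp
// First inclusion of the hypothetical "point.h": POINT_H is not yet defined,
// so the body is compiled and the guard macro is set.
#ifndef POINT_H
#define POINT_H
struct Point {
    int x;
    int y;
};
#endif

// Second inclusion of the same text: POINT_H is already defined, so the
// preprocessor skips the body and Point is not redefined.
#ifndef POINT_H
#define POINT_H
struct Point {
    int x;
    int y;
};
#endif
```

Without the guard, the second copy would be a redefinition of Point and the translation unit would fail to compile.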
To add my own brick to the wall.
Each header needs to be self-sufficient, which can only be tested if it's included first at least once
One should not mistakenly modify the meaning of a third-party header by introducing symbols (macro, types, etc.)
So I usually go like this:
// myproject/src/example.cpp
#include "myproject/example.h"
#include <algorithm>
#include <set>
#include <vector>
#include <3rdparty/foo.h>
#include <3rdparty/bar.h>
#include "myproject/another.h"
#include "myproject/specific/bla.h"
#include "detail/impl.h"
Each group separated by a blank line from the next one:
Header corresponding to this cpp file first (sanity check)
System headers
Third-party headers, organized by dependency order
Project headers
Project private headers
Also note that, apart from system headers, each file is in a folder with the name of its namespace, just because it's easier to track them down this way.
I recommend:
The header for the .cc module you're building. (Helps ensure each header in your project doesn't have implicit dependencies on other headers in your project.)
C system files.
C++ system files.
Platform / OS / other header files (e.g. win32, gtk, openGL).
Other header files from your project.
And of course, alphabetical order within each section, where possible.
Always use forward declarations to avoid unnecessary #includes in your header files.
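As a rough sketch of the forward-declaration advice: a header that only stores or passes pointers and references to a class can declare that class instead of including its header. Engine and Car below are hypothetical names, not taken from any real project.

```cpp
// Would live in a hypothetical "car.h". No #include "engine.h" is needed here,
// because Car only holds a pointer to Engine.
class Engine;  // forward declaration: Engine is an incomplete type in this header

class Car {
public:
    constexpr explicit Car(Engine* engine) : engine_(engine) {}
    constexpr Engine* engine() const { return engine_; }

private:
    Engine* engine_;  // pointers and references to incomplete types are allowed
};

// Declaring a function that takes a reference to an incomplete type is also fine.
void start(Car& car, const Engine& engine);
```

The .cpp file that actually calls Engine's member functions is the one that needs the full #include. A by-value member (Engine engine_;) would still require the complete definition in the header.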
I'm pretty sure this isn't a recommended practice anywhere in the sane world, but I like to line system includes up by filename length, sorted lexically within the same length. Like so:
#include <set>
#include <vector>
#include <algorithm>
#include <functional>
I think it's a good idea to include your own headers before other peoples, to avoid the shame of include-order dependency.
This is not subjective. Make sure your headers don't rely on being #included in specific order. You can be sure it doesn't matter what order you include STL or Boost headers.
First include the header corresponding to the .cpp... in other words, source1.cpp should include source1.h before including anything else. The only exception I can think of is when using MSVC with pre-compiled headers in which case, you are forced to include stdafx.h before anything else.
Reasoning: Including source1.h before any other files ensures that it can stand alone, without relying on dependencies pulled in by other headers. If source1.h takes on a dependency at a later date, the compiler will immediately alert you to add the required #include or forward declaration to source1.h. This in turn ensures that headers can be included in any order by their dependants.
Example:
source1.h
class Class1 {
Class2 c2; // a dependency whose definition has not been included
};
source1.cpp
#include "source1.h" // now compiler will alert you saying that Class2 is undefined
// so you can add the #include for Class2 to source1.h (a forward declaration is enough only for a Class2* or Class2& member)
...
MSVC users: I strongly recommend using pre-compiled headers. So, move all #include directives for standard headers (and other headers which are never going to change) to stdafx.h.
Include from the most specific to the least specific, starting with the corresponding .hpp for the .cpp, if one such exists. That way, any hidden dependencies in header files that are not self-sufficient will be revealed.
This is complicated by the use of pre-compiled headers. One way around this, without making your project compiler-specific, is to use one of the project headers as the precompiled header include file.
Several separate considerations are conflated when deciding on a particular include order. Let me try to untangle them.
1. check for self-containedness
Many answers suggest that the include order should act as a check that your headers are self-contained. That mixes up the concerns of testing and compilation.
You can check separately whether your headers are self-contained. That "static analysis" is independent of any compilation process. For example, run
gcc headerfile.h -fsyntax-only
Testing whether your header files are self-contained can easily be scripted/automated. Even your makefile can do that.
No offense but Lakos' book is from 1996 and putting those different concerns together sounds like 90s-style programming to me. That being said, there are ecosystems (Windows today or in the 90s?) which lack the tools for scripted/automated tests.
2. Readability
Another consideration is readability. When you look up your source file, you just want to easily see what stuff has been included. For that your personal tastes and preferences matter most, though typically you either order them from most specific to least specific or the other way around (I prefer the latter).
Within each group, I usually just include them alphabetically.
3. Does the include order matter?
If your header files are self-contained, then the include order technically shouldn't matter at all for the compilation result.
That is, unless you have (questionable?) specific design choices for your code, such as necessary macro definitions that are not automatically included. In that case, you should reconsider your program design, though it might work perfectly well for you of course.
It is a hard question in the C/C++ world, with so many elements beyond the standard.
I think header file order is not a serious problem as long as it compiles, like squelart said.
My idea is: If there is no conflict of symbols in all those headers, any order is OK, and the header dependency issue can be fixed later by adding #include lines to the flawed .h.
The real hassle arises when some header changes its behavior (by checking #if conditions) according to what headers are above.
For example, in stddef.h in VS2005, there is:
#ifdef _WIN64
#define offsetof(s,m) (size_t)( (ptrdiff_t)&(((s *)0)->m) )
#else
#define offsetof(s,m) (size_t)&(((s *)0)->m)
#endif
Now the problem: If I have a custom header ("custom.h") that needs to be used with many compilers, including some older ones that don't provide offsetof in their system headers, I should write in my header:
#ifndef offsetof
#define offsetof(s,m) (size_t)&(((s *)0)->m)
#endif
And be sure to tell the user to #include "custom.h" after all system headers; otherwise, the offsetof definition in stddef.h will trigger a macro redefinition error.
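A minimal sketch of the safe ordering described above, flattened into one file: the system header comes first, and the fallback that would live in the hypothetical custom.h is only defined if the macro is still missing. The struct Packet is invented for illustration.

```cpp
#include <cstddef>  // system header first: provides offsetof on conforming compilers

// --- contents of the hypothetical "custom.h" ---
// Fallback for old compilers only; skipped when the system header already
// defined the macro, so no redefinition can occur in this order.
#ifndef offsetof
#define offsetof(s, m) (size_t)&(((s *)0)->m)
#endif
// --- end of "custom.h" ---

struct Packet {
    char tag;
    int payload;
};

// Whichever definition is active, the offset of the second member is available.
constexpr std::size_t payload_offset = offsetof(Packet, payload);
```

If the two parts were swapped, the system header would meet an existing, differently spelled offsetof macro, which is the redefinition problem the answer warns about.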
We pray not to meet any more of such cases in our career.

C++ #include semantics

This is a multi-part question about the same preprocessing directive.
1 - <> or "" ?
Apart from the info found in the MSDN:
#include Directive (C-C++)
1.a: What are the differences between the two notations?
1.b: Do all compilers implement them the same way?
1.c: When would you use the <>, and when would you use the "" (i.e. what are the criteria you would use to use one or the other for a header include)?
2 - #include {TheProject/TheHeader.hpp} or {TheHeader.hpp} ?
I've seen at least two ways of writing includes of one's project headers.
Considering that you have at least 4 types of headers, that is:
private headers of your project?
headers of your project, but which are exporting symbols (and thus, "public")
headers of another project your module links with
headers of a compiler or standard library
For each kind of headers:
2.a: Would you use <> or "" ?
2.b: Would you include with {TheProject/TheHeader.hpp}, or with {TheHeader.hpp} only?
3 - Bonus
3.a: Do you work on project with sources and/or headers within a tree-like organisation (i.e., directories inside directories, as opposed to "every file in one directory") and what are the pros/cons?
After reading all answers, as well as compiler documentation, I decided I would follow the following standard.
For all files, be them project headers or external headers, always use the pattern:
#include <namespace/header.hpp>
The namespace being at least one directory deep, to avoid collision.
Of course, this means that the project directory where the project headers are should be added as "default include header" to the makefile, too.
The reason for this choice is that I found the following information:
1. The include "" pattern is compiler-dependent
I'll give the answers below
1.a The Standard
Source:
C++14 Working Draft n3797 : https://isocpp.org/files/papers/N3797.pdf
C++11, C++98, C99, C89 (the section quoted is unchanged in all those standards)
In the section 16.2 Source file inclusion, we can read that:
A preprocessing directive of the form
#include <h-char-sequence> new-line
searches a sequence of implementation-defined places for a header identified uniquely by the specified sequence between the < and > delimiters, and causes the replacement of that directive by the entire contents of the header. How the places are specified or the header identified is implementation-defined.
This means that #include <...> will search a file in an implementation defined manner.
Then, the next paragraph:
A preprocessing directive of the form
#include "q-char-sequence" new-line
causes the replacement of that directive by the entire contents of the source file identified by the specified sequence between the " delimiters. The named source file is searched for in an implementation-defined manner. If this search is not supported, or if the search fails, the directive is reprocessed as if it read
#include <h-char-sequence> new-line
with the identical contained sequence (including > characters, if any) from the original directive.
This means that #include "..." will search a file in an implementation defined manner and then, if the file is not found, will do another search as if it had been an #include <...>
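The fallback from the quoted form to the angle-bracket form can be observed with the C++17 __has_include preprocessor operator. This is only a sketch, assuming a compiler that provides __has_include and that no local file named cstddef shadows the standard header. The constant names are made up for this example.

```cpp
#ifdef __has_include
// Angle-bracket form: searched in the implementation-defined header locations.
#  if __has_include(<cstddef>)
constexpr bool angled_finds_cstddef = true;
#  else
constexpr bool angled_finds_cstddef = false;
#  endif
// Quoted form: searched as a source file first, then reprocessed as if it had
// been written with angle brackets, so the standard header is still found.
#  if __has_include("cstddef")
constexpr bool quoted_finds_cstddef = true;
#  else
constexpr bool quoted_finds_cstddef = false;
#  endif
#else
#  error "this sketch assumes a compiler that provides __has_include (C++17)"
#endif
```

Both constants are expected to be true on a typical toolchain, which matches the reading that "..." searches a superset of the places searched by <...>.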
The conclusion is that we have to read the compilers documentation.
Note that, for some reason, nowhere in the standards is a distinction made between "system" or "library" headers and other headers. The only difference seems to be that #include <...> targets headers, while #include "..." targets source files (at least, in the English wording).
1.b Visual C++:
Source:
http://msdn.microsoft.com/en-us/library/36k2cdd4.aspx
#include "MyFile.hpp"
The preprocessor searches for include files in the following order:
In the same directory as the file that contains the #include statement.
In the directories of any previously opened include files in the reverse order in which they were opened. The search starts from the directory of the include file that was opened last and continues through the directory of the include file that was opened first.
Along the path specified by each /I compiler option.
(*) Along the paths specified by the INCLUDE environment variable or the development environment default includes.
#include <MyFile.hpp>
The preprocessor searches for include files in the following order:
Along the path specified by each /I compiler option.
(*) Along the paths specified by the INCLUDE environment variable or the development environment default includes.
Note about the last step
The document is not clear about the "Along the paths specified by the INCLUDE environment variable" part for both <...> and "..." includes. The following quote makes it stick with the standard:
For include files that are specified as #include "path-spec", directory searching begins with the directory of the parent file and then proceeds through the directories of any grandparent files. That is, searching begins relative to the directory that contains the source file that contains the #include directive that's being processed. If there is no grandparent file and the file has not been found, the search continues as if the file name were enclosed in angle brackets.
The last step (marked by an asterisk) is thus an interpretation from reading the whole document.
1.c g++
Source:
https://gcc.gnu.org/onlinedocs/cpp/Header-Files.html
https://gcc.gnu.org/onlinedocs/cpp/Include-Syntax.html
https://gcc.gnu.org/onlinedocs/cpp/Include-Operation.html
https://gcc.gnu.org/onlinedocs/cpp/Invocation.html
https://gcc.gnu.org/onlinedocs/cpp/Search-Path.html
https://gcc.gnu.org/onlinedocs/cpp/Once-Only-Headers.html
https://gcc.gnu.org/onlinedocs/cpp/Wrapper-Headers.html
https://gcc.gnu.org/onlinedocs/cpp/System-Headers.html
The following quote summarizes the process:
GCC [...] will look for headers requested with #include <file> in [system directories] [...] All the directories named by -I are searched, in left-to-right order, before the default directories
GCC looks for headers requested with #include "file" first in the directory containing the current file, then in the directories as specified by -iquote options, then in the same places it would have looked for a header requested with angle brackets.
#include "MyFile.hpp"
This variant is used for header files of your own program. The preprocessor searches for include files in the following order:
In the same directory as the file that contains the #include statement.
Along the path specified by each -iquote compiler option.
As for the #include <MyFile.hpp>
#include <MyFile.hpp>
This variant is used for system header files. The preprocessor searches for include files in the following order:
Along the path specified by each -I compiler option.
Inside the system directories.
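The g++ lookup order quoted above can be summarised as a toy model. This is not GCC's implementation, only a sketch of the documented order. SearchConfig, IncludeForm and search_order are hypothetical names introduced here.

```cpp
#include <string>
#include <vector>

enum class IncludeForm { Quoted, Angled };

// Hypothetical description of one compiler invocation.
struct SearchConfig {
    std::string current_file_dir;          // directory of the including file
    std::vector<std::string> iquote_dirs;  // directories given with -iquote
    std::vector<std::string> I_dirs;       // directories given with -I
    std::vector<std::string> system_dirs;  // default system directories
};

// Returns the directories in the order they would be tried, following the
// GCC documentation quoted above.
std::vector<std::string> search_order(const SearchConfig& cfg, IncludeForm form) {
    std::vector<std::string> order;
    if (form == IncludeForm::Quoted) {
        // Only the "..." form looks next to the current file and in -iquote dirs.
        order.push_back(cfg.current_file_dir);
        order.insert(order.end(), cfg.iquote_dirs.begin(), cfg.iquote_dirs.end());
    }
    // From here on both forms share the same path: -I dirs, then system dirs.
    order.insert(order.end(), cfg.I_dirs.begin(), cfg.I_dirs.end());
    order.insert(order.end(), cfg.system_dirs.begin(), cfg.system_dirs.end());
    return order;
}
```

In this model the quoted form always yields the angle-bracket list with extra directories in front, which is why a same-named local file can silently win over a library header.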
1.d Oracle/Sun Studio CC
Source:
http://docs.oracle.com/cd/E19205-01/819-5265/bjadq/index.html
Note that the text contradicts itself somewhat (see the example to understand). The key phrase is: "The difference is that the current directory is searched only for header files whose names you have enclosed in quotation marks."
#include "MyFile.hpp"
This variant is used for header files of your own program. The preprocessor searches for include files in the following order:
The current directory (that is, the directory containing the “including” file)
The directories named with -I options, if any
The system directory (e.g. the /usr/include directory)
#include <MyFile.hpp>
This variant is used for system header files. The preprocessor searches for include files in the following order:
The directories named with -I options, if any
The system directory (e.g. the /usr/include directory)
1.e XL C/C++ Compiler Reference - IBM/AIX
Source:
http://www.bluefern.canterbury.ac.nz/ucsc%20userdocs/forucscwebsite/c/aix/compiler.pdf
http://www-01.ibm.com/support/docview.wss?uid=swg27024204&aid=1
Both documents are titled "XL C/C++ Compiler Reference". The first document is older (8.0), but is easier to understand. The second is newer (12.1), but is a bit more difficult to decipher.
#include "MyFile.hpp"
This variant is used for header files of your own program. The preprocessor searches for include files in the following order:
The current directory (that is, the directory containing the “including” file)
The directories named with -I options, if any
The system directory (e.g. the /usr/vac[cpp]/include or /usr/include directories)
#include <MyFile.hpp>
This variant is used for system header files. The preprocessor searches for include files in the following order:
The directories named with -I options, if any
The system directory (e.g. the /usr/vac[cpp]/include or /usr/include directory)
1.f Conclusion
The "" pattern could lead to subtle compilation errors across compilers, and as I currently work with Visual C++ on Windows, g++ on Linux, Oracle/Solaris CC and AIX XL, this is not acceptable.
In any case, the extra features that "" offers are far from interesting, so...
2. Use the {namespace}/header.hpp pattern
I saw at work (i.e. this is not theory, this is real-life, painful professional experience) two headers with the same name, one in the local project directory, and the other in the global include.
As we were using the "" pattern, and that file was included both in local headers and global headers, there was no way to understand what was really going on, when strange errors appeared.
Using the directory in the include would have saved us time because the user would have had to either write:
#include <MyLocalProject/Header.hpp>
or
#include <GlobalInclude/Header.hpp>
You'll note that
#include "Header.hpp"
would have compiled successfully, thus still hiding the problem, whereas
#include <Header.hpp>
would not have compiled in normal circumstances.
Thus, sticking to the <> notation would have forced the developer to prefix the include with the right directory, which is another reason to prefer <> to "".
3. Conclusion
Using the <> notation and the namespaced notation together removes the preprocessor's ability to guess where files are; it searches only the default include directories instead.
Of course, the standard libraries are still included as usual, that is:
#include <cstdlib>
#include <vector>
I typically use <> for system headers and "" for project headers. As for paths, that is only necessary if the file you want is in a sub-directory of an include path.
for example, if you need a file in /usr/include/SDL/, but only /usr/include/ is in your include path, then you could just use:
#include <SDL/whatever.h>
Also, keep in mind that unless the path you put starts with a /, it is resolved relative to the directories in the include search path (and, for the "" form, usually the directory of the including file first), not relative to the current working directory.
EDIT TO ANSWER COMMENT: It depends. If there are only a few includes for a library, I'd just include its subdirectory in the include path, but if the library has many headers (like dozens) then I'd prefer to just have it in a subdir that I specify. A good example of this is Linux's system headers. You use them like:
#include <sys/io.h>
#include <linux/limits.h>
etc.
EDIT TO INCLUDE ANOTHER GOOD ANSWER: also, if it is conceivable that two or more libraries provide headers by the same name, then the sub-directory solution basically gives each header a namespace.
To quote from the C99 standard (at a glance the wording appears to be identical in the C90 standard, but I can't cut-n-paste from that):
A preprocessing directive of the form
# include "q-char-sequence" new-line
causes the replacement of that
directive by the entire contents of
the source file identified by the
specified sequence between the "
delimiters. The named source file is
searched for in an
implementation-defined manner. If this
search is not supported, or if the
search fails, the directive is
reprocessed as if it read
# include <h-char-sequence> new-line
with the identical contained sequence
(including > characters, if any) from
the original directive.
So the locations searched by #include "whatever" are a superset of the locations searched by #include <whatever>. The intent is that the first style would be used for headers that in general "belong" to you, and the second method would be used for headers that "belong" to the compiler/environment. Of course, there are often some grey areas - which should you use for Boost headers, for example? I'd use #include <>, but I wouldn't argue too much if someone else on my team wanted #include "".
In practice, I don't think anyone pays much attention to which form is used as long as the build doesn't break. I certainly don't recall it ever being mentioned in a code review (or otherwise, even).
I'll tackle the second part of your question:
I normally use <project/libHeader.h> when I am including headers from a 3rd party. And "myHeader.h" when including headers from within the project.
The reason I use <project/libHeader.h> instead of <libHeader.h> is because it's possible that more than one library has a "libHeader.h" file. In order to include them both you need the library name as part of the included filename.
1.a: What are the differences between the two notations?
"" starts search in the directory where C/C++ file is located. <> starts search in -I directories and in default locations (such as /usr/include). Both of them ultimately search the same set of locations, only the order is different.
1.b: Do all compilers implement them the same way?
I hope so, but I am not sure.
1.c: When would you use the <>, and when would you use the "" (i.e. what are the criteria you would use to use one or the other for a header include)?
I use "" when the include file is supposed to be next to the C file, <> in all other cases. In particular, in our project all "public" include files are in the project/include directory, so I use <> for them.
2 - #include {TheProject/TheHeader.hpp} or {TheHeader.hpp} ?
As already pointed out, xxx/filename.h allows you to do things like diskio/ErrorCodes.h and netio/ErrorCodes.h
* private headers of your project?
Private header of my subsystem in project. Use "filename.h"
Public header of my subsystem in project (not visible outside the project, but accessible to other subsystems). Use or , depending on the convention adapted for the project. I'd rather use
* headers of your project, but which are exporting symbols (and thus, "public")
include exactly like the users of your library would include them. Probably
* headers of another project your module links with
Determined by the project, but certainly using <>
* headers of a compiler or standard library
Definitely <>, according to standard.
3.a: Do you work on project with sources and/or headers within a tree-like organisation (i.e., directories inside directories, as opposed to "every file in one directory") and what are the pros/cons?
I do work on a structured project. As soon as you have more than a score of files, some division will become apparent. You should go the way the code is pulling you.
If I remember right.
You use the diamond for all libraries that can be found in your "path". So any library that is in the STL, or ones that you have installed. In Linux your path is usually "/usr/include", in windows I am not sure, but I would guess it is under "C:\windows".
You use the "" then to specify everything else. "my_bla.cpp" with no starting directory information will resolve to the directory your code is residing/compiling in. or you can also specify the exact location of your include with it. Like this "c:\myproj\some_code.cpp"
The type of header doesn't matter, just the location.
Re <> vs "". At my shop, I'm very hands-off as far as matters of "style" are concerned. One of the few areas where I have a requirement is with the use of angle brackets in #include statements -- the rule is this: if you are #including an operating system or compiler file, you may use angle brackets if appropriate. In all other cases, they are forbidden. If you are #including a file written either by someone here or a 3rd party library, <> is forbidden.
The reason is this: #include "x.h" and #include <x.h> don't search the same paths. #include <x.h> will only search the system path and whatever you have -i'ed in. Importantly, it will not search the path where the file x.h is located, if that directory isn't included in the search path in some other way.
For example, suppose you have the following files:
c:\dev\angles\main.cpp
#include "c:\utils\mylib\mylibrary.h"
int main()
{
return 0;
}
c:\utils\mylib\mylibrary.h
#ifndef MYLIB_H
#define MYLIB_H
#include <speech.h>
namespace mylib
{
void Speak(SpeechType speechType);
};
#endif
c:\utils\mylib\speech.h
#ifndef SPEECH_H
#define SPEECH_H
namespace mylib
{
enum SpeechType {Bark, Growl};
};
#endif
This will not compile without changing the path either by setting the PATH environment variable or -i'ing in the c:\utils\mylib\ directory. The compiler won't be able to resolve #include <speech.h> even though that file is in the same directory as mylibrary.h!
We make extensive use of relative and absolute pathnames in #include statements in our code, for two reasons.
1) By keeping libraries and components off the main source tree (ie, putting utility libraries in a special directory), we don't couple the lifecycle of the library to the lifecycle of the application. This is particularly important when you have several distinct products that use common libraries.
2) We use Junctions to map a physical location on the hard drive to a directory on a logical drive, and then use a fully-qualified path on the logical drive in all #includes. For example:
#include "x:\utils\mylib.h" -- good, x: is a subst'ed drive, and x:\utils points to c:\code\utils_1.0 on the hard drive
#include "c:\utils_1.0\mylib.h" -- bad! the application that #includes mylib.h is now coupled to a specific version of the MYLIB library, and all developers must have it in the same directory on their hard drive, c:\utils_1.0
Finally, a broad but difficult to achieve goal of my team is to be able to support 1-click compiles. This includes being able to compile the main source tree by doing nothing more than getting code from source control and then hitting 'compile'. In particular, I abhor having to set paths & machine-wide #include directories in order to be able to compile, because every little additional step you add to the set-up phase in building a development machine just makes it harder, easier to mess up, and it takes longer to get a new machine up to speed & generating code.
There are two primary differences between <> and "". The first is which character will end the name - there are no escape sequences in header names, and so you may be forced to do #include <bla"file.cpp> or "bla>file.cpp". That probably won't come up often, though. The other difference is that system includes aren't supposed to occur on "", just <>. So #include "iostream" is not guaranteed to work; #include <iostream> is. My personal preference is to use "" for files that are part of the project, and <> for files that aren't. Some people only use <> for standard library headers and "" for all else. Some people even use <> only for Boost and std; it depends on the project. Like all style aspects, the most important thing is to be consistent.
As for the path, an external library will specify the convention for headers; e.g. <boost/preprocessor.hpp> <wx/window.h> <Magic++.h>. In a local project, I would write all paths relative to the top-level srcdir (or in a library project where they are different, the include directory).
When writing a library, you may also find it helpful to use <> to differentiate between private and public headers, or to not -I the source directory, but the directory above, so you #include "public_header.hpp" and "src/private_header.hpp". It's really up to you.
EDIT: As for projects with directory structures, I would highly recommend them. Imagine if all of boost were in one directory (and no subnamespaces)! Directory structure is good because it lets you find files easier, and it allows you more flexibility in naming ("module_text_processor.hpp" as opposed to "module/text_processor.hpp"). The latter is more natural and easier to use.
I use <...> from system header file (stdio, iostreams, string etc), and "..." for headers specific to that project.
We use #include "header.h" for headers local to the project and #include <header.h> for system includes, third party includes, and other projects in the solution. We use Visual Studio, and it's much easier to use the project directory in a header include; this way, whenever we create a new project, we only have to specify the include path for the directory containing all the project directories, not a separate path for each project.