Two common questions about include guards:
FIRST QUESTION:
Why aren't include guards protecting my header files from mutual, recursive inclusion? I keep getting errors about non-existing symbols which are obviously there or even weirder syntax errors every time I write something like the following:
"a.h"
#ifndef A_H
#define A_H
#include "b.h"
...
#endif // A_H
"b.h"
#ifndef B_H
#define B_H
#include "a.h"
...
#endif // B_H
"main.cpp"
#include "a.h"
int main()
{
...
}
Why do I get errors compiling "main.cpp"? What do I need to do to solve my problem?
SECOND QUESTION:
Why aren't include guards preventing multiple definitions? For instance, when my project contains two files that include the same header, sometimes the linker complains about some symbol being defined multiple times. For instance:
"header.h"
#ifndef HEADER_H
#define HEADER_H
int f()
{
return 0;
}
#endif // HEADER_H
"source1.cpp"
#include "header.h"
...
"source2.cpp"
#include "header.h"
...
Why is this happening? What do I need to do to solve my problem?
FIRST QUESTION:
Why aren't include guards protecting my header files from mutual, recursive inclusion?
They are.
What they are not helping with is dependencies between the definitions of data structures in mutually-including headers. To see what this means, let's start with a basic scenario and see why include guards do help with mutual inclusions.
Suppose your mutually including a.h and b.h header files have trivial content, i.e. the ellipses in the code sections from the question's text are replaced with the empty string. In this situation, your main.cpp will happily compile. And this is only thanks to your include guards!
If you're not convinced, try removing them:
//================================================
// a.h
#include "b.h"
//================================================
// b.h
#include "a.h"
//================================================
// main.cpp
//
// Good luck getting this to compile...
#include "a.h"
int main()
{
...
}
You'll notice that the compiler will report a failure when it reaches the inclusion depth limit. This limit is implementation-specific. Per Paragraph 16.2/6 of the C++11 Standard:
A #include preprocessing directive may appear in a source file that has been read because of a #include directive in another file, up to an implementation-defined nesting limit.
So what's going on?
When parsing main.cpp, the preprocessor will meet the directive #include "a.h". This directive tells the preprocessor to process the header file a.h, take the result of that processing, and replace the string #include "a.h" with that result;
While processing a.h, the preprocessor will meet the directive #include "b.h", and the same mechanism applies: the preprocessor shall process the header file b.h, take the result of its processing, and replace the #include directive with that result;
When processing b.h, the directive #include "a.h" will tell the preprocessor to process a.h and replace that directive with the result;
The preprocessor will start parsing a.h again, will meet the #include "b.h" directive again, and this will set up a potentially infinite recursive process. When reaching the critical nesting level, the compiler will report an error.
When include guards are present, however, no infinite recursion will be set up in step 4. Let's see why:
(same as before) When parsing main.cpp, the preprocessor will meet the directive #include "a.h". This tells the preprocessor to process the header file a.h, take the result of that processing, and replace the string #include "a.h" with that result;
While processing a.h, the preprocessor will meet the directive #ifndef A_H. Since the macro A_H has not yet been defined, it will keep processing the following text. The subsequent directive (#defines A_H) defines the macro A_H. Then, the preprocessor will meet the directive #include "b.h": the preprocessor shall now process the header file b.h, take the result of its processing, and replace the #include directive with that result;
When processing b.h, the preprocessor will meet the directive #ifndef B_H. Since the macro B_H has not yet been defined, it will keep processing the following text. The subsequent directive (#defines B_H) defines the macro B_H. Then, the directive #include "a.h" will tell the preprocessor to process a.h and replace the #include directive in b.h with the result of preprocessing a.h;
The compiler will start preprocessing a.h again, and meet the #ifndef A_H directive again. However, during previous preprocessing, macro A_H has been defined. Therefore, the compiler will skip the following text this time until the matching #endif directive is found, and the output of this processing is the empty string (supposing nothing follows the #endif directive, of course). The preprocessor will therefore replace the #include "a.h" directive in b.h with the empty string, and will trace back the execution until it replaces the original #include directive in main.cpp.
Thus, include guards do protect against mutual inclusion. However, they can't help with dependencies between the definitions of your classes in mutually-including files:
//================================================
// a.h
#ifndef A_H
#define A_H
#include "b.h"
struct A
{
};
#endif // A_H
//================================================
// b.h
#ifndef B_H
#define B_H
#include "a.h"
struct B
{
A* pA;
};
#endif // B_H
//================================================
// main.cpp
//
// Good luck getting this to compile...
#include "a.h"
int main()
{
...
}
Given the above headers, main.cpp will not compile.
Why is this happening?
To see what's going on, it is enough to go through steps 1-4 again.
It is easy to see that the first three steps and most of the fourth step are unaffected by this change (just read through them to get convinced). However, something different happens at the end of step 4: after replacing the #include "a.h" directive in b.h with the empty string, the preprocessor will start parsing the content of b.h and, in particular, the definition of B. Unfortunately, the definition of B mentions class A, which has never been met before exactly because of the inclusion guards!
Declaring a member variable of a type which has not been previously declared is, of course, an error, and the compiler will politely point that out.
What do I need to do to solve my problem?
You need forward declarations.
In fact, the definition of class A is not required in order to define class B, because a pointer to A is being declared as a member variable, and not an object of type A. Since pointers have fixed size, the compiler won't need to know the exact layout of A nor to compute its size in order to properly define class B. Hence, it is enough to forward-declare class A in b.h and make the compiler aware of its existence:
//================================================
// b.h
#ifndef B_H
#define B_H
// Forward declaration of A: no need to #include "a.h"
struct A;
struct B
{
A* pA;
};
#endif // B_H
Your main.cpp will now certainly compile. A couple of remarks:
Not only breaking the mutual inclusion by replacing the #include directive with a forward declaration in b.h was enough to effectively express the dependency of B on A: using forward declarations whenever possible/practical is also considered to be a good programming practice, because it helps avoiding unnecessary inclusions, thus reducing the overall compilation time. However, after eliminating the mutual inclusion, main.cpp will have to be modified to #include both a.h and b.h (if the latter is needed at all), because b.h is no more indirectly #included through a.h;
While a forward declaration of class A is enough for the compiler to declare pointers to that class (or to use it in any other context where incomplete types are acceptable), dereferencing pointers to A (for instance to invoke a member function) or computing its size are illegal operations on incomplete types: if that is needed, the full definition of A needs to be available to the compiler, which means the header file that defines it must be included. This is why class definitions and the implementation of their member functions are usually split into a header file and an implementation file for that class (class templates are an exception to this rule): implementation files, which are never #included by other files in the project, can safely #include all the necessary headers to make definitions visible. Header files, on the other hand, won't #include other header files unless they really need to do so (for instance, to make the definition of a base class visible), and will use forward-declarations whenever possible/practical.
SECOND QUESTION:
Why aren't include guards preventing multiple definitions?
They are.
What they are not protecting you from is multiple definitions in separate translation units. This is also explained in this Q&A on StackOverflow.
Too see that, try removing the include guards and compiling the following, modified version of source1.cpp (or source2.cpp, for what it matters):
//================================================
// source1.cpp
//
// Good luck getting this to compile...
#include "header.h"
#include "header.h"
int main()
{
...
}
The compiler will certainly complain here about f() being redefined. That's obvious: its definition is being included twice! However, the above source1.cpp will compile without problems when header.h contains the proper include guards. That's expected.
Still, even when the include guards are present and the compiler will stop bothering you with error message, the linker will insist on the fact that multiple definitions being found when merging the object code obtained from the compilation of source1.cpp and source2.cpp, and will refuse to generate your executable.
Why is this happening?
Basically, each .cpp file (the technical term in this context is translation unit) in your project is compiled separately and independently. When parsing a .cpp file, the preprocessor will process all the #include directives and expand all macro invocations it encounters, and the output of this pure text processing will be given in input to the compiler for translating it into object code. Once the compiler is done with producing the object code for one translation unit, it will proceed with the next one, and all the macro definitions that have been encountered while processing the previous translation unit will be forgotten.
In fact, compiling a project with n translation units (.cpp files) is like executing the same program (the compiler) n times, each time with a different input: different executions of the same program won't share the state of the previous program execution(s). Thus, each translation is performed independently and the preprocessor symbols encountered while compiling one translation unit will not be remembered when compiling other translation units (if you think about it for a moment, you will easily realize that this is actually a desirable behavior).
Therefore, even though include guards help you preventing recursive mutual inclusions and redundant inclusions of the same header in one translation unit, they can't detect whether the same definition is included in different translation unit.
Yet, when merging the object code generated from the compilation of all the .cpp files of your project, the linker will see that the same symbol is defined more than once, and since this violates the One Definition Rule. Per Paragraph 3.2/3 of the C++11 Standard:
Every program shall contain exactly one definition of every non-inline function or variable that is odr-used in that program; no diagnostic required. The definition can appear explicitly in the program, it can be found in the standard or a user-defined library, or (when appropriate) it is implicitly defined (see 12.1, 12.4 and 12.8). An inline function shall be defined in every translation unit in which it is odr-used.
Hence, the linker will emit an error and refuse to generate the executable of your program.
What do I need to do to solve my problem?
If you want to keep your function definition in a header file that is #included by multiple translation units (notice, that no problem will arise if your header is #included just by one translation unit), you need to use the inline keyword.
Otherwise, you need to keep only the declaration of your function in header.h, putting its definition (body) into one separate .cpp file only (this is the classical approach).
The inline keyword represents a non-binding request to the compiler to inline the function's body directly at the call site, rather than setting up a stack frame for a regular function call. Although the compiler doesn't have to fulfill your request, the inline keyword does succeed in telling the linker to tolerate multiple symbol definitions. According to Paragraph 3.2/5 of the C++11 Standard:
There can be more than one definition of a class type (Clause 9), enumeration type (7.2), inline function with external linkage (7.1.2), class template (Clause 14), non-static function template (14.5.6), static data member of a class template (14.5.1.3), member function of a class template (14.5.1.1), or template specialization for which some template parameters are not specified (14.7, 14.5.5) in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements [...]
The above Paragraph basically lists all the definitions which are commonly put in header files, because they can be safely included in multiple translation units. All other definitions with external linkage, instead, belong in source files.
Using the static keyword instead of the inline keyword also results in suppressing linker errors by giving your function internal linkage, thus making each translation unit hold a private copy of that function (and of its local static variables). However, this eventually results in a larger executable, and the use of inline should be preferred in general.
An alternative way of achieving the same result as with the static keyword is to put function f() in an unnamed namespace. Per Paragraph 3.5/4 of the C++11 Standard:
An unnamed namespace or a namespace declared directly or indirectly within an unnamed namespace has internal linkage. All other namespaces have external linkage. A name having namespace scope that has not been given internal linkage above has the same linkage as the enclosing namespace if it is the name of:
— a variable; or
— a function; or
— a named class (Clause 9), or an unnamed class defined in a typedef declaration in which the class has the typedef name for linkage purposes (7.1.3); or
— a named enumeration (7.2), or an unnamed enumeration defined in a typedef declaration in which the enumeration has the typedef name for linkage purposes (7.1.3); or
— an enumerator belonging to an enumeration with linkage; or
— a template.
For the same reason mentioned above, the inline keyword should be preferred.
fiorentinoing's answer is echoed in Git 2.24 (Q4 2019), where a similar code cleanup is taking place in the Git codebase.
See commit 2fe4439 (03 Oct 2019) by René Scharfe (rscharfe).
(Merged by Junio C Hamano -- gitster -- in commit a4c5d9f, 11 Oct 2019)
treewide: remove duplicate #include directives
Found with:
git grep '^#include ' '*.c' | sort | uniq -d
First of all you should be 100% sure that you have no duplicates in "include guards".
With this command
grep -rh "#ifndef" * 2>&1 | uniq -c | sort -rn | awk '{print $1 " " $3}' | grep -v "^1\ "
you will 1) highlight all include guards, get unique row with counter per include name, sort the results, print only counter and include name and remove the ones that are really unique.
HINT: this is equivalent to get the list of duplicated include names
Related
I have a c++ header file containing a class.
I want to use this class in several projects, bu I don't want to create a separate library for it, so I'm putting both methods declarations and definitions in the header file:
// example.h
#ifndef EXAMPLE_H_
#define EXAMPLE_H_
namespace test_ns{
class TestClass{
public:
void testMethod();
};
void TestClass::testMethod(){
// some code here...
}
} // end namespace test_ns
#endif
If inside the same project I include this header from more than one cpp file, I get an error saying "multiple definition of test_ns::TestClass::testMethod()", while if I put the method definition inside the class body this does not happen:
// example.h
#ifndef EXAMPLE_H_
#define EXAMPLE_H_
namespace test_ns{
class TestClass{
public:
void testMethod(){
// some code here...
}
};
} // end namespace test_ns
#endif
Since the class is defined inside a namespace, shouldn't the two forms be equivalent? Why is the method considered to be defined twice in the first case?
Inside the class body is considered to be inline by the compiler.
If you implement outside of body, but still in header, you have to mark the method as 'inline' explicitly.
namespace test_ns{
class TestClass{
public:
inline void testMethod();
};
void TestClass::testMethod(){
// some code here...
}
} // end namespace test_ns
Edit
For myself it often helps to solve these kinds of compile problems by realizing that the compiler does not see anything like a header file. Header files are preprocessed and the compiler just sees one huge file containing every line from every (recursively) included file. Normally the starting point for these recursive includes is a cpp source file that is being compiled.
In our company, even a modest looking cpp file can be presented to the compiler as a 300000 line monster.
So when a method, that is not declared inline, is implemented in a header file, the compiler could end up seeing void TestClass::testMethod() {...} dozens of times in the preprocessed file. Now you can see that this does not make sense, same effect as you'd get when copy/pasting it multiple times in one source file.
And even if you succeeded by only having it once in every compilation unit, by some form of conditional compilation ( e.g. using inclusion brackets ) the linker would still find this method's symbol to be in multiple compiled units ( object files ).
These are not equivalent. The second example given has an implicit 'inline' modifier on the method and so the compiler will reconcile multiple definitions itself (most likely with internal linkage of the method if it isn't inlineable).
The first example isn't inline and so if this header is included in multiple translation units then you will have multiple definitions and linker errors.
Also, headers should really always be guarded to prevent multiple definition errors in the same translation unit. That should convert your header to:
#ifndef EXAMPLE_H
#define EXAMPLE_H
//define your class here
#endif
Don't put a function/method definition in an header file unless they are inlined (by defining them directly in a class declaration or explicity specified by the inline keyword)
header files are (mostly) for declaration (whatever you need to declare). Definitions allowed are the ones for constants and inlined functions/methods (and templates too).
Actually it is possible to have definitions in a single header file (without a separate .c/.cpp file) and still be able to use it from multiple source files.
Consider this foobar.h header:
#ifndef FOOBAR_H
#define FOOBAR_H
/* write declarations normally */
void foo();
void bar();
/* use conditional compilation to disable definitions when necessary */
#ifndef ONLY_DECLARATIONS
void foo() {
/* your code goes here */
}
void bar() {
/* your code goes here */
}
#endif /* ONLY_DECLARATIONS */
#endif /* FOOBAR_H */
If you use this header in only one source file, include and use it normally.
Like in main.c:
#include "foobar.h"
int main(int argc, char *argv[]) {
foo();
}
If there're other source files in your project which require foobar.h, then #define ONLY_DECLARATIONS macro before including it.
In use_bar.c you may write:
#define ONLY_DECLARATIONS
#include "foobar.h"
void use_bar() {
bar();
}
After compilation use_bar.o and main.o can be linked together without errors, because only one of them (main.o) will have implementation of foo() and bar().
That's slightly non-idiomatic, but it allows to keep definitions and declarations together in one file. I feel like it's a poor man's substitute for true modules.
Your first code snippet is falling foul of C++'s "One Definition Rule" - see here for a link to a Wikipedia article describing ODR. You're actually falling foul of point #2 because every time the compiler includes the header file into a source file, you run into the risk of the compiler generating a globally visible definition of test_ns::TestClass::testMethod(). And of course by the time you get to link the code, the linker will have kittens because it will find the same symbol in multiple object files.
The second snippet works because you've inlined the definition of the function, which means that even if the compiler doesn't generate any inline code for the function (say, you've got inlining turned off or the compiler decides the function is too big to inline), the code generated for the function definition will be visible in the translation unit only, as if you'd stuck it in an anonymous namespace. Hence you get multiple copies of the function in the generated object code that the linker may or may not optimize away depending on how smart it is.
You could achieve a similar effect in your first code snippet by prefixing TestClass::testMethod() with inline.
//Baseclass.h or .cpp
#ifndef CDerivedclass
#include "Derivedclass.h"
#endif
or
//COthercls.h or .cpp
#ifndef CCommonheadercls
#include "Commonheadercls.h"
#endif
I think this suffice all instances.
Can a #define or similar pre-processor definition be made across all translation units?
Header implementations are useful for really small libraries since all the code can be contained and distributed with a single header with the following structure:
// library.h
void libFunc(); // forward decl
#ifdef IMPLEMENT_LIBRARY
int libState;
volatile int libVolState; // library state exposed to external processes
void libFunc(){
// definition
}
#endif
This structure however requires the user to define IMPLEMENT_LIBRARY before the header's inclusion in only one of their translation units, meaning it can't be put in the user's header files, and might be a little confusing to someone who isn't wholely familiar with C++'s compilation rules.
If there were a way to define IMPLEMENT_LIBRARY across all TU, this could be done automatically with
#ifndef IMPLEMENT_LIBRARY
#defineToAllUnits IMPLEMENT_LIBRARY
// library state
// definitions
#endif
Does such a mechanism exist, or is the current single-header system as good as it's gonna get?
Some compilation units could very well have been compiled before the one which would contain a #defineToAllUnits, so that is not possible.
In practice your problem is often solved by using the build system to pass a -DIMPLEMENTAT_LIBRARY option to compiler (or equivalent syntax). Another possibility, common when trying to achieve a wide portability with several defines, is to have a configuration header like config.h included everywhere. That header can be autogenerated at configuration time.
You can also avoid infringing the ODR by using inline functions and variables.
For this use case, you probably should not use macro definition at all. If you want the function to be defined in the TU of library user, you can use an inline function. Inline functions may be defined in more than one TU (as long as the definition is same):
// library.h
inline void libFunc(){
// definition
}
Another approach would be to compile the library separately, and have the user of the library link with it instead of having the definition inside their own TU.
Regarding the question itself
Can a #define or similar pre-processor definition be made across all translation units?
It is OK to #define pre-processor macros in more than one TU:
// a.cpp
#define foo a
// b.cpp
#define foo b
If you want the definitions to match across all TU that define it, you can put the macro definition into a header file, and include that:
// h.hpp
#define foo h
// a.cpp
#include "h.hpp"
// b.cpp
#include "h.hpp"
It is not possible to "inject" definitions from one TU into other TU's, so there is no possible equivalent of "#defineToAllUnits". It is typically possible to "inject" macro definitions from the compiler invocation: gcc a.cpp b.cpp -Dfoo=h. I don't see this being useful for your use case however.
Two common questions about include guards:
FIRST QUESTION:
Why aren't include guards protecting my header files from mutual, recursive inclusion? I keep getting errors about non-existing symbols which are obviously there or even weirder syntax errors every time I write something like the following:
"a.h"
#ifndef A_H
#define A_H
#include "b.h"
...
#endif // A_H
"b.h"
#ifndef B_H
#define B_H
#include "a.h"
...
#endif // B_H
"main.cpp"
#include "a.h"
int main()
{
...
}
Why do I get errors compiling "main.cpp"? What do I need to do to solve my problem?
SECOND QUESTION:
Why aren't include guards preventing multiple definitions? For instance, when my project contains two files that include the same header, sometimes the linker complains about some symbol being defined multiple times. For instance:
"header.h"
#ifndef HEADER_H
#define HEADER_H
int f()
{
return 0;
}
#endif // HEADER_H
"source1.cpp"
#include "header.h"
...
"source2.cpp"
#include "header.h"
...
Why is this happening? What do I need to do to solve my problem?
FIRST QUESTION:
Why aren't include guards protecting my header files from mutual, recursive inclusion?
They are.
What they are not helping with is dependencies between the definitions of data structures in mutually-including headers. To see what this means, let's start with a basic scenario and see why include guards do help with mutual inclusions.
Suppose your mutually including a.h and b.h header files have trivial content, i.e. the ellipses in the code sections from the question's text are replaced with the empty string. In this situation, your main.cpp will happily compile. And this is only thanks to your include guards!
If you're not convinced, try removing them:
//================================================
// a.h
#include "b.h"
//================================================
// b.h
#include "a.h"
//================================================
// main.cpp
//
// Good luck getting this to compile...
#include "a.h"
int main()
{
...
}
You'll notice that the compiler will report a failure when it reaches the inclusion depth limit. This limit is implementation-specific. Per Paragraph 16.2/6 of the C++11 Standard:
A #include preprocessing directive may appear in a source file that has been read because of a #include directive in another file, up to an implementation-defined nesting limit.
So what's going on?
When parsing main.cpp, the preprocessor will meet the directive #include "a.h". This directive tells the preprocessor to process the header file a.h, take the result of that processing, and replace the string #include "a.h" with that result;
While processing a.h, the preprocessor will meet the directive #include "b.h", and the same mechanism applies: the preprocessor shall process the header file b.h, take the result of its processing, and replace the #include directive with that result;
When processing b.h, the directive #include "a.h" will tell the preprocessor to process a.h and replace that directive with the result;
The preprocessor will start parsing a.h again, will meet the #include "b.h" directive again, and this will set up a potentially infinite recursive process. When reaching the critical nesting level, the compiler will report an error.
When include guards are present, however, no infinite recursion will be set up in step 4. Let's see why:
(same as before) When parsing main.cpp, the preprocessor will meet the directive #include "a.h". This tells the preprocessor to process the header file a.h, take the result of that processing, and replace the string #include "a.h" with that result;
While processing a.h, the preprocessor will meet the directive #ifndef A_H. Since the macro A_H has not yet been defined, it will keep processing the following text. The subsequent directive (#defines A_H) defines the macro A_H. Then, the preprocessor will meet the directive #include "b.h": the preprocessor shall now process the header file b.h, take the result of its processing, and replace the #include directive with that result;
When processing b.h, the preprocessor will meet the directive #ifndef B_H. Since the macro B_H has not yet been defined, it will keep processing the following text. The subsequent directive (#defines B_H) defines the macro B_H. Then, the directive #include "a.h" will tell the preprocessor to process a.h and replace the #include directive in b.h with the result of preprocessing a.h;
The compiler will start preprocessing a.h again, and meet the #ifndef A_H directive again. However, during previous preprocessing, macro A_H has been defined. Therefore, the compiler will skip the following text this time until the matching #endif directive is found, and the output of this processing is the empty string (supposing nothing follows the #endif directive, of course). The preprocessor will therefore replace the #include "a.h" directive in b.h with the empty string, and will trace back the execution until it replaces the original #include directive in main.cpp.
Thus, include guards do protect against mutual inclusion. However, they can't help with dependencies between the definitions of your classes in mutually-including files:
//================================================
// a.h
#ifndef A_H
#define A_H
#include "b.h"
struct A
{
};
#endif // A_H
//================================================
// b.h
#ifndef B_H
#define B_H
#include "a.h"
struct B
{
A* pA;
};
#endif // B_H
//================================================
// main.cpp
//
// Good luck getting this to compile...
#include "a.h"
int main()
{
...
}
Given the above headers, main.cpp will not compile.
Why is this happening?
To see what's going on, it is enough to go through steps 1-4 again.
It is easy to see that the first three steps and most of the fourth step are unaffected by this change (just read through them to get convinced). However, something different happens at the end of step 4: after replacing the #include "a.h" directive in b.h with the empty string, the preprocessor will start parsing the content of b.h and, in particular, the definition of B. Unfortunately, the definition of B mentions class A, which has never been met before exactly because of the inclusion guards!
Declaring a member variable of a type which has not been previously declared is, of course, an error, and the compiler will politely point that out.
What do I need to do to solve my problem?
You need forward declarations.
In fact, the definition of class A is not required in order to define class B, because a pointer to A is being declared as a member variable, and not an object of type A. Since pointers have fixed size, the compiler won't need to know the exact layout of A nor to compute its size in order to properly define class B. Hence, it is enough to forward-declare class A in b.h and make the compiler aware of its existence:
//================================================
// b.h
#ifndef B_H
#define B_H
// Forward declaration of A: no need to #include "a.h"
struct A;
struct B
{
A* pA;
};
#endif // B_H
Your main.cpp will now certainly compile. A couple of remarks:
Not only breaking the mutual inclusion by replacing the #include directive with a forward declaration in b.h was enough to effectively express the dependency of B on A: using forward declarations whenever possible/practical is also considered to be a good programming practice, because it helps avoiding unnecessary inclusions, thus reducing the overall compilation time. However, after eliminating the mutual inclusion, main.cpp will have to be modified to #include both a.h and b.h (if the latter is needed at all), because b.h is no more indirectly #included through a.h;
While a forward declaration of class A is enough for the compiler to declare pointers to that class (or to use it in any other context where incomplete types are acceptable), dereferencing pointers to A (for instance to invoke a member function) or computing its size are illegal operations on incomplete types: if that is needed, the full definition of A needs to be available to the compiler, which means the header file that defines it must be included. This is why class definitions and the implementation of their member functions are usually split into a header file and an implementation file for that class (class templates are an exception to this rule): implementation files, which are never #included by other files in the project, can safely #include all the necessary headers to make definitions visible. Header files, on the other hand, won't #include other header files unless they really need to do so (for instance, to make the definition of a base class visible), and will use forward-declarations whenever possible/practical.
SECOND QUESTION:
Why aren't include guards preventing multiple definitions?
They are.
What they are not protecting you from is multiple definitions in separate translation units. This is also explained in this Q&A on StackOverflow.
Too see that, try removing the include guards and compiling the following, modified version of source1.cpp (or source2.cpp, for what it matters):
//================================================
// source1.cpp
//
// Good luck getting this to compile...
#include "header.h"
#include "header.h"
int main()
{
...
}
The compiler will certainly complain here about f() being redefined. That's obvious: its definition is being included twice! However, the above source1.cpp will compile without problems when header.h contains the proper include guards. That's expected.
Still, even when the include guards are present and the compiler will stop bothering you with error message, the linker will insist on the fact that multiple definitions being found when merging the object code obtained from the compilation of source1.cpp and source2.cpp, and will refuse to generate your executable.
Why is this happening?
Basically, each .cpp file (the technical term in this context is translation unit) in your project is compiled separately and independently. When parsing a .cpp file, the preprocessor will process all the #include directives and expand all macro invocations it encounters, and the output of this pure text processing will be given in input to the compiler for translating it into object code. Once the compiler is done with producing the object code for one translation unit, it will proceed with the next one, and all the macro definitions that have been encountered while processing the previous translation unit will be forgotten.
In fact, compiling a project with n translation units (.cpp files) is like executing the same program (the compiler) n times, each time with a different input: different executions of the same program won't share the state of the previous program execution(s). Thus, each translation is performed independently and the preprocessor symbols encountered while compiling one translation unit will not be remembered when compiling other translation units (if you think about it for a moment, you will easily realize that this is actually a desirable behavior).
Therefore, even though include guards help you preventing recursive mutual inclusions and redundant inclusions of the same header in one translation unit, they can't detect whether the same definition is included in different translation unit.
Yet, when merging the object code generated from the compilation of all the .cpp files of your project, the linker will see that the same symbol is defined more than once, and since this violates the One Definition Rule. Per Paragraph 3.2/3 of the C++11 Standard:
Every program shall contain exactly one definition of every non-inline function or variable that is odr-used in that program; no diagnostic required. The definition can appear explicitly in the program, it can be found in the standard or a user-defined library, or (when appropriate) it is implicitly defined (see 12.1, 12.4 and 12.8). An inline function shall be defined in every translation unit in which it is odr-used.
Hence, the linker will emit an error and refuse to generate the executable of your program.
What do I need to do to solve my problem?
If you want to keep your function definition in a header file that is #included by multiple translation units (notice, that no problem will arise if your header is #included just by one translation unit), you need to use the inline keyword.
Otherwise, you need to keep only the declaration of your function in header.h, putting its definition (body) into one separate .cpp file only (this is the classical approach).
The inline keyword represents a non-binding request to the compiler to inline the function's body directly at the call site, rather than setting up a stack frame for a regular function call. Although the compiler doesn't have to fulfill your request, the inline keyword does succeed in telling the linker to tolerate multiple symbol definitions. According to Paragraph 3.2/5 of the C++11 Standard:
There can be more than one definition of a class type (Clause 9), enumeration type (7.2), inline function with external linkage (7.1.2), class template (Clause 14), non-static function template (14.5.6), static data member of a class template (14.5.1.3), member function of a class template (14.5.1.1), or template specialization for which some template parameters are not specified (14.7, 14.5.5) in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements [...]
The above Paragraph basically lists all the definitions which are commonly put in header files, because they can be safely included in multiple translation units. All other definitions with external linkage, instead, belong in source files.
Using the static keyword instead of the inline keyword also results in suppressing linker errors by giving your function internal linkage, thus making each translation unit hold a private copy of that function (and of its local static variables). However, this eventually results in a larger executable, and the use of inline should be preferred in general.
An alternative way of achieving the same result as with the static keyword is to put function f() in an unnamed namespace. Per Paragraph 3.5/4 of the C++11 Standard:
An unnamed namespace or a namespace declared directly or indirectly within an unnamed namespace has internal linkage. All other namespaces have external linkage. A name having namespace scope that has not been given internal linkage above has the same linkage as the enclosing namespace if it is the name of:
— a variable; or
— a function; or
— a named class (Clause 9), or an unnamed class defined in a typedef declaration in which the class has the typedef name for linkage purposes (7.1.3); or
— a named enumeration (7.2), or an unnamed enumeration defined in a typedef declaration in which the enumeration has the typedef name for linkage purposes (7.1.3); or
— an enumerator belonging to an enumeration with linkage; or
— a template.
For the same reason mentioned above, the inline keyword should be preferred.
fiorentinoing's answer is echoed in Git 2.24 (Q4 2019), where a similar code cleanup is taking place in the Git codebase.
See commit 2fe4439 (03 Oct 2019) by René Scharfe (rscharfe).
(Merged by Junio C Hamano -- gitster -- in commit a4c5d9f, 11 Oct 2019)
treewide: remove duplicate #include directives
Found with:
git grep '^#include ' '*.c' | sort | uniq -d
First of all you should be 100% sure that you have no duplicates in "include guards".
With this command
grep -rh "#ifndef" * 2>&1 | uniq -c | sort -rn | awk '{print $1 " " $3}' | grep -v "^1\ "
you will 1) highlight all include guards, get unique row with counter per include name, sort the results, print only counter and include name and remove the ones that are really unique.
HINT: this is equivalent to get the list of duplicated include names
I am trying to address this issues because I can not seem to find a good resource online that does so.
I do not fully understand the issue cause I never got it resolved so I will try to describe it as best as possible.
Awhile ago I had an issue where headers were being ignored because "They were called once already and therefore when they were called again by another document, it was being ignored and therefore an error was being thrown"
I never fully understood that because you can call a header more then once without errors being thrown
header1.h
#ifndef _FCLASS_
#define _FCLASS_
class firstClass {
...//declaration
}
#endif
header2.h
#ifndef _SCLASS_
#define _SCLASS_
#include "header1.h"
class SecondClass:firstClass{
...//declaration
}
#endif
header3.h
#ifndef _TCLASS_
#define _TCLASS_
#include "header1.h"
class thirdClass:firstClass{
...//declaration
}
#endif
In the above example, the header1 class was called twice, and no errors should of been thrown. Even though header1 was declared once, it could be used by multiple headers.
So my question is , in what scenarios, can a header actually be ignored by a document if it was already declared once.
Does this type of issue only apply to .cpp files that include headers ??
This most commonly happens when there's a header cycle (two headers including each other being the most likely case).
Just for illustration, real examples are usually more complex.
A.h:
#ifndef A_INCLUDED
#define A_INCLUDED
#include "B.h"
class A
{
};
class C
{
B b;
}
#endif
B.h:
#ifndef B_INCLUDED
#define B_INCLUDED
#include "A.h"
class B : public A
{
};
#endif
In this case, regardless of which header you include, there's a circular dependency that prevents it from compiling. If you include A.h, it includes B.h. B.h then includes A.h but the include guard prevents it from being included again. So then it continues compiling B.h and fails when it doesn't know about class A. The reverse situation applies if including B.h first.
EDIT: I should have mentioned that this is typically solved by refactoring logic out of one or both headers into another header, and/or using forward declarations instead of actual includes.
I can think of one scenario. If header1.h has sections that are excluded by conditional compilation and those conditions change by declarations in subsequently included header files. Later inclusions of header1.h will be skipped by its include guard and it won't be able to declare the conditional sections which could lead to errors.
C++ headers
If I have A.cpp and A.h as well as b.h, c.h, d.h
Should I do:
in A.h:
#include "b.h"
#include "c.h"
#include "d.h"
in A.cpp:
#include "A.h"
or
in A.cpp:
#include "A.h"
#include "b.h"
#include "c.h"
#include "d.h"
Are there performance issues? Obvious benefits? Is something bad about this?
You should only include what is necessary to compile; adding unnecessary includes will hurt your compilation times, especially in large projects.
Each header file should be able to compile cleanly on its own -- that is, if you have a source file that includes only that header, it should compile without errors. The header file should include no more than is necessary for that.
Try to use forward declarations as much as possible. If you're using a class, but the header file only deals with pointers/references to objects of that class, then there's no need to include the definition of the class -- just use a forward declaration:
class SomeClass;
// Can now use pointers/references to SomeClass
// without needing the full definition
The key practice here is having around each foo.h file a guard such as:
#ifndef _FOO_H
#define _FOO_H
...rest of the .h file...
#endif
This prevents multiple-inclusions, with loops and all such attendant horrors. Once you do ensure every include file is thus guarded, the specifics are less important.
I like one guiding principle Adam expresses: make sure that, if a source file just includes a.h, it won't inevitably get errors due to a.h assuming other files have been included before it -- e.g. if a.h requires b.h to be included before, it can and should just include b.h itself (the guards will make that a noop if b.h was already included previously)
Vice versa, a source file should include the headers from which it is requiring something (macros, declarations, etc), not assume that other headers just come in magically because it has included some.
If you're using classes by value, alas, you need all the gory details of the class in some .h you include. But for some uses via references or pointers, just a bare class sic; will suffice. E.g., all other things being equal, if class a can get away with a pointer to an instance of class b (i.e. a member class b *my_bp; rather than a member class b *my_b;) the coupling between the include files can be made weaker (reducing much recompilation) -- e.g. b.h could have little more than class b; while all the gory details are in b_impl.h which is included only by the headers that really need it...
What Adam Rosenfield told you is spot on. An example of when you can use a forward declaration is:
#ifndef A_H_
#define A_H_
#include "D.h"
class B; //forward declaration
class C; //forward declaration
class A
{
B *m_pb; //you can forward declare this one bacause it's a pointer and the compilier doesn't need to know the size of object B at this point in the code. include B.h in the cpp file.
C &m_rc; //you can also forware declare this one for the same reason, except it's a reference.
D m_d; //you cannot forward declare this one because the complier need to calc the size of object D.
};
#endif
Answer: Let A.h include b.h, c.h and d.h only if it's needed to make the build succeed. The rule of thumb is: only include as much code in a header file as necessary. Anything which is not immediately needed in A.h should be included by A.cpp.
Reasoning: The less code is included into header files, the less likely you will need to recompile the code which uses the header file after doing some change somewhere. If there are lots of #include references between different header files, changing any of them will require rebuilding all other files which include the changed header file - recursivel. So if you decide to touch some toplevel header file, this might rebuild huge parts of your code.
By using forward declarations wherever possible in your header files, you reduce the coupling of the source files and thus make the build faster. Such forward declarations can be used in many more situations than you might think. As a rule of thumb, you only need the header file t.h (which defines a type T) if you
Declare a member variable of type T (note, this does not include declaring a pointer-to-T).
Write some inline functions which access members of an object of type T.
You do not need to include the declaration of T if your header file just
Declares constructors/functions which take references or pointers to a T object.
Declares functions which return a T object by pointer, reference or value.
Declares member variables which are references or pointers to a T object.
Consider this class declaration; which of the include files for A, B and C do you really need to include?:
class MyClass
{
public:
MyClass( const A &a );
void set( const B &b );
void set( const B *b );
B getB();
C getC();
private:
B *m_b;
C m_c;
};
You just need the include file for the type C, because of the member variable m_c. You can often remove this requirement as well by not declaring your member variables directly but using an opaque pointer to hide all the member variables in a private structure, so that they don't show up in the header file anymore.
Prefer not including headers in other headers - it slows down compilation and leas to circular reference
There would be no performance issues, it's all done at compile time, and your headers should be set up so they can't be included more than once.
I don't know if there's a standard way to do it, but I prefer to include all the headers I need in source files, and include them in the headers if something in the header itself needs it (for example, a typedef from a different header)