how to avoid header coupling - c++

I am trying to address this issues because I can not seem to find a good resource online that does so.
I do not fully understand the issue cause I never got it resolved so I will try to describe it as best as possible.
Awhile ago I had an issue where headers were being ignored because "They were called once already and therefore when they were called again by another document, it was being ignored and therefore an error was being thrown"
I never fully understood that because you can call a header more then once without errors being thrown
header1.h
#ifndef _FCLASS_
#define _FCLASS_
class firstClass {
...//declaration
}
#endif
header2.h
#ifndef _SCLASS_
#define _SCLASS_
#include "header1.h"
class SecondClass:firstClass{
...//declaration
}
#endif
header3.h
#ifndef _TCLASS_
#define _TCLASS_
#include "header1.h"
class thirdClass:firstClass{
...//declaration
}
#endif
In the above example, the header1 class was called twice, and no errors should of been thrown. Even though header1 was declared once, it could be used by multiple headers.
So my question is , in what scenarios, can a header actually be ignored by a document if it was already declared once.
Does this type of issue only apply to .cpp files that include headers ??

This most commonly happens when there's a header cycle (two headers including each other being the most likely case).
Just for illustration, real examples are usually more complex.
A.h:
#ifndef A_INCLUDED
#define A_INCLUDED
#include "B.h"
class A
{
};
class C
{
B b;
}
#endif
B.h:
#ifndef B_INCLUDED
#define B_INCLUDED
#include "A.h"
class B : public A
{
};
#endif
In this case, regardless of which header you include, there's a circular dependency that prevents it from compiling. If you include A.h, it includes B.h. B.h then includes A.h but the include guard prevents it from being included again. So then it continues compiling B.h and fails when it doesn't know about class A. The reverse situation applies if including B.h first.
EDIT: I should have mentioned that this is typically solved by refactoring logic out of one or both headers into another header, and/or using forward declarations instead of actual includes.

I can think of one scenario. If header1.h has sections that are excluded by conditional compilation and those conditions change by declarations in subsequently included header files. Later inclusions of header1.h will be skipped by its include guard and it won't be able to declare the conditional sections which could lead to errors.

Related

Is using headers multiple times bad?

Lets say I am using header guards,
#ifndef MAIN_H
#define MAIN_H
#include "foo.h"
#include "some_header_file.h"
... // Some Code
#endif
Inside foo.h file also with Header guards.
#ifndef FOO_H
#define FOO_H
#include "some_header_file.h"
... // Some Code
#endif
As you can see main file have 2 headers one of them is dublicate. I have three questions:
Does Header guards prevent duplicate header files?
Does the compiler optimize it and removes it?
Is this a bad practice and the extra header file should be deleted from main file?
Does Header guards prevent duplicate header files?
Yes. The first encountered inclusion will bring the content of the header to the translation unit, and the header guard causes successive inclusions to be empty which prevents the content of the header from being duplicated. This is exactly the reason why header guards are used.
or is this a bad practice and the extra header file should be deleted from main file?
No, the duplicate inclusion is not a bad practice. If the "main" header depends on any declaration from "some_header_file.h", then "main" absolutely should include "some_header_file.h" directly, whether another header - even one included by "main" - also includes it or not.
Relying on a transitive inclusion would generally be a bad practice - i.e. in this case it may be bad to rely on the detail that "foo.h" includes "some_header_file.h" when including "foo.h" into "main". Such assumptions often can cause programs to break unexpectedly when they are modified. In this case, if "foo.h" was modified to no longer depend on "some_header_file.h", and that inclusion was removed, then that change would suddenly cause the assumption to fail, and "some_header_file.h" would no longer be included into "main" as a result of change that didn't involve "main" at all. That would be bad.
The main problem with repited includes is when two different files include each other. For example, if a.h include b.h, and b.h includes a.h, if you don't add header guards, preprocessor ends up in a cycle, because every time it reads a.h includes b.h, and b.h includes a.h, and this never ends.
In your case you could have some problem if you define a variable in your some_header_file.h", because it would be read twice and the variables would be declared twice also, which would result in a compiler error.
You need to add them in "some_header_file.h", that way next time preprocessor reads this file it would ignore it by the ifndef clause. And notice the importance for cyclic include dependencies.
In "some_header_file.h" add:
#ifndef SOME_HEADER_FILE_H
#define SOME_HEADER_FILE_H
...code
#endif /* SOME_HEADER_FILE_H */
Last comment isn't necessary, but it helps case you need to debug/review preprocessors output.

Base class undefined (CPP) [duplicate]

Two common questions about include guards:
FIRST QUESTION:
Why aren't include guards protecting my header files from mutual, recursive inclusion? I keep getting errors about non-existing symbols which are obviously there or even weirder syntax errors every time I write something like the following:
"a.h"
#ifndef A_H
#define A_H
#include "b.h"
...
#endif // A_H
"b.h"
#ifndef B_H
#define B_H
#include "a.h"
...
#endif // B_H
"main.cpp"
#include "a.h"
int main()
{
...
}
Why do I get errors compiling "main.cpp"? What do I need to do to solve my problem?
SECOND QUESTION:
Why aren't include guards preventing multiple definitions? For instance, when my project contains two files that include the same header, sometimes the linker complains about some symbol being defined multiple times. For instance:
"header.h"
#ifndef HEADER_H
#define HEADER_H
int f()
{
return 0;
}
#endif // HEADER_H
"source1.cpp"
#include "header.h"
...
"source2.cpp"
#include "header.h"
...
Why is this happening? What do I need to do to solve my problem?
FIRST QUESTION:
Why aren't include guards protecting my header files from mutual, recursive inclusion?
They are.
What they are not helping with is dependencies between the definitions of data structures in mutually-including headers. To see what this means, let's start with a basic scenario and see why include guards do help with mutual inclusions.
Suppose your mutually including a.h and b.h header files have trivial content, i.e. the ellipses in the code sections from the question's text are replaced with the empty string. In this situation, your main.cpp will happily compile. And this is only thanks to your include guards!
If you're not convinced, try removing them:
//================================================
// a.h
#include "b.h"
//================================================
// b.h
#include "a.h"
//================================================
// main.cpp
//
// Good luck getting this to compile...
#include "a.h"
int main()
{
...
}
You'll notice that the compiler will report a failure when it reaches the inclusion depth limit. This limit is implementation-specific. Per Paragraph 16.2/6 of the C++11 Standard:
A #include preprocessing directive may appear in a source file that has been read because of a #include directive in another file, up to an implementation-defined nesting limit.
So what's going on?
When parsing main.cpp, the preprocessor will meet the directive #include "a.h". This directive tells the preprocessor to process the header file a.h, take the result of that processing, and replace the string #include "a.h" with that result;
While processing a.h, the preprocessor will meet the directive #include "b.h", and the same mechanism applies: the preprocessor shall process the header file b.h, take the result of its processing, and replace the #include directive with that result;
When processing b.h, the directive #include "a.h" will tell the preprocessor to process a.h and replace that directive with the result;
The preprocessor will start parsing a.h again, will meet the #include "b.h" directive again, and this will set up a potentially infinite recursive process. When reaching the critical nesting level, the compiler will report an error.
When include guards are present, however, no infinite recursion will be set up in step 4. Let's see why:
(same as before) When parsing main.cpp, the preprocessor will meet the directive #include "a.h". This tells the preprocessor to process the header file a.h, take the result of that processing, and replace the string #include "a.h" with that result;
While processing a.h, the preprocessor will meet the directive #ifndef A_H. Since the macro A_H has not yet been defined, it will keep processing the following text. The subsequent directive (#defines A_H) defines the macro A_H. Then, the preprocessor will meet the directive #include "b.h": the preprocessor shall now process the header file b.h, take the result of its processing, and replace the #include directive with that result;
When processing b.h, the preprocessor will meet the directive #ifndef B_H. Since the macro B_H has not yet been defined, it will keep processing the following text. The subsequent directive (#defines B_H) defines the macro B_H. Then, the directive #include "a.h" will tell the preprocessor to process a.h and replace the #include directive in b.h with the result of preprocessing a.h;
The compiler will start preprocessing a.h again, and meet the #ifndef A_H directive again. However, during previous preprocessing, macro A_H has been defined. Therefore, the compiler will skip the following text this time until the matching #endif directive is found, and the output of this processing is the empty string (supposing nothing follows the #endif directive, of course). The preprocessor will therefore replace the #include "a.h" directive in b.h with the empty string, and will trace back the execution until it replaces the original #include directive in main.cpp.
Thus, include guards do protect against mutual inclusion. However, they can't help with dependencies between the definitions of your classes in mutually-including files:
//================================================
// a.h
#ifndef A_H
#define A_H
#include "b.h"
struct A
{
};
#endif // A_H
//================================================
// b.h
#ifndef B_H
#define B_H
#include "a.h"
struct B
{
A* pA;
};
#endif // B_H
//================================================
// main.cpp
//
// Good luck getting this to compile...
#include "a.h"
int main()
{
...
}
Given the above headers, main.cpp will not compile.
Why is this happening?
To see what's going on, it is enough to go through steps 1-4 again.
It is easy to see that the first three steps and most of the fourth step are unaffected by this change (just read through them to get convinced). However, something different happens at the end of step 4: after replacing the #include "a.h" directive in b.h with the empty string, the preprocessor will start parsing the content of b.h and, in particular, the definition of B. Unfortunately, the definition of B mentions class A, which has never been met before exactly because of the inclusion guards!
Declaring a member variable of a type which has not been previously declared is, of course, an error, and the compiler will politely point that out.
What do I need to do to solve my problem?
You need forward declarations.
In fact, the definition of class A is not required in order to define class B, because a pointer to A is being declared as a member variable, and not an object of type A. Since pointers have fixed size, the compiler won't need to know the exact layout of A nor to compute its size in order to properly define class B. Hence, it is enough to forward-declare class A in b.h and make the compiler aware of its existence:
//================================================
// b.h
#ifndef B_H
#define B_H
// Forward declaration of A: no need to #include "a.h"
struct A;
struct B
{
A* pA;
};
#endif // B_H
Your main.cpp will now certainly compile. A couple of remarks:
Not only breaking the mutual inclusion by replacing the #include directive with a forward declaration in b.h was enough to effectively express the dependency of B on A: using forward declarations whenever possible/practical is also considered to be a good programming practice, because it helps avoiding unnecessary inclusions, thus reducing the overall compilation time. However, after eliminating the mutual inclusion, main.cpp will have to be modified to #include both a.h and b.h (if the latter is needed at all), because b.h is no more indirectly #included through a.h;
While a forward declaration of class A is enough for the compiler to declare pointers to that class (or to use it in any other context where incomplete types are acceptable), dereferencing pointers to A (for instance to invoke a member function) or computing its size are illegal operations on incomplete types: if that is needed, the full definition of A needs to be available to the compiler, which means the header file that defines it must be included. This is why class definitions and the implementation of their member functions are usually split into a header file and an implementation file for that class (class templates are an exception to this rule): implementation files, which are never #included by other files in the project, can safely #include all the necessary headers to make definitions visible. Header files, on the other hand, won't #include other header files unless they really need to do so (for instance, to make the definition of a base class visible), and will use forward-declarations whenever possible/practical.
SECOND QUESTION:
Why aren't include guards preventing multiple definitions?
They are.
What they are not protecting you from is multiple definitions in separate translation units. This is also explained in this Q&A on StackOverflow.
Too see that, try removing the include guards and compiling the following, modified version of source1.cpp (or source2.cpp, for what it matters):
//================================================
// source1.cpp
//
// Good luck getting this to compile...
#include "header.h"
#include "header.h"
int main()
{
...
}
The compiler will certainly complain here about f() being redefined. That's obvious: its definition is being included twice! However, the above source1.cpp will compile without problems when header.h contains the proper include guards. That's expected.
Still, even when the include guards are present and the compiler will stop bothering you with error message, the linker will insist on the fact that multiple definitions being found when merging the object code obtained from the compilation of source1.cpp and source2.cpp, and will refuse to generate your executable.
Why is this happening?
Basically, each .cpp file (the technical term in this context is translation unit) in your project is compiled separately and independently. When parsing a .cpp file, the preprocessor will process all the #include directives and expand all macro invocations it encounters, and the output of this pure text processing will be given in input to the compiler for translating it into object code. Once the compiler is done with producing the object code for one translation unit, it will proceed with the next one, and all the macro definitions that have been encountered while processing the previous translation unit will be forgotten.
In fact, compiling a project with n translation units (.cpp files) is like executing the same program (the compiler) n times, each time with a different input: different executions of the same program won't share the state of the previous program execution(s). Thus, each translation is performed independently and the preprocessor symbols encountered while compiling one translation unit will not be remembered when compiling other translation units (if you think about it for a moment, you will easily realize that this is actually a desirable behavior).
Therefore, even though include guards help you preventing recursive mutual inclusions and redundant inclusions of the same header in one translation unit, they can't detect whether the same definition is included in different translation unit.
Yet, when merging the object code generated from the compilation of all the .cpp files of your project, the linker will see that the same symbol is defined more than once, and since this violates the One Definition Rule. Per Paragraph 3.2/3 of the C++11 Standard:
Every program shall contain exactly one definition of every non-inline function or variable that is odr-used in that program; no diagnostic required. The definition can appear explicitly in the program, it can be found in the standard or a user-defined library, or (when appropriate) it is implicitly defined (see 12.1, 12.4 and 12.8). An inline function shall be defined in every translation unit in which it is odr-used.
Hence, the linker will emit an error and refuse to generate the executable of your program.
What do I need to do to solve my problem?
If you want to keep your function definition in a header file that is #included by multiple translation units (notice, that no problem will arise if your header is #included just by one translation unit), you need to use the inline keyword.
Otherwise, you need to keep only the declaration of your function in header.h, putting its definition (body) into one separate .cpp file only (this is the classical approach).
The inline keyword represents a non-binding request to the compiler to inline the function's body directly at the call site, rather than setting up a stack frame for a regular function call. Although the compiler doesn't have to fulfill your request, the inline keyword does succeed in telling the linker to tolerate multiple symbol definitions. According to Paragraph 3.2/5 of the C++11 Standard:
There can be more than one definition of a class type (Clause 9), enumeration type (7.2), inline function with external linkage (7.1.2), class template (Clause 14), non-static function template (14.5.6), static data member of a class template (14.5.1.3), member function of a class template (14.5.1.1), or template specialization for which some template parameters are not specified (14.7, 14.5.5) in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements [...]
The above Paragraph basically lists all the definitions which are commonly put in header files, because they can be safely included in multiple translation units. All other definitions with external linkage, instead, belong in source files.
Using the static keyword instead of the inline keyword also results in suppressing linker errors by giving your function internal linkage, thus making each translation unit hold a private copy of that function (and of its local static variables). However, this eventually results in a larger executable, and the use of inline should be preferred in general.
An alternative way of achieving the same result as with the static keyword is to put function f() in an unnamed namespace. Per Paragraph 3.5/4 of the C++11 Standard:
An unnamed namespace or a namespace declared directly or indirectly within an unnamed namespace has internal linkage. All other namespaces have external linkage. A name having namespace scope that has not been given internal linkage above has the same linkage as the enclosing namespace if it is the name of:
— a variable; or
— a function; or
— a named class (Clause 9), or an unnamed class defined in a typedef declaration in which the class has the typedef name for linkage purposes (7.1.3); or
— a named enumeration (7.2), or an unnamed enumeration defined in a typedef declaration in which the enumeration has the typedef name for linkage purposes (7.1.3); or
— an enumerator belonging to an enumeration with linkage; or
— a template.
For the same reason mentioned above, the inline keyword should be preferred.
fiorentinoing's answer is echoed in Git 2.24 (Q4 2019), where a similar code cleanup is taking place in the Git codebase.
See commit 2fe4439 (03 Oct 2019) by René Scharfe (rscharfe).
(Merged by Junio C Hamano -- gitster -- in commit a4c5d9f, 11 Oct 2019)
treewide: remove duplicate #include directives
Found with:
git grep '^#include ' '*.c' | sort | uniq -d
First of all you should be 100% sure that you have no duplicates in "include guards".
With this command
grep -rh "#ifndef" * 2>&1 | uniq -c | sort -rn | awk '{print $1 " " $3}' | grep -v "^1\ "
you will 1) highlight all include guards, get unique row with counter per include name, sort the results, print only counter and include name and remove the ones that are really unique.
HINT: this is equivalent to get the list of duplicated include names

What to #include?

I understand if a.h includes b.h and I don't declare anything from b.h in my header including only a.h is good practice. If I were to declare anything from b.h in my header then I should include both a.h and b.h to make my header self-sufficient.
In my header I do declare both class A from a.h and class B from b.h. However, class A depends on class B, so I always use them together. I never use class B independently of class A. In this case does it still make sense to include both a.h and b.h?
a.h
#include "b.h"
#include <queue>
class A
{
private:
std::queue<B> mFoo;
}
In my actual code I think it makes my intention clearer when I include just my event system for example and not a number of includes that seem superfluous to me.
You should always include direct dependencies: if a depends directly on b and c, even if b already includes c, a should include both.
Be explicit about your dependencies. You should not rely on the fact that some other module already depends on the module you'd have to include. You never know how the dependency tree will change in the future.
In this regard, having implicit dependencies makes your code fragile.
To make sure you don't redefine something by including it twice, you can put header guard inside your header files:
#ifndef GRANDFATHER_H
#define GRANDFATHER_H
struct foo {
int member;
};
#endif /* GRANDFATHER_H */
You could also use
#pragma once
But not every compiler supports it.
That way even if the preprocessor will step on same include more than once, It won't re-include the code.
My rule: Given some random header foo.h, the following should compile clean, and preferably should link and run as well:
#include "foo.h"
int main () {}
Suppose foo.h doesn't contain any syntax errors (i.e., a source file that includes it in the context you intended compiles clean) but the above nonetheless doesn't compile. This almost always means there are some missing #include directives in foo.h.
What this doesn't tell you is whether all of the headers included in foo.h truly are needed. I have a handy little script that checks the above. If it passes, it creates a copy of foo.h and progressively comments out #include directives and sees if those modified versions of foo.h pass the above test. Any that do pass are indicative of suspect superfluous #include directives.
The problem is that the #include directives deemed to be superfluous might not be superfluous. Suppose foo.h defines class Foo, which has data members of type Bar and Baz (not pointers, members). The header foo.h correctly includes both bar.h and baz.h because that is where those two classes are defined. Suppose bar.h happens to include baz.h. As the author of foo.h, I don't care. I should still include both of those headers in foo.h because foo.h has direct dependencies on both of those other headers. My script will complain about suspect superfluous headers in this case. Those complaints are hints, not mandates.
On the other hand, fixing foo.h so that my simple test program above compiles clean is a mandate.

Why aren't my include guards preventing recursive inclusion and multiple symbol definitions?

Two common questions about include guards:
FIRST QUESTION:
Why aren't include guards protecting my header files from mutual, recursive inclusion? I keep getting errors about non-existing symbols which are obviously there or even weirder syntax errors every time I write something like the following:
"a.h"
#ifndef A_H
#define A_H
#include "b.h"
...
#endif // A_H
"b.h"
#ifndef B_H
#define B_H
#include "a.h"
...
#endif // B_H
"main.cpp"
#include "a.h"
int main()
{
...
}
Why do I get errors compiling "main.cpp"? What do I need to do to solve my problem?
SECOND QUESTION:
Why aren't include guards preventing multiple definitions? For instance, when my project contains two files that include the same header, sometimes the linker complains about some symbol being defined multiple times. For instance:
"header.h"
#ifndef HEADER_H
#define HEADER_H
int f()
{
return 0;
}
#endif // HEADER_H
"source1.cpp"
#include "header.h"
...
"source2.cpp"
#include "header.h"
...
Why is this happening? What do I need to do to solve my problem?
FIRST QUESTION:
Why aren't include guards protecting my header files from mutual, recursive inclusion?
They are.
What they are not helping with is dependencies between the definitions of data structures in mutually-including headers. To see what this means, let's start with a basic scenario and see why include guards do help with mutual inclusions.
Suppose your mutually including a.h and b.h header files have trivial content, i.e. the ellipses in the code sections from the question's text are replaced with the empty string. In this situation, your main.cpp will happily compile. And this is only thanks to your include guards!
If you're not convinced, try removing them:
//================================================
// a.h
#include "b.h"
//================================================
// b.h
#include "a.h"
//================================================
// main.cpp
//
// Good luck getting this to compile...
#include "a.h"
int main()
{
...
}
You'll notice that the compiler will report a failure when it reaches the inclusion depth limit. This limit is implementation-specific. Per Paragraph 16.2/6 of the C++11 Standard:
A #include preprocessing directive may appear in a source file that has been read because of a #include directive in another file, up to an implementation-defined nesting limit.
So what's going on?
When parsing main.cpp, the preprocessor will meet the directive #include "a.h". This directive tells the preprocessor to process the header file a.h, take the result of that processing, and replace the string #include "a.h" with that result;
While processing a.h, the preprocessor will meet the directive #include "b.h", and the same mechanism applies: the preprocessor shall process the header file b.h, take the result of its processing, and replace the #include directive with that result;
When processing b.h, the directive #include "a.h" will tell the preprocessor to process a.h and replace that directive with the result;
The preprocessor will start parsing a.h again, will meet the #include "b.h" directive again, and this will set up a potentially infinite recursive process. When reaching the critical nesting level, the compiler will report an error.
When include guards are present, however, no infinite recursion will be set up in step 4. Let's see why:
(same as before) When parsing main.cpp, the preprocessor will meet the directive #include "a.h". This tells the preprocessor to process the header file a.h, take the result of that processing, and replace the string #include "a.h" with that result;
While processing a.h, the preprocessor will meet the directive #ifndef A_H. Since the macro A_H has not yet been defined, it will keep processing the following text. The subsequent directive (#defines A_H) defines the macro A_H. Then, the preprocessor will meet the directive #include "b.h": the preprocessor shall now process the header file b.h, take the result of its processing, and replace the #include directive with that result;
When processing b.h, the preprocessor will meet the directive #ifndef B_H. Since the macro B_H has not yet been defined, it will keep processing the following text. The subsequent directive (#defines B_H) defines the macro B_H. Then, the directive #include "a.h" will tell the preprocessor to process a.h and replace the #include directive in b.h with the result of preprocessing a.h;
The compiler will start preprocessing a.h again, and meet the #ifndef A_H directive again. However, during previous preprocessing, macro A_H has been defined. Therefore, the compiler will skip the following text this time until the matching #endif directive is found, and the output of this processing is the empty string (supposing nothing follows the #endif directive, of course). The preprocessor will therefore replace the #include "a.h" directive in b.h with the empty string, and will trace back the execution until it replaces the original #include directive in main.cpp.
Thus, include guards do protect against mutual inclusion. However, they can't help with dependencies between the definitions of your classes in mutually-including files:
//================================================
// a.h
#ifndef A_H
#define A_H
#include "b.h"
struct A
{
};
#endif // A_H
//================================================
// b.h
#ifndef B_H
#define B_H
#include "a.h"
struct B
{
A* pA;
};
#endif // B_H
//================================================
// main.cpp
//
// Good luck getting this to compile...
#include "a.h"
int main()
{
...
}
Given the above headers, main.cpp will not compile.
Why is this happening?
To see what's going on, it is enough to go through steps 1-4 again.
It is easy to see that the first three steps and most of the fourth step are unaffected by this change (just read through them to get convinced). However, something different happens at the end of step 4: after replacing the #include "a.h" directive in b.h with the empty string, the preprocessor will start parsing the content of b.h and, in particular, the definition of B. Unfortunately, the definition of B mentions class A, which has never been met before exactly because of the inclusion guards!
Declaring a member variable of a type which has not been previously declared is, of course, an error, and the compiler will politely point that out.
What do I need to do to solve my problem?
You need forward declarations.
In fact, the definition of class A is not required in order to define class B, because a pointer to A is being declared as a member variable, and not an object of type A. Since pointers have fixed size, the compiler won't need to know the exact layout of A nor to compute its size in order to properly define class B. Hence, it is enough to forward-declare class A in b.h and make the compiler aware of its existence:
//================================================
// b.h
#ifndef B_H
#define B_H
// Forward declaration of A: no need to #include "a.h"
struct A;
struct B
{
A* pA;
};
#endif // B_H
Your main.cpp will now certainly compile. A couple of remarks:
Not only breaking the mutual inclusion by replacing the #include directive with a forward declaration in b.h was enough to effectively express the dependency of B on A: using forward declarations whenever possible/practical is also considered to be a good programming practice, because it helps avoiding unnecessary inclusions, thus reducing the overall compilation time. However, after eliminating the mutual inclusion, main.cpp will have to be modified to #include both a.h and b.h (if the latter is needed at all), because b.h is no more indirectly #included through a.h;
While a forward declaration of class A is enough for the compiler to declare pointers to that class (or to use it in any other context where incomplete types are acceptable), dereferencing pointers to A (for instance to invoke a member function) or computing its size are illegal operations on incomplete types: if that is needed, the full definition of A needs to be available to the compiler, which means the header file that defines it must be included. This is why class definitions and the implementation of their member functions are usually split into a header file and an implementation file for that class (class templates are an exception to this rule): implementation files, which are never #included by other files in the project, can safely #include all the necessary headers to make definitions visible. Header files, on the other hand, won't #include other header files unless they really need to do so (for instance, to make the definition of a base class visible), and will use forward-declarations whenever possible/practical.
SECOND QUESTION:
Why aren't include guards preventing multiple definitions?
They are.
What they are not protecting you from is multiple definitions in separate translation units. This is also explained in this Q&A on StackOverflow.
Too see that, try removing the include guards and compiling the following, modified version of source1.cpp (or source2.cpp, for what it matters):
//================================================
// source1.cpp
//
// Good luck getting this to compile...
#include "header.h"
#include "header.h"
int main()
{
...
}
The compiler will certainly complain here about f() being redefined. That's obvious: its definition is being included twice! However, the above source1.cpp will compile without problems when header.h contains the proper include guards. That's expected.
Still, even when the include guards are present and the compiler will stop bothering you with error message, the linker will insist on the fact that multiple definitions being found when merging the object code obtained from the compilation of source1.cpp and source2.cpp, and will refuse to generate your executable.
Why is this happening?
Basically, each .cpp file (the technical term in this context is translation unit) in your project is compiled separately and independently. When parsing a .cpp file, the preprocessor will process all the #include directives and expand all macro invocations it encounters, and the output of this pure text processing will be given in input to the compiler for translating it into object code. Once the compiler is done with producing the object code for one translation unit, it will proceed with the next one, and all the macro definitions that have been encountered while processing the previous translation unit will be forgotten.
In fact, compiling a project with n translation units (.cpp files) is like executing the same program (the compiler) n times, each time with a different input: different executions of the same program won't share the state of the previous program execution(s). Thus, each translation is performed independently and the preprocessor symbols encountered while compiling one translation unit will not be remembered when compiling other translation units (if you think about it for a moment, you will easily realize that this is actually a desirable behavior).
Therefore, even though include guards help you preventing recursive mutual inclusions and redundant inclusions of the same header in one translation unit, they can't detect whether the same definition is included in different translation unit.
Yet, when merging the object code generated from the compilation of all the .cpp files of your project, the linker will see that the same symbol is defined more than once, and since this violates the One Definition Rule. Per Paragraph 3.2/3 of the C++11 Standard:
Every program shall contain exactly one definition of every non-inline function or variable that is odr-used in that program; no diagnostic required. The definition can appear explicitly in the program, it can be found in the standard or a user-defined library, or (when appropriate) it is implicitly defined (see 12.1, 12.4 and 12.8). An inline function shall be defined in every translation unit in which it is odr-used.
Hence, the linker will emit an error and refuse to generate the executable of your program.
What do I need to do to solve my problem?
If you want to keep your function definition in a header file that is #included by multiple translation units (notice, that no problem will arise if your header is #included just by one translation unit), you need to use the inline keyword.
Otherwise, you need to keep only the declaration of your function in header.h, putting its definition (body) into one separate .cpp file only (this is the classical approach).
The inline keyword represents a non-binding request to the compiler to inline the function's body directly at the call site, rather than setting up a stack frame for a regular function call. Although the compiler doesn't have to fulfill your request, the inline keyword does succeed in telling the linker to tolerate multiple symbol definitions. According to Paragraph 3.2/5 of the C++11 Standard:
There can be more than one definition of a class type (Clause 9), enumeration type (7.2), inline function with external linkage (7.1.2), class template (Clause 14), non-static function template (14.5.6), static data member of a class template (14.5.1.3), member function of a class template (14.5.1.1), or template specialization for which some template parameters are not specified (14.7, 14.5.5) in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements [...]
The above Paragraph basically lists all the definitions which are commonly put in header files, because they can be safely included in multiple translation units. All other definitions with external linkage, instead, belong in source files.
Using the static keyword instead of the inline keyword also results in suppressing linker errors by giving your function internal linkage, thus making each translation unit hold a private copy of that function (and of its local static variables). However, this eventually results in a larger executable, and the use of inline should be preferred in general.
An alternative way of achieving the same result as with the static keyword is to put function f() in an unnamed namespace. Per Paragraph 3.5/4 of the C++11 Standard:
An unnamed namespace or a namespace declared directly or indirectly within an unnamed namespace has internal linkage. All other namespaces have external linkage. A name having namespace scope that has not been given internal linkage above has the same linkage as the enclosing namespace if it is the name of:
— a variable; or
— a function; or
— a named class (Clause 9), or an unnamed class defined in a typedef declaration in which the class has the typedef name for linkage purposes (7.1.3); or
— a named enumeration (7.2), or an unnamed enumeration defined in a typedef declaration in which the enumeration has the typedef name for linkage purposes (7.1.3); or
— an enumerator belonging to an enumeration with linkage; or
— a template.
For the same reason mentioned above, the inline keyword should be preferred.
fiorentinoing's answer is echoed in Git 2.24 (Q4 2019), where a similar code cleanup is taking place in the Git codebase.
See commit 2fe4439 (03 Oct 2019) by René Scharfe (rscharfe).
(Merged by Junio C Hamano -- gitster -- in commit a4c5d9f, 11 Oct 2019)
treewide: remove duplicate #include directives
Found with:
git grep '^#include ' '*.c' | sort | uniq -d
First of all you should be 100% sure that you have no duplicates in "include guards".
With this command
grep -rh "#ifndef" * 2>&1 | uniq -c | sort -rn | awk '{print $1 " " $3}' | grep -v "^1\ "
you will 1) highlight all include guards, get unique row with counter per include name, sort the results, print only counter and include name and remove the ones that are really unique.
HINT: this is equivalent to get the list of duplicated include names

How are circular #includes resolved?

In c lets say we have 2 files
1.h
#include<2.h>
blah blah
and we have
2.h
#include<1.h>
code
How is this resolved??
Typically you protect your include file with an ifndef/define that corresponds to the file name. This doesn't prevent the file from being included again, but it does prevent the contents (inside the ifndef) from being used and triggering the recursive includes again.
#ifndef HEADER_1_h
#define HEADER_1_h
#include "2.h"
/// rest of 1.h
#endif
#ifndef HEADER_2_h
#define HEADER_2_h
#include "1.h"
// rest of 2.h
#endif
Okay, for the sake of completeness I'll begin by quoting tvanfosson's answer:
You should use include guards:
// header1.hpp
#ifndef MYPROJECT_HEADER1_HPP_INCLUDED
#define MYPROJECT_HEADER1_HPP_INCLUDED
/// Put your stuff here
#endif // MYPROJECT_HEADER1_HPP_INCLUDED
However include guards are not meant to solve circular dependencies issues, they are meant to prevent multiple inclusions, which is quite different.
base.h
/ \
header1.h header2.h
\ /
class.cpp
In this case (quite common), you only want base.h to be included once, and that's what include guards give you.
So this will effectively prevents the double inclusion... but you won't be able to compile!!
The problem can be illustrated by trying to reason as the compiler does:
Include "header1.hpp" > this defines the include guard
Include "header2.hpp
Try to include "header1.hpp" but doesn't because the include guard is already defined
Cannot correctly parse "header2.hpp" because the types coming from "header1.hpp" have not been defined yet (since it was skipped)
Back to "header1.hpp", and the types from "header2.hpp" are still missing since they could not be compiled, and thus it fails here too
In the end, you'll left with a big pile of error messages, but at least the compiler does not crash.
The solution is to somehow remove the need for this circular dependency.
Use forward declarations if possible
Split "header1.h" into 2 parts: the part independent from header2 and the other, you should only need to include the former in header2
And THIS will solve the circular dependency (manually) there is no magic going on in the compiler that will do it for you.
Since you posted your question under the c++ tag as well as c, then I am assuming you are using c++. In c++, you can also use the #pragma once compiler directive:
1.h:
#pragma once
#include "2.h"
/// rest of 1.h
2.h:
#pragma once
#include "1.h"
/// rest of 2.h
The result is the same. But there are some notes:
pragma once will usually compile a little faster since it is a higher level mechanism and doesn't happen at preprocessing like include guards
Some compilers have known bugs "dealing" with #pragma once. Though most if not all modern compilers will handle it correctly. For a detailed list see Wikipedia
Edit: I think the directive is also supported in c compilers but have never tried it, and besides, in most c programs I've seen, include guards are the standard (maybe due to compiler limitations handling the pragma once directive?)
Circular inclusions must be eliminated, not "resolved" with include guards (as the accepted answer suggests). Consider this:
1.h:
#ifndef HEADER_1_h
#define HEADER_1_h
#include "2.h"
#define A 1
#define B 2
#endif // HEADER_1_h
2.h:
#ifndef HEADER_2_h
#define HEADER_2_h
#include "1.h"
#if (A == B)
#error Impossible
#endif
#endif // HEADER_2_h
main.c:
#include "1.h"
This will throw the "Impossible" error at compile time, because "2.h" fails to include "1.h" due to include guards, and both A and B become 0. In practice, this leads to hard to track errors which appear and disappear depending on the order in which header files are included.
The right solution here would be to move A and B to "common.h" which then could be included in both "1.h" and "2.h".