Related
At this link, the following was mentioned:
add.cpp:
int add(int x, int y)
{
return x + y;
}
main.cpp:
#include <iostream>
int add(int x, int y); // forward declaration using function prototype
int main()
{
using namespace std;
cout << "The sum of 3 and 4 is " << add(3, 4) << endl;
return 0;
}
We used a forward declaration so that the compiler would know what "add" was when compiling main.cpp. As previously mentioned, writing forward declarations for every function you want to use that lives in another file can get tedious quickly.
Can you explain "forward declaration" further? What is the problem if we use it in the main function?
Why forward-declare is necessary in C++
The compiler wants to ensure you haven't made spelling mistakes or passed the wrong number of arguments to the function. So, it insists that it first sees a declaration of 'add' (or any other types, classes, or functions) before it is used.
This really just allows the compiler to do a better job of validating the code and allows it to tidy up loose ends so it can produce a neat-looking object file. If you didn't have to forward declare things, the compiler would produce an object file that would have to contain information about all the possible guesses as to what the function add might be. And the linker would have to contain very clever logic to try and work out which add you actually intended to call, when the add function may live in a different object file the linker is joining with the one that uses add to produce a dll or exe. It's possible that the linker may get the wrong add. Say you wanted to use int add(int a, float b), but accidentally forgot to write it, but the linker found an already existing int add(int a, int b) and thought that was the right one and used that instead. Your code would compile, but wouldn't be doing what you expected.
So, just to keep things explicit and avoid guessing, etc, the compiler insists you declare everything before it is used.
Difference between declaration and definition
As an aside, it's important to know the difference between a declaration and a definition. A declaration just gives enough code to show what something looks like, so for a function, this is the return type, calling convention, method name, arguments, and their types. However, the code for the method isn't required. For a definition, you need the declaration and then also the code for the function too.
How forward-declarations can significantly reduce build times
You can get the declaration of a function into your current .cpp or .h file by #includ'ing the header that already contains a declaration of the function. However, this can slow down your compile, especially if you #include a header into a .h instead of .cpp of your program, as everything that #includes the .h you're writing would end up #include'ing all the headers you wrote #includes for too. Suddenly, the compiler has #included pages and pages of code that it needs to compile even when you only wanted to use one or two functions. To avoid this, you can use a forward-declaration and just type the declaration of the function yourself at the top of the file. If you're only using a few functions, this can really make your compiles quicker compared to always #including the header. For really large projects, the difference could be an hour or more of compile time bought down to a few minutes.
Break cyclic references where two definitions both use each other
Additionally, forward-declarations can help you break cycles. This is where two functions both try to use each other. When this happens (and it is a perfectly valid thing to do), you may #include one header file, but that header file tries to #include the header file you're currently writing... which then #includes the other header, which #includes the one you're writing. You're stuck in a chicken and egg situation with each header file trying to re #include the other. To solve this, you can forward-declare the parts you need in one of the files and leave the #include out of that file.
Eg:
File Car.h
#include "Wheel.h" // Include Wheel's definition so it can be used in Car.
#include <vector>
class Car
{
std::vector<Wheel> wheels;
};
File Wheel.h
Hmm... the declaration of Car is required here as Wheel has a pointer to a Car, but Car.h can't be included here as it would result in a compiler error. If Car.h was included, that would then try to include Wheel.h which would include Car.h which would include Wheel.h and this would go on forever, so instead the compiler raises an error. The solution is to forward declare Car instead:
class Car; // forward declaration
class Wheel
{
Car* car;
};
If class Wheel had methods which need to call methods of Car, those methods could be defined in Wheel.cpp and Wheel.cpp is now able to include Car.h without causing a cycle.
The compiler looks for each symbol being used in the current translation unit is previously declared or not in the current unit. It is just a matter of style providing all method signatures at the beginning of a source file while definitions are provided later. The significant use of it is when you use a pointer to a class as member variable of another class.
//foo.h
class bar; // This is useful
class foo
{
bar* obj; // Pointer or even a reference.
};
// foo.cpp
#include "bar.h"
#include "foo.h"
So, use forward-declarations in classes when ever possible. If your program just has functions( with ho header files), then providing prototypes at the beginning is just a matter of style. This would be anyhow the case had if the header file was present in a normal program with header that has only functions.
Because C++ is parsed from the top down, the compiler needs to know about things before they are used. So, when you reference:
int add( int x, int y )
in the main function the compiler needs to know it exists. To prove this try moving it to below the main function and you'll get a compiler error.
So a 'Forward Declaration' is just what it says on the tin. It's declaring something in advance of its use.
Generally you would include forward declarations in a header file and then include that header file in the same way that iostream is included.
The term "forward declaration" in C++ is mostly only used for class declarations. See (the end of) this answer for why a "forward declaration" of a class really is just a simple class declaration with a fancy name.
In other words, the "forward" just adds ballast to the term, as any declaration can be seen as being forward in so far as it declares some identifier before it is used.
(As to what is a declaration as opposed to a definition, again see What is the difference between a definition and a declaration?)
When the compiler sees add(3, 4) it needs to know what that means. With the forward declaration you basically tell the compiler that add is a function that takes two ints and returns an int. This is important information for the compiler becaus it needs to put 4 and 5 in the correct representation onto the stack and needs to know what type the thing returned by add is.
At that time, the compiler is not worried about the actual implementation of add, ie where it is (or if there is even one) and if it compiles. That comes into view later, after compiling the source files when the linker is invoked.
one quick addendum regarding: usually you put those forward references into a header file belonging to the .c(pp) file where the function/variable etc. is implemented. in your example it would look like this:
add.h:
extern int add(int a, int b);
the keyword extern states that the function is actually declared in an external file (could also be a library etc.).
your main.c would look like this:
#include
#include "add.h"
int main()
{
.
.
.
int add(int x, int y); // forward declaration using function prototype
Can you explain "forward declaration"
more further? What is the problem if
we use it in the main() function?
It's same as #include"add.h". If you know,preprocessor expands the file which you mention in #include, in the .cpp file where you write the #include directive. That means, if you write #include"add.h", you get the same thing, it is as if you doing "forward declaration".
I'm assuming that add.h has this line:
int add(int x, int y);
One problem is, that the compiler does not know, which kind of value is delivered by your function; is assumes, that the function returns an int in this case, but this can be as correct as it can be wrong. Another problem is, that the compiler does not know, which kind of arguments your function expects, and cannot warn you, if you are passing values of the wrong kind. There are special "promotion" rules, which apply when passing, say floating point values to an undeclared function (the compiler has to widen them to type double), which is often not, what the function actually expects, leading to hard to find bugs at run-time.
At this link, the following was mentioned:
add.cpp:
int add(int x, int y)
{
return x + y;
}
main.cpp:
#include <iostream>
int add(int x, int y); // forward declaration using function prototype
int main()
{
using namespace std;
cout << "The sum of 3 and 4 is " << add(3, 4) << endl;
return 0;
}
We used a forward declaration so that the compiler would know what "add" was when compiling main.cpp. As previously mentioned, writing forward declarations for every function you want to use that lives in another file can get tedious quickly.
Can you explain "forward declaration" further? What is the problem if we use it in the main function?
Why forward-declare is necessary in C++
The compiler wants to ensure you haven't made spelling mistakes or passed the wrong number of arguments to the function. So, it insists that it first sees a declaration of 'add' (or any other types, classes, or functions) before it is used.
This really just allows the compiler to do a better job of validating the code and allows it to tidy up loose ends so it can produce a neat-looking object file. If you didn't have to forward declare things, the compiler would produce an object file that would have to contain information about all the possible guesses as to what the function add might be. And the linker would have to contain very clever logic to try and work out which add you actually intended to call, when the add function may live in a different object file the linker is joining with the one that uses add to produce a dll or exe. It's possible that the linker may get the wrong add. Say you wanted to use int add(int a, float b), but accidentally forgot to write it, but the linker found an already existing int add(int a, int b) and thought that was the right one and used that instead. Your code would compile, but wouldn't be doing what you expected.
So, just to keep things explicit and avoid guessing, etc, the compiler insists you declare everything before it is used.
Difference between declaration and definition
As an aside, it's important to know the difference between a declaration and a definition. A declaration just gives enough code to show what something looks like, so for a function, this is the return type, calling convention, method name, arguments, and their types. However, the code for the method isn't required. For a definition, you need the declaration and then also the code for the function too.
How forward-declarations can significantly reduce build times
You can get the declaration of a function into your current .cpp or .h file by #includ'ing the header that already contains a declaration of the function. However, this can slow down your compile, especially if you #include a header into a .h instead of .cpp of your program, as everything that #includes the .h you're writing would end up #include'ing all the headers you wrote #includes for too. Suddenly, the compiler has #included pages and pages of code that it needs to compile even when you only wanted to use one or two functions. To avoid this, you can use a forward-declaration and just type the declaration of the function yourself at the top of the file. If you're only using a few functions, this can really make your compiles quicker compared to always #including the header. For really large projects, the difference could be an hour or more of compile time bought down to a few minutes.
Break cyclic references where two definitions both use each other
Additionally, forward-declarations can help you break cycles. This is where two functions both try to use each other. When this happens (and it is a perfectly valid thing to do), you may #include one header file, but that header file tries to #include the header file you're currently writing... which then #includes the other header, which #includes the one you're writing. You're stuck in a chicken and egg situation with each header file trying to re #include the other. To solve this, you can forward-declare the parts you need in one of the files and leave the #include out of that file.
Eg:
File Car.h
#include "Wheel.h" // Include Wheel's definition so it can be used in Car.
#include <vector>
class Car
{
std::vector<Wheel> wheels;
};
File Wheel.h
Hmm... the declaration of Car is required here as Wheel has a pointer to a Car, but Car.h can't be included here as it would result in a compiler error. If Car.h was included, that would then try to include Wheel.h which would include Car.h which would include Wheel.h and this would go on forever, so instead the compiler raises an error. The solution is to forward declare Car instead:
class Car; // forward declaration
class Wheel
{
Car* car;
};
If class Wheel had methods which need to call methods of Car, those methods could be defined in Wheel.cpp and Wheel.cpp is now able to include Car.h without causing a cycle.
The compiler looks for each symbol being used in the current translation unit is previously declared or not in the current unit. It is just a matter of style providing all method signatures at the beginning of a source file while definitions are provided later. The significant use of it is when you use a pointer to a class as member variable of another class.
//foo.h
class bar; // This is useful
class foo
{
bar* obj; // Pointer or even a reference.
};
// foo.cpp
#include "bar.h"
#include "foo.h"
So, use forward-declarations in classes when ever possible. If your program just has functions( with ho header files), then providing prototypes at the beginning is just a matter of style. This would be anyhow the case had if the header file was present in a normal program with header that has only functions.
Because C++ is parsed from the top down, the compiler needs to know about things before they are used. So, when you reference:
int add( int x, int y )
in the main function the compiler needs to know it exists. To prove this try moving it to below the main function and you'll get a compiler error.
So a 'Forward Declaration' is just what it says on the tin. It's declaring something in advance of its use.
Generally you would include forward declarations in a header file and then include that header file in the same way that iostream is included.
The term "forward declaration" in C++ is mostly only used for class declarations. See (the end of) this answer for why a "forward declaration" of a class really is just a simple class declaration with a fancy name.
In other words, the "forward" just adds ballast to the term, as any declaration can be seen as being forward in so far as it declares some identifier before it is used.
(As to what is a declaration as opposed to a definition, again see What is the difference between a definition and a declaration?)
When the compiler sees add(3, 4) it needs to know what that means. With the forward declaration you basically tell the compiler that add is a function that takes two ints and returns an int. This is important information for the compiler becaus it needs to put 4 and 5 in the correct representation onto the stack and needs to know what type the thing returned by add is.
At that time, the compiler is not worried about the actual implementation of add, ie where it is (or if there is even one) and if it compiles. That comes into view later, after compiling the source files when the linker is invoked.
one quick addendum regarding: usually you put those forward references into a header file belonging to the .c(pp) file where the function/variable etc. is implemented. in your example it would look like this:
add.h:
extern int add(int a, int b);
the keyword extern states that the function is actually declared in an external file (could also be a library etc.).
your main.c would look like this:
#include
#include "add.h"
int main()
{
.
.
.
int add(int x, int y); // forward declaration using function prototype
Can you explain "forward declaration"
more further? What is the problem if
we use it in the main() function?
It's same as #include"add.h". If you know,preprocessor expands the file which you mention in #include, in the .cpp file where you write the #include directive. That means, if you write #include"add.h", you get the same thing, it is as if you doing "forward declaration".
I'm assuming that add.h has this line:
int add(int x, int y);
One problem is, that the compiler does not know, which kind of value is delivered by your function; is assumes, that the function returns an int in this case, but this can be as correct as it can be wrong. Another problem is, that the compiler does not know, which kind of arguments your function expects, and cannot warn you, if you are passing values of the wrong kind. There are special "promotion" rules, which apply when passing, say floating point values to an undeclared function (the compiler has to widen them to type double), which is often not, what the function actually expects, leading to hard to find bugs at run-time.
When I started learning C++ I learned that header files should typically be #included in other header files. Just now I had someone tell me that I should include a specific header file in my .cpp file to avoid header include creep.
Could someone tell me what exactly that is, why it is a problem and maybe point me to some documentation on when I would want to include a file in another header and when I should include one in a .cpp file?
The "creep" refers to the inclusion of one header including many others. This has some unwanted consequences:
a tendency towards every source file indirectly including every header, so that a change to any header requires everything to be recompiled;
more likelihood of circular dependencies causing heartache.
You can often avoid including one header from another by just declaring the classes you need, rather than including the header that gives the complete definition. Such an incomplete type can be used in various ways:
// Don't need this
//#include "thingy.h"
// just this
class thingy;
class whatsit {
// You can declare pointers and references
thingy * p;
thingy & r;
// You can declare static member variables (and external globals)
// They will need the header in the source file that defines them
static thingy s;
// You can declare (but not define) functions with incomplete
// parameter or return types
thingy f(thingy);
};
Some things do require the complete definition:
// Do need this
#include "thingy.h"
// Needed for inheritance
class whatsit : thingy {
// Needed for non-static member variables
thingy t;
// Needed to do many things like copying, accessing members, etc
thingy f() {return t;}
};
Could someone tell me what exactly [include creep] is
It's not a programming term, but interpreting it in an Engish-language context would imply that it's the introduction of #include statements that are not necessary.
why it is a problem
Because compiling code takes time. So compiling code that's not necessary takes unnecessary time.
and maybe point me to some documentation on when I would want to include a file in another header and when I should include one in a .cpp file?
If your header requires the definition of a type, you will need to include the header that defines that type.
#include "type.h"
struct TypeHolder
{
Type t;
// ^^^^ this value type requires the definition to know its size.
}
If your header only requires the declaration of a type, the definition is unnecessary, and so is the include.
class Type;
struct TypeHolder
{
Type * t;
// ^^^^^^ this pointer type has the already-known size of a pointer,
// so the definition is not required.
}
As an anecdote that supports the value of this practice, I was once put on a project whose codebase required ONE HOUR to fully compile. And changing a header often incurred most or all of that hour for the next compilation.
Adding forward-declarations to the headers where applicable instead of includes immediately reduced full compilation time to 16 minutes, and changing a header often didn't require a full rebuild.
Well they were probably referring to a couple of things ("include creep" is not a term I've ever heard before). If you, as an extreme rule, only include headers from headers (aside from one matching header per source file):
Compilation time increases as many headers must be compiled for all source files.
Unnecessary dependencies increase, e.g. with a dependency based build system, modifying one header may cause unnecessary recompilation of many other files, increasing compile time.
Increased possibility of namespace pollution.
Circular dependencies become an issue (two headers that need to include each other).
Other more advanced dependency-related topics (for example, this defeats the purpose of opaque pointers, which causes many design issues in certain types of applications).
As a general rule of thumb, include the bare minimum number of things from a header required to compile only the code in that header (that is, enough to allow the header to compile if it is included by itself, but no more than that). This will combat all of the above issues.
For example, if you have a source file for a class that uses std::list in the code, but the class itself has no members or function parameters that use std::list, there's no reason to #include <list> from the class's header, as nothing in the class's header uses std::list. Do it from the source file, where you actually need it, instead of in the header, where everything that uses the class also has to compile <list> unnecessarily.
Another common technique is forward declaring pointer types. This is the only real way to combat circular dependency issues, and has some of the compilation time benefits listed above as well. For example, if two classes refer to each other but only via pointers:
// A.h
class B; // forward declaration instead of #include "B.h"
class A {
B *b;
}
// B.h
class A; // forward declaration instead of #include "A.h"
class B {
A *a;
}
Then include the headers for each in the source files, where the definitions are actually needed.
Your header files should include all headers they need to compile when included alone (or first) in a source file. You may in many cases allow the header to compile by using forward declarations, but you should never rely on someone including a specific header before they include yours.
theres compie time to think of, but there's also dependancies. if A.h includes B.h and viceversa, code won't compile. You can get around this by forward referancing your class, then including the header in the cpp
A.h
class B;
class A
{
public:
void CreateNewB():
private:
B* m_b;
};
A.cpp
#include "B.h"
A::CreateNewB()
{
m_b = new B;
}
Thinking Time - Why do you want to split your file anyway?
As the title suggests, the end problem I have is multiple definition linker errors. I have actually fixed the problem, but I haven't fixed the problem in the correct way. Before starting I want to discuss the reasons for splitting a class file into multiple files. I have tried to put all the possible scenarios here - if I missed any, please remind me and I can make changes. Hopefully the following are correct:
Reason 1 To save space:
You have a file containing the declaration of a class with all class members. You place #include guards around this file (or #pragma once) to ensure no conflicts arise if you #include the file in two different header files which are then included in a source file. You compile a separate source file with the implementation of any methods declared in this class, as it offloads many lines of code from your source file, which cleans things up a bit and introduces some order to your program.
Example: As you can see, the below example could be improved by splitting the implementation of the class methods into a different file. (A .cpp file)
// my_class.hpp
#pragma once
class my_class
{
public:
void my_function()
{
// LOTS OF CODE
// CONFUSING TO DEBUG
// LOTS OF CODE
// DISORGANIZED AND DISTRACTING
// LOTS OF CODE
// LOOKS HORRIBLE
// LOTS OF CODE
// VERY MESSY
// LOTS OF CODE
}
// MANY OTHER METHODS
// MEANS VERY LARGE FILE WITH LOTS OF LINES OF CODE
}
Reason 2 To prevent multiple definition linker errors:
Perhaps this is the main reason why you would split implementation from declaration. In the above example, you could move the method body to outside the class. This would make it look much cleaner and structured. However, according to this question, the above example has implicit inline specifiers. Moving the implementation from within the class to outside the class, as in the example below, will cause you linker errors, and so you would either inline everything, or move the function definitions to a .cpp file.
Example: _The example below will cause "multiple definition linker errors" if you do not move the function definition to a .cpp file or specify the function as inline.
// my_class.hpp
void my_class::my_function()
{
// ERROR! MULTIPLE DEFINITION OF my_class::my_function
// This error only occurs if you #include the file containing this code
// in two or more separate source (compiled, .cpp) files.
}
To fix the problem:
//my_class.cpp
void my_class::my_function()
{
// Now in a .cpp file, so no multiple definition error
}
Or:
// my_class.hpp
inline void my_class::my_function()
{
// Specified function as inline, so okay - note: back in header file!
// The very first example has an implicit `inline` specifier
}
Reason 3 You want to save space, again, but this time you are working with a template class:
If we are working with template classes, then we cannot move the implementation to a source file (.cpp file). That's not currently allowed by (I assume) either the standard or by current compilers. Unlike the first example of Reason 2, above, we are allowed to place the implementation in the header file. According to this question the reason is that template class methods also have implied inline specifiers. Is that correct? (It seems to make sense.) But nobody seemed to know on the question I have just referenced!
So, are the two examples below identical?
// some_header_file.hpp
#pragma once
// template class declaration goes here
class some_class
{
// Some code
};
// Example 1: NO INLINE SPECIFIER
template<typename T>
void some_class::class_method()
{
// Some code
}
// Example 2: INLINE specifier used
template<typename T>
inline void some_class::class_method()
{
// Some code
}
If you have a template class header file, which is becoming huge due to all the functions you have, then I believe you are allowed to move the function definitions to another header file (usually a .tpp file?) and then #include file.tpp at the end of your header file containing the class declaration. You must NOT include this file anywhere else, however, hence the .tpp rather than .hpp.
I assume you could also do this with the inline methods of a regular class? Is that allowed also?
Question Time
So I have made some statements above, most of which relate to the structuring of source files. I think everything I said was correct, because I did some basic research and "found out some stuff", but this is a question and so I don't know for sure.
What this boils down to, is how you would organize code within files. I think I have figured out a structure which will always work.
Here is what I have come up with. (This is my class code file organization/structure standard, if you like. Don't know if it will be very useful yet, that's the point of asking.)
1: Declare the class (template or otherwise) in a .hpp file, including all methods, friend functions and data.
2: At the bottom of the .hpp file, #include a .tpp file containing the implementation of any inline methods. Create the .tpp file and ensure all methods are specified to be inline.
3: All other members (non-inline functions, friend functions and static data) should be defined in a .cpp file, which #includes the .hpp file at the top to prevent errors like "class ABC has not been declared". Since everything in this file will have external linkage, the program will link correctly.
Do standards like this exist in industry? Will the standard I came up with work in all cases?
Your three points sound about right. That's the standard way to do things (although I've not seen .tpp extension before, usually it's .inl), although personally I just put inline functions at the bottom of header files rather than in a separate file.
Here is how I arrange my files. I omit the forward declare file for simple classes.
myclass-fwd.h
#pragma once
namespace NS
{
class MyClass;
}
myclass.h
#pragma once
#include "headers-needed-by-header"
#include "myclass-fwd.h"
namespace NS
{
class MyClass
{
..
};
}
myclass.cpp
#include "headers-needed-by-source"
#include "myclass.h"
namespace
{
void LocalFunc();
}
NS::MyClass::...
Replace pragma with header guards according to preference..
The reason for this approach is to reduce header dependencies, which slow down compile times in large projects. If you didn't know, you can forward declare a class to use as a pointer or reference. The full declaration is only needed when you construct, create or use members of the class.
This means another class which uses the class (takes parameters by pointer/reference) only has to include the fwd header in its own header. The full header is then included in the second class's source file. This greatly reduces the amount of unneeded rubbish you get when pulling in a big header, which pulls in another big header, which pulls in another...
The next tip is the unnamed namespace (sometimes called anonymous namespace). This can only appear in a source file and it is like a hidden namespace only visible to that file. You can place local functions, classes etc here which are only used by the the source file. This prevents name clashes if you create something with the same name in two different files. (Two local function F for example, may give linker errors).
The main reason to separate interface from implementation is so that you don't have to recompile all of your code when something in the implementation changes; you only have to recompile the source files that changed.
As for "Declare the class (template or otherwise)", a template is not a class. A template is a pattern for creating classes. More important, though, you define a class or a template in a header. The class definition includes declarations of its member functions, and non-inine member functions are defined in one or more source files. Inline member functions and all template functions should be defined in the header, by whatever combination of direct definitions and #include directives you prefer.
Do standards like this exist in industry?
Yes. Then again, coding standards that are rather different from the ones you expressed can also be found in industry. You are talking about coding standards, after all, and coding standards range from good to bad to ugly.
Will the standard I came up with work in all cases?
Absolutely not. For example,
template <typename T> class Foo {
public:
void some_method (T& arg);
...
};
Here, the definition of class template Foo doesn't know a thing about that template parameter T. What if, for some class template, the definitions of the methods vary depending on the template parameters? Your rule #2 just doesn't work here.
Another example: What if the corresponding source file is huge, a thousand lines long or longer? At times it makes sense to provide the implementation in multiple source files. Some standards go to the extreme of dictating one function per file (personal opinion: Yech!).
At the other extreme of a thousand-plus line long source file is a class that has no source files. The entire implementation is in the header. There's a lot to be said for header-only implementations. If nothing else, it simplifies, sometimes significantly, the linking problem.
At this link, the following was mentioned:
add.cpp:
int add(int x, int y)
{
return x + y;
}
main.cpp:
#include <iostream>
int add(int x, int y); // forward declaration using function prototype
int main()
{
using namespace std;
cout << "The sum of 3 and 4 is " << add(3, 4) << endl;
return 0;
}
We used a forward declaration so that the compiler would know what "add" was when compiling main.cpp. As previously mentioned, writing forward declarations for every function you want to use that lives in another file can get tedious quickly.
Can you explain "forward declaration" further? What is the problem if we use it in the main function?
Why forward-declare is necessary in C++
The compiler wants to ensure you haven't made spelling mistakes or passed the wrong number of arguments to the function. So, it insists that it first sees a declaration of 'add' (or any other types, classes, or functions) before it is used.
This really just allows the compiler to do a better job of validating the code and allows it to tidy up loose ends so it can produce a neat-looking object file. If you didn't have to forward declare things, the compiler would produce an object file that would have to contain information about all the possible guesses as to what the function add might be. And the linker would have to contain very clever logic to try and work out which add you actually intended to call, when the add function may live in a different object file the linker is joining with the one that uses add to produce a dll or exe. It's possible that the linker may get the wrong add. Say you wanted to use int add(int a, float b), but accidentally forgot to write it, but the linker found an already existing int add(int a, int b) and thought that was the right one and used that instead. Your code would compile, but wouldn't be doing what you expected.
So, just to keep things explicit and avoid guessing, etc, the compiler insists you declare everything before it is used.
Difference between declaration and definition
As an aside, it's important to know the difference between a declaration and a definition. A declaration just gives enough code to show what something looks like, so for a function, this is the return type, calling convention, method name, arguments, and their types. However, the code for the method isn't required. For a definition, you need the declaration and then also the code for the function too.
How forward-declarations can significantly reduce build times
You can get the declaration of a function into your current .cpp or .h file by #includ'ing the header that already contains a declaration of the function. However, this can slow down your compile, especially if you #include a header into a .h instead of .cpp of your program, as everything that #includes the .h you're writing would end up #include'ing all the headers you wrote #includes for too. Suddenly, the compiler has #included pages and pages of code that it needs to compile even when you only wanted to use one or two functions. To avoid this, you can use a forward-declaration and just type the declaration of the function yourself at the top of the file. If you're only using a few functions, this can really make your compiles quicker compared to always #including the header. For really large projects, the difference could be an hour or more of compile time bought down to a few minutes.
Break cyclic references where two definitions both use each other
Additionally, forward-declarations can help you break cycles. This is where two functions both try to use each other. When this happens (and it is a perfectly valid thing to do), you may #include one header file, but that header file tries to #include the header file you're currently writing... which then #includes the other header, which #includes the one you're writing. You're stuck in a chicken and egg situation with each header file trying to re #include the other. To solve this, you can forward-declare the parts you need in one of the files and leave the #include out of that file.
Eg:
File Car.h
#include "Wheel.h" // Include Wheel's definition so it can be used in Car.
#include <vector>
class Car
{
std::vector<Wheel> wheels;
};
File Wheel.h
Hmm... the declaration of Car is required here as Wheel has a pointer to a Car, but Car.h can't be included here as it would result in a compiler error. If Car.h was included, that would then try to include Wheel.h which would include Car.h which would include Wheel.h and this would go on forever, so instead the compiler raises an error. The solution is to forward declare Car instead:
class Car; // forward declaration
class Wheel
{
Car* car;
};
If class Wheel had methods which need to call methods of Car, those methods could be defined in Wheel.cpp and Wheel.cpp is now able to include Car.h without causing a cycle.
The compiler looks for each symbol being used in the current translation unit is previously declared or not in the current unit. It is just a matter of style providing all method signatures at the beginning of a source file while definitions are provided later. The significant use of it is when you use a pointer to a class as member variable of another class.
//foo.h
class bar; // This is useful
class foo
{
bar* obj; // Pointer or even a reference.
};
// foo.cpp
#include "bar.h"
#include "foo.h"
So, use forward-declarations in classes when ever possible. If your program just has functions( with ho header files), then providing prototypes at the beginning is just a matter of style. This would be anyhow the case had if the header file was present in a normal program with header that has only functions.
Because C++ is parsed from the top down, the compiler needs to know about things before they are used. So, when you reference:
int add( int x, int y )
in the main function the compiler needs to know it exists. To prove this try moving it to below the main function and you'll get a compiler error.
So a 'Forward Declaration' is just what it says on the tin. It's declaring something in advance of its use.
Generally you would include forward declarations in a header file and then include that header file in the same way that iostream is included.
The term "forward declaration" in C++ is mostly only used for class declarations. See (the end of) this answer for why a "forward declaration" of a class really is just a simple class declaration with a fancy name.
In other words, the "forward" just adds ballast to the term, as any declaration can be seen as being forward in so far as it declares some identifier before it is used.
(As to what is a declaration as opposed to a definition, again see What is the difference between a definition and a declaration?)
When the compiler sees add(3, 4) it needs to know what that means. With the forward declaration you basically tell the compiler that add is a function that takes two ints and returns an int. This is important information for the compiler becaus it needs to put 4 and 5 in the correct representation onto the stack and needs to know what type the thing returned by add is.
At that time, the compiler is not worried about the actual implementation of add, ie where it is (or if there is even one) and if it compiles. That comes into view later, after compiling the source files when the linker is invoked.
one quick addendum regarding: usually you put those forward references into a header file belonging to the .c(pp) file where the function/variable etc. is implemented. in your example it would look like this:
add.h:
extern int add(int a, int b);
the keyword extern states that the function is actually declared in an external file (could also be a library etc.).
your main.c would look like this:
#include
#include "add.h"
int main()
{
.
.
.
int add(int x, int y); // forward declaration using function prototype
Can you explain "forward declaration"
more further? What is the problem if
we use it in the main() function?
It's same as #include"add.h". If you know,preprocessor expands the file which you mention in #include, in the .cpp file where you write the #include directive. That means, if you write #include"add.h", you get the same thing, it is as if you doing "forward declaration".
I'm assuming that add.h has this line:
int add(int x, int y);
One problem is, that the compiler does not know, which kind of value is delivered by your function; is assumes, that the function returns an int in this case, but this can be as correct as it can be wrong. Another problem is, that the compiler does not know, which kind of arguments your function expects, and cannot warn you, if you are passing values of the wrong kind. There are special "promotion" rules, which apply when passing, say floating point values to an undeclared function (the compiler has to widen them to type double), which is often not, what the function actually expects, leading to hard to find bugs at run-time.