Okay so I recently learned how exactly the compiler works and what the "linker" is. From the tutorial videos I've watched, I understood that if I include a declaration more than once, say:
void Log(const char* message);
I would get an error since I am declaring it more than once. But currently, as I am testing it, I've created a header file which contains that particular declaration and I've included it a couple of times in my Main compilation unit, as so:
#include "Log.h"
#include "Log.h"
I have removed the #pragma once statement and have not written header guards, but my program still runs perfectly and without any problems. Since the videos are 2-3 years old, I thought maybe there has been an update which altogether removes the need for guards and pragmas, but I do not know for sure.
The tutorials you've seen are correct. You cannot have more than one definition of something unless you use special techniques.
In this case though you don't have a definition.
void Log(const char* message);
is a declaration and you are allowed to have multiples of those. If you change the code to
void Log(const char* message) {}
then you would have a function definition and will get an error.
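As a rough single-file sketch of the difference: the two declarations below stand in for what an unguarded "Log.h" contributes when it is included twice, followed by the one allowed definition. The buffer `g_messages` and the helper `LogTwoLines` are hypothetical names added only so the behaviour can be observed.

```cpp
#include <cassert>
#include <string>

// Stand-in for "Log.h" being included twice without guards:
// the same declaration simply appears twice, which is legal.
void Log(const char* message);
void Log(const char* message);

// Hypothetical buffer (not in the original) so the effect is observable.
static std::string g_messages;

// The single definition. A second body for Log(const char*) in this
// file would be rejected by the compiler as a redefinition.
void Log(const char* message) {
    g_messages += message;
    g_messages += '\n';
}

// Hypothetical helper: logs two lines and returns everything logged so far.
std::string LogTwoLines() {
    Log("first");
    Log("second");
    assert(!g_messages.empty());
    return g_messages;
}
```

Calling `LogTwoLines()` once from `main` should return the two lines in order, which shows that the repeated declaration had no effect on the program.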
I would get an error since I am declaring it more than once.
Re-declaration is generally allowed, as long as you don't mix different kinds of declarations with the same name. The following is perfectly legal C++, and always has been:
void Log(const char* message);
void Log(const char* message);
You may have been confused with the one definition rule, which disallows defining things more than once.
I have removed the #pragma once statement, nor do I have header guards written, but my program still runs perfectly and without any problems.
If your header doesn't define anything, then it doesn't need a header guard. It is, however, simpler to always keep the guard by convention, so that there is no need to keep track of whether the header contains definitions or not.
Bonus answer: All definitions are also declarations. It is usually easy to distinguish definitions of classes and functions from forward declarations:
return_type function_name(argument_list); // not a definition of function
return_type function_name(argument_list) { ... } // is a definition of function
class class_name; // not a definition of class
class class_name { // is a definition of class
void member_function(); // not a definition of function
void inline_member_function() { ... }; // is a definition of function
};
void class_name::member_function() { ... } // is a definition of function
Distinguishing variable definitions is a bit harder. Always check the rules when unsure.
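For variables, a minimal sketch of the usual cases might look like this. The names `counter`, `Settings` and `SumOfGlobals` are made up for illustration.

```cpp
#include <cassert>

extern int counter;           // declaration only: no storage is created here
extern int counter;           // repeating a declaration is fine

int counter = 3;              // definition: storage and initial value
                              // (note: `extern int counter = 3;` would also be a definition)

struct Settings {
    static int verbosity;     // declaration of a static data member
};
int Settings::verbosity = 2;  // its one definition, outside the class

// Hypothetical helper that reads both variables.
int SumOfGlobals() {
    assert(counter == 3);
    return counter + Settings::verbosity;
}
```

The rule of thumb this tries to show: `extern` without an initializer only declares, while a plain `int counter;` at namespace scope already defines.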
This is a function forward declaration: you are just letting the compiler know that a function will be defined later. Some resources say that multiple declarations aren't allowed, but I think that is a clean-code recommendation, not a compiler restriction. In your case you just include the declaration twice, which is the same as declaring the function in two different header files and including both of them in one source file.
Cherno's tutorials?
I think it's made crystal clear in the videos that you can't have multiple definitions of a function. The custom header files that you've created are basically chunks of code that get copy-pasted, so if they bring in more than one definition of the same function or class, the compiler sees a redefinition and reports an error as expected.
Edit: The point that he wanted to make -
If you write the same function definition twice in one file, you will obviously get an error, and it is detected by the compiler because everything is in a single file.
But when you place the definition in a separate file, say your custom header "log.h", and include that header in two different cpp files (for example when building a solution in Visual Studio), you get a linker error. The linker's job is to combine the compiled files into a single executable, and it cannot choose between multiple definitions of the same function coming from different files. Hence in this case you will get the multiple-definition error. (#pragma once and include guards only stop a header from being included twice in the same cpp file; they do not prevent this linker error.)
One solution is to make the functions static, so that each one has internal linkage and is visible only in the file being compiled. This makes it possible to have definitions of the same function in different files with no linking error. Another option is to make them inline. Both cases allow multiple definitions with no errors; otherwise you will get errors.
I have what seems a relatively simple question, but one that keeps defying my efforts to understand it.
I apologise if it is a simple question, but like many simple questions, I can't seem to find a solid explanation anywhere.
With the below code:
/*foo.c*/
#include "bar.h"
int main() {
return(my_function(1,2));
}
/*bar.h*/
int my_function(int,int);
/*bar.c*/
#include "bar.h" /*is this necessary!?*/
int my_function(int x, int y) {
return(x+y);
}
Simply, is the second inclusion necessary? I don't understand why I keep seeing headers included in both source files. Surely if the function is declared in "foo.c" by including "bar.h," it does not need to be declared a second time in another linked source file (especially the one which actually defines it)??? A friend tried to explain to me that it didn't really matter for functions, but it did for structs, something which still eludes me! Help!
Is it simply for clarity, so that programmers can see which functions are being used externally?
I just don't get it!
Thanks!
In this particular case, it's unnecessary for the reason you described. It might be useful in situations where you have a more complex set of functions that might all depend on each other. If you include the header at the top of the .cpp file, you have effectively forward-declared every single function and so you don't have to worry about making sure your function definitions are in a certain order.
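A small sketch of the ordering point, assuming a hypothetical header "parity.h" whose contents are written out inline here: once both functions are declared, their definitions can refer to each other in any order.

```cpp
#include <cassert>

// What a hypothetical "parity.h" would contribute once included:
bool IsEven(unsigned n);
bool IsOdd(unsigned n);

// IsEven calls IsOdd before IsOdd's body has been seen;
// the declaration above is all the compiler needs.
bool IsEven(unsigned n) { return n == 0 ? true : IsOdd(n - 1); }
bool IsOdd(unsigned n)  { return n == 0 ? false : IsEven(n - 1); }

// Hypothetical self-check used only for illustration.
bool ParityWorks() {
    assert(IsEven(0));
    return IsEven(10) && IsOdd(7) && !IsOdd(4);
}
```

Without the two declarations at the top, the body of `IsEven` would not compile, because `IsOdd` would not yet be known at that point.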
I also find that it clearly shows that these function definitions correspond to those declarations. This makes it easier for the reader to find how translation units depend on each other. Of course, the names of the files might be sufficient, but some more complex projects do not have one-to-one relationship between .cpp files and .h files. Sometimes headers are broken up into multiple parts, or many implementation files will have their external functions declared in a single header (common for large modules).
Really, all inclusions are unnecessary. You can always, after all, just duplicate the declarations (or definitions, in the case of classes) across all of the files that require them. We use the preprocessor to simplify this task and reduce the amount of redundant code. It's easier to stick to a pattern of always including the corresponding header because it will always work, rather than have to check each file every time you edit them and determine if the inclusion is necessary or not.
The way the C language (and C++) is designed is that the compiler processes each .c file in isolation.
You typically launch your compiler (cl.exe or gcc, for example) for one of your c files, and this produces one object file (.o or .obj).
Once all your object files have been generated, you run the linker, passing it all the object files, and it will tie them together into an executable.
That's why every .c file needs to include the headers it depends on. When the compiler is processing it, it knows nothing about which other .c files you may have. All it knows is the contents of the .c file you point it to, as well as the headers it includes.
In your simplified example, the inclusion of "bar.h" in "bar.c" is not necessary. But in the real world, in most cases it would be. If you have a class declaration in "bar.h", and "bar.c" has functions of this class, the inclusion is needed. If you have any other declaration which is used in "bar.c" (be it a constant, enum, etc.), the include is again needed. Because in the real world it is nearly always needed, the easy rule is: always include the header file in the corresponding source file.
If the header only declares global functions, and the source file only implements them (without calling any of them) then it's not strictly necessary. But that's not usually the case; in a large program, you rarely want global functions.
If the header defines a class, then you'll need to include it in the source file in order to define member functions:
void Thing::function(int x) {
//^^^^^^^ needs class definition
}
If the header declares functions in a namespace, then it's a good idea to put the definitions outside the namespace:
void ns::function(int x) {
//^^^^ needs previous declaration
}
This will give a nice compile-time error if the parameter types don't match a previous declaration - for which you'd need to include the header. Defining the function inside its namespace
namespace ns {
void function(int x) {
// ...
}
}
will silently declare a new overload if you get the parameter types wrong.
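The two behaviours can be put side by side in one file. This is only a sketch: the `int` return type and the `long` overload are assumptions made so the result is observable, and `CallBoth` is a hypothetical helper.

```cpp
#include <cassert>

namespace ns {
int function(int x);   // the declaration a header would provide
}

// Qualified definition: it must match a previous declaration in ns.
// Writing `int ns::function(long x)` here instead would fail to compile.
int ns::function(int x) { return x + 1; }

namespace ns {
// Defined inside the namespace with a different parameter type:
// accepted silently as a brand-new overload, not a definition of the above.
int function(long x) { return static_cast<int>(x) - 1; }
}

// Hypothetical helper showing that each overload is picked by argument type.
int CallBoth() {
    assert(ns::function(1) == 2);
    return ns::function(1) * 10 + ns::function(1L);
}
```

If the `long` version had been a typo for `int`, the qualified form would have caught it at compile time, while the in-namespace form leaves the declared function without a definition until link time.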
The simple rule is this (considering foo is a member function of some class):
So, if some header file declares a function, say:
//foo.h
void foo (int x);
The compiler needs to see this declaration wherever you define this function (to make sure your definition is in line with the declaration) and wherever you call it (to make sure you call the function with the correct number and types of arguments).
That means you have to include foo.h everywhere you call that function and where you provide its definition.
Also, if foo is a global function (not inside any namespace), then there is no need to include foo.h in the implementation file.
Thinking Time - Why do you want to split your file anyway?
As the title suggests, the end problem I have is multiple definition linker errors. I have actually fixed the problem, but I haven't fixed the problem in the correct way. Before starting I want to discuss the reasons for splitting a class file into multiple files. I have tried to put all the possible scenarios here - if I missed any, please remind me and I can make changes. Hopefully the following are correct:
Reason 1 To save space:
You have a file containing the declaration of a class with all class members. You place #include guards around this file (or #pragma once) to ensure no conflicts arise if you #include the file in two different header files which are then included in a source file. You compile a separate source file with the implementation of any methods declared in this class, as it offloads many lines of code from your source file, which cleans things up a bit and introduces some order to your program.
Example: As you can see, the below example could be improved by splitting the implementation of the class methods into a different file. (A .cpp file)
// my_class.hpp
#pragma once
class my_class
{
public:
void my_function()
{
// LOTS OF CODE
// CONFUSING TO DEBUG
// LOTS OF CODE
// DISORGANIZED AND DISTRACTING
// LOTS OF CODE
// LOOKS HORRIBLE
// LOTS OF CODE
// VERY MESSY
// LOTS OF CODE
}
// MANY OTHER METHODS
// MEANS VERY LARGE FILE WITH LOTS OF LINES OF CODE
};
Reason 2 To prevent multiple definition linker errors:
Perhaps this is the main reason why you would split implementation from declaration. In the above example, you could move the method body to outside the class. This would make it look much cleaner and structured. However, according to this question, the above example has implicit inline specifiers. Moving the implementation from within the class to outside the class, as in the example below, will cause you linker errors, and so you would either inline everything, or move the function definitions to a .cpp file.
Example: The example below will cause "multiple definition linker errors" if you do not move the function definition to a .cpp file or specify the function as inline.
// my_class.hpp
void my_class::my_function()
{
// ERROR! MULTIPLE DEFINITION OF my_class::my_function
// This error only occurs if you #include the file containing this code
// in two or more separate source (compiled, .cpp) files.
}
To fix the problem:
//my_class.cpp
void my_class::my_function()
{
// Now in a .cpp file, so no multiple definition error
}
Or:
// my_class.hpp
inline void my_class::my_function()
{
// Specified function as inline, so okay - note: back in header file!
// The very first example has an implicit `inline` specifier
}
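Putting the in-class and the out-of-class `inline` forms together, a header could look roughly like the sketch below. `Counter`, "counter.hpp" and `CountTo` are hypothetical names, and everything is shown in one file for brevity.

```cpp
#include <cassert>

// Contents a hypothetical "counter.hpp" might hold.
class Counter {
public:
    void Increment() { ++value_; }  // defined in the class: implicitly inline
    int Value() const;              // declared only
private:
    int value_ = 0;
};

// Defined outside the class but still in the header: needs `inline`
// so that several .cpp files may include it without a link error.
inline int Counter::Value() const { return value_; }

// Hypothetical usage helper; expects n >= 0.
int CountTo(int n) {
    Counter c;
    for (int i = 0; i < n; ++i) c.Increment();
    assert(c.Value() == n);
    return c.Value();
}
```

Dropping the `inline` keyword from `Counter::Value` would still compile in a single source file; the error only shows up once a second source file includes the same header.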
Reason 3 You want to save space, again, but this time you are working with a template class:
If we are working with template classes, then we cannot move the implementation to a source file (.cpp file). That's not currently allowed by (I assume) either the standard or by current compilers. Unlike the first example of Reason 2, above, we are allowed to place the implementation in the header file. According to this question the reason is that template class methods also have implied inline specifiers. Is that correct? (It seems to make sense.) But nobody seemed to know on the question I have just referenced!
So, are the two examples below identical?
// some_header_file.hpp
#pragma once
// template class declaration goes here
template<typename T>
class some_class
{
public:
void class_method();
// Some code
};
// Example 1: NO INLINE SPECIFIER
template<typename T>
void some_class<T>::class_method()
{
// Some code
}
// Example 2: INLINE specifier used
// (alternative to Example 1; keep only one of the two definitions)
template<typename T>
inline void some_class<T>::class_method()
{
// Some code
}
If you have a template class header file, which is becoming huge due to all the functions you have, then I believe you are allowed to move the function definitions to another header file (usually a .tpp file?) and then #include file.tpp at the end of your header file containing the class declaration. You must NOT include this file anywhere else, however, hence the .tpp rather than .hpp.
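A sketch of how such a split might look, with the two files shown back to back in one listing. `Box`, "box.hpp", "box.tpp" and `Unbox` are hypothetical names chosen for illustration.

```cpp
#include <cassert>

// --- what a hypothetical "box.hpp" would declare ---
template <typename T>
class Box {
public:
    explicit Box(T value);
    const T& Get() const;
private:
    T value_;
};

// --- what a hypothetical "box.tpp", included at the end of box.hpp, would hold ---
// Member functions of a class template may be defined in a header
// without the `inline` keyword.
template <typename T>
Box<T>::Box(T value) : value_(value) {}

template <typename T>
const T& Box<T>::Get() const { return value_; }

// Hypothetical helper used only to exercise the template.
int Unbox(int v) {
    Box<int> b(v);
    assert(b.Get() == v);
    return b.Get();
}
```

In a real project the second half would sit in its own file and "box.hpp" would end with `#include "box.tpp"`, so that clients only ever include the .hpp.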
I assume you could also do this with the inline methods of a regular class? Is that allowed also?
Question Time
So I have made some statements above, most of which relate to the structuring of source files. I think everything I said was correct, because I did some basic research and "found out some stuff", but this is a question and so I don't know for sure.
What this boils down to, is how you would organize code within files. I think I have figured out a structure which will always work.
Here is what I have come up with. (This is my class code file organization/structure standard, if you like. Don't know if it will be very useful yet, that's the point of asking.)
1: Declare the class (template or otherwise) in a .hpp file, including all methods, friend functions and data.
2: At the bottom of the .hpp file, #include a .tpp file containing the implementation of any inline methods. Create the .tpp file and ensure all methods are specified to be inline.
3: All other members (non-inline functions, friend functions and static data) should be defined in a .cpp file, which #includes the .hpp file at the top to prevent errors like "class ABC has not been declared". Since everything in this file will have external linkage, the program will link correctly.
Do standards like this exist in industry? Will the standard I came up with work in all cases?
Your three points sound about right. That's the standard way to do things (although I've not seen .tpp extension before, usually it's .inl), although personally I just put inline functions at the bottom of header files rather than in a separate file.
Here is how I arrange my files. I omit the forward declare file for simple classes.
myclass-fwd.h
#pragma once
namespace NS
{
class MyClass;
}
myclass.h
#pragma once
#include "headers-needed-by-header"
#include "myclass-fwd.h"
namespace NS
{
class MyClass
{
..
};
}
myclass.cpp
#include "headers-needed-by-source"
#include "myclass.h"
namespace
{
void LocalFunc();
}
NS::MyClass::...
Replace pragma with header guards according to preference.
The reason for this approach is to reduce header dependencies, which slow down compile times in large projects. If you didn't know, you can forward declare a class in order to use it as a pointer or reference. The full definition is only needed when you create an object of the class or use its members.
This means another class which uses the class (takes parameters by pointer/reference) only has to include the fwd header in its own header. The full header is then included in the second class's source file. This greatly reduces the amount of unneeded rubbish you get when pulling in a big header, which pulls in another big header, which pulls in another...
The next tip is the unnamed namespace (sometimes called anonymous namespace). It should only appear in a source file, and it acts like a hidden namespace visible only to that file. You can place local functions, classes etc. here which are only used by the source file. This prevents name clashes if you create something with the same name in two different files. (Two local functions named F with external linkage, for example, may give linker errors.)
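Both tips can be sketched in a single file. `Engine`, `Horsepower`, `Clamp` and `DemoHorsepower` are hypothetical names. In a real project the forward declaration would live in a -fwd header, the class in its own header, and the unnamed namespace in the .cpp file.

```cpp
#include <cassert>

class Engine;                        // forward declaration (as in a -fwd header)

// A reference to an incomplete type is enough for a function declaration.
int Horsepower(const Engine& e);

class Engine {                       // full definition, needed to use members
public:
    explicit Engine(int hp) : hp_(hp) {}
    int hp() const { return hp_; }
private:
    int hp_;
};

namespace {
// Internal linkage: another source file may define its own Clamp
// without clashing with this one at link time.
int Clamp(int v) { return v < 0 ? 0 : v; }
}

int Horsepower(const Engine& e) { return Clamp(e.hp()); }

// Hypothetical helper used only for illustration.
int DemoHorsepower() {
    assert(Horsepower(Engine(-5)) == 0);
    return Horsepower(Engine(120));
}
```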
The main reason to separate interface from implementation is so that you don't have to recompile all of your code when something in the implementation changes; you only have to recompile the source files that changed.
As for "Declare the class (template or otherwise)", a template is not a class. A template is a pattern for creating classes. More important, though, you define a class or a template in a header. The class definition includes declarations of its member functions, and non-inline member functions are defined in one or more source files. Inline member functions and all template functions should be defined in the header, by whatever combination of direct definitions and #include directives you prefer.
Do standards like this exist in industry?
Yes. Then again, coding standards that are rather different from the ones you expressed can also be found in industry. You are talking about coding standards, after all, and coding standards range from good to bad to ugly.
Will the standard I came up with work in all cases?
Absolutely not. For example,
template <typename T> class Foo {
public:
void some_method (T& arg);
...
};
Here, the definition of class template Foo doesn't know a thing about that template parameter T. What if, for some class template, the definitions of the methods vary depending on the template parameters? Your rule #2 just doesn't work here.
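One way method definitions can vary with the template parameter is explicit specialization of a member function. The sketch below is a guess at the kind of case the answer has in mind, not something it states. The `std::string` return type and the helper `Describe` are assumptions added so the difference is visible.

```cpp
#include <cassert>
#include <string>

template <typename T>
class Foo {
public:
    std::string some_method(T& arg);
};

// Generic definition, used for most T.
template <typename T>
std::string Foo<T>::some_method(T&) { return "generic"; }

// Explicit specialization for T = int: a different body for one parameter.
// It is an ordinary function, so in a header it needs `inline`.
template <>
inline std::string Foo<int>::some_method(int&) { return "int"; }

// Hypothetical helper that calls both versions.
std::string Describe() {
    int i = 0;
    double d = 0.0;
    Foo<int> fi;
    Foo<double> fd;
    std::string result = fi.some_method(i) + "," + fd.some_method(d);
    assert(!result.empty());
    return result;
}
```

The specialization has to be visible before its first use, which is one reason such definitions tend to stay in the header next to the template.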
Another example: What if the corresponding source file is huge, a thousand lines long or longer? At times it makes sense to provide the implementation in multiple source files. Some standards go to the extreme of dictating one function per file (personal opinion: Yech!).
At the other extreme of a thousand-plus line long source file is a class that has no source files. The entire implementation is in the header. There's a lot to be said for header-only implementations. If nothing else, it simplifies, sometimes significantly, the linking problem.
I'm currently writing a program, and couldn't figure out why I got an error (note: I already fixed it, I'm curious about WHY the error was there and what this implies about including .h files).
Basically, my program was structured as follows:
The current file I'm working with, I'll call Current.cc (which is an implementation of Current.h).
Current.cc included a header file, named CalledByCurrent.h (which has an associated implementation called CalledByCurrent.cc). CalledByCurrent.h contains a class definition.
There was a non-class function defined in CalledByCurrent.cc called thisFunction(). thisFunction() was not declared in CalledByCurrent.h since it was not actually a member function of the class (just a little helper function). In Current.cc, I needed to use this function, so I just redefined thisFunction() at the top of Current.cc. However, when I did this, I got an error saying that the function was duplicated. Why is this, when thisFunction() wasn't even declared in CalledByCurrent.h?
Thus, I just removed the function from Current.cc, now assuming that Current.cc had access to thisFunction() from CalledByCurrent.cc. However, when I did this, I found that Current.cc did not know what function I was talking about. What the heck? I then copied the function definition for thisFunction() to the top of my CalledByCurrent.h file and this resolved the problem. Could you help me understand this behavior? Particularly, why would it think there was a duplicate, yet it didn't know how to use the original?
p.s - I apologize for how confusing this post is. Please let me know if there's anything I can clear up.
You are getting multiple definitions from the linker - it sees two functions with the same name and complains. For example:
// a.cpp
void f() {}
// b.cpp
void f() {}
then
g++ a.cpp b.cpp
gives:
C:\Users\neilb\Temp\ccZU9pkv.o:b.cpp:(.text+0x0): multiple definition of `f()'
The way round this is to either put the definition in only one .cpp file, or to declare one or both of the functions as static:
// b.cpp
static void f() {}
You can't have two global functions with the same name (even in 2 different translation units). To avoid getting the linker error define the function as static so that it is not visible outside the translation unit.
EDIT
You can use the function in the other .cpp file by using extern keyword. See this example:
//Test.cpp
void myfunc()
{
}
//Main.cpp
extern void myfunc();
int main()
{
myfunc();
}
It will call myfunc() defined in Test.cpp.
The header file inclusion mechanism should be tolerant to duplicate header file inclusions.
That's because a function you declare without static has external linkage by default (whether you declare it in a header file or not), so the linker ends up with multiple implementations for the same function signature.
If those functions are truly helper functions, then declare them as:
static void thisFunction();
Otherwise, if both files use the same helper function, simply declare it in a common header file, say:
//CalledByCurrent.h (is included in both .cc files)
void thisFunction();
And implement thisFunction() in either of the .cc files. This should solve the problem properly.
Here are some ideas:
You didn't put a header include guard in your header file. If it's being included twice, you might get this sort of error.
The function's prototype (at the top) doesn't match its signature 100%.
You put the body of the function in the header file.
You have two functions of the same signature in two different source files, but they aren't marked static.
If you are using gcc (you didn't say what compiler you're using), you can use the -E switch to view the preprocessor output. This includes expanding all #defines and including all #includes.
Each time something is expanded, it tells you what file and line it was in. Using this you can see where thisFunction() is defined.
There are 2 distinct errors coming from 2 different phases of the build.
In the first case where you have a duplicate, the COMPILER is happy, but the LINKER is complaining because when it picks up all the function definitions across the different source files it notices 2 are named the same. As the other answers state, you can use the static keyword or use a common definition.
In the second case, where you see your function not declared in this scope, the COMPILER is complaining, because each file needs to know which functions it can use.
Compiling happens before linking, so the COMPILER cannot know ahead of time whether or not the LINKER will find a matching function. That's why you use declarations to notify the COMPILER that a definition will be found by the LINKER later on.
As you can see, your 2 errors are not contradictory, they are the result of 2 separate processes in the build that have a particular order.
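The two phases can be imitated in one file, with comments marking which source file each part would belong to. The `int` signature of `thisFunction` and the helpers `UseHelper` and `BuildPhasesDemo` are assumptions made for illustration.

```cpp
#include <cassert>

// Imagine this line lives in a shared header. It is all the
// compiler needs in order to accept calls to thisFunction.
int thisFunction(int x);

// Imagine this is Current.cc: it compiles because of the declaration
// above, even though no body is visible yet.
int UseHelper(int x) { return thisFunction(x) + 1; }

// Imagine this is CalledByCurrent.cc: the single definition that the
// linker resolves the call to. A second non-static body in another file
// would be a "multiple definition" link error; no body at all would be
// an "undefined reference" link error.
int thisFunction(int x) { return x * 2; }

// Hypothetical self-check used only for illustration.
bool BuildPhasesDemo() {
    assert(thisFunction(0) == 0);
    return UseHelper(5) == 11;
}
```

Removing the declaration at the top reproduces the "not declared in this scope" compiler error, while adding a second body reproduces the duplicate-definition error.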
When we design classes in Java, Vala, or C# we put the definition and declaration in the same source file. But in C++ it is traditionally preferred to separate the definition and declaration in two or more files.
What happens if I just use a header file and put everything into it, like Java?
Is there a performance penalty or something?
The answer depends on what kind of class you're creating.
C++'s compilation model dates back to the days of C, and so its method of importing data from one source file into another is comparatively primitive. The #include directive literally copies the contents of the file you're including into the source file, then treats the result as though it was the file you had written all along. You need to be careful about this because of a C++ policy called the one definition rule (ODR) which states, unsurprisingly, that every function and class should have at most one definition. This means that if you declare a class somewhere, all of that class's member functions should be either not defined at all or defined exactly once in exactly one file. There are some exceptions (I'll get to them in a minute), but for now just treat this rule as if it's a hard-and-fast, no-exceptions rule.
If you take a non-template class and put both the class definition and the implementation into a header file, you might run into trouble with the one definition rule. In particular, suppose that I have two different .cpp files that I compile, both of which #include your header containing both the implementation and the interface. In this case, if I try linking those two files together, the linker will find that each one contains a copy of the implementation code for the class's member functions. At this point, the linker will report an error because you have violated the one definition rule: there are two different implementations of all the class's member functions.
To prevent this, C++ programmers typically split classes up into a header file which contains the class declaration, along with the declarations of its member functions, without the implementations of those functions. The implementations are then put into a separate .cpp file which can be compiled and linked separately. This allows your code to avoid running into trouble with the ODR. Here's how. First, whenever you #include the class header file into multiple different .cpp files, each of them just gets a copy of the declarations of the member functions, not their definitions, and so none of your class's clients will end up with the definitions. This means that any number of clients can #include your header file without running into trouble at link-time. Since your own .cpp file with the implementation is the sole file that contains the implementations of the member functions, at link time you can merge it with any number of other client object files without a hassle. This is the main reason that you split the .h and .cpp files apart.
Of course, the ODR has a few exceptions. The first of these comes up with template functions and classes. The ODR explicitly states that you can have multiple different definitions for the same template class or function, provided that they're all equivalent. This is primarily to make it easier to compile templates - each C++ file can instantiate the same template without colliding with any other files. For this reason, and a few other technical reasons, class templates tend to just have a .h file without a matching .cpp file. Any number of clients can #include the file without trouble.
The other major exception to the ODR involves inline functions. The spec allows an inline function to be defined in more than one translation unit, as long as every definition is identical, so if you have a header file with an implementation of a class member function that's marked inline, that's perfectly fine. Any number of files can #include this file without breaking the ODR. Interestingly, any member function that's declared and defined in the body of a class is implicitly inline, so if you have a header like this:
#ifndef Include_Guard
#define Include_Guard
class MyClass {
public:
void DoSomething() {
/* ... code goes here ... */
}
};
#endif
Then you're not risking breaking the ODR. If you rewrite this as
#ifndef Include_Guard
#define Include_Guard
class MyClass {
public:
void DoSomething();
};
void MyClass::DoSomething() {
/* ... code goes here ... */
}
#endif
then you would be breaking the ODR, since the member function isn't marked inline and if multiple clients #include this file there will be multiple definitions of MyClass::DoSomething.
So to summarize - you should probably split up your classes into a .h/.cpp pair to avoid breaking the ODR. However, if you're writing a class template, you don't need the .cpp file (and probably shouldn't have one at all), and if you're okay marking every single member function of your class inline you can also avoid the .cpp file.
The drawback of putting definitions in header files is as follows:
Header file A - contains the definition of methodA()
Header file B - includes header file A.
Now let us say you change the definition of methodA. You would need to recompile every source file that includes header A or header B, because header B pulls in header A.
The biggest difference is that every function is declared as an inline function. Generally your compiler will be smart enough that this won't be a problem, but worst case scenario it will cause page faults on a regular basis and make your code embarrassingly slow. Usually the code is separated for design reasons, and not for performance.
In general, it is good practice to separate implementation from headers. However, there are exceptions in cases like templates, where the implementation goes in the header itself.
Two particular problems with putting everything in the header:
Compile times will be increased, sometimes greatly. C++ compile times are long enough that that's not something you want.
If you have circular dependencies in the implementation, keeping everything in headers is difficult to impossible. For example:
header1.h
struct C1
{
void f();
void g();
};
header2.h
struct C2
{
void f();
void g();
};
impl1.cpp
#include "header1.h"
#include "header2.h"
void C1::f()
{
C2 c2;
c2.f();
}
impl2.cpp
#include "header2.h"
#include "header1.h"
void C2::g()
{
C1 c1;
c1.g();
}
While I was reading the accepted answer of this question, I had the following question:
Typically, methods are declared in header files (.hpp or whatever), and implemented in source files (.cpp or whatever).
One of the main reasons it is bad practice to ever include a "source file" (#include <source_file.cpp>) is that its methods implementation would then be duplicated, resulting in linking errors.
When one writes:
#ifndef BRITNEYSPEARS_HPP
#define BRITNEYSPEARS_HPP
class BritneySpears
{
public:
BritneySpears() {}; // Here the constructor has implementation.
};
#endif /* BRITNEYSPEARS_HPP */
Here one is giving the implementation of the constructor (an "empty" implementation, but still).
But why, then, does including this header file multiple times (i.e. in different source files) not generate a "duplicate definition" error at link time?
Inline functions are exceptions to the "one definition rule": you are allowed to have identical implementations of them in more than one compilation unit. Functions are inline if they are declared inline or implemented inside a class definition.
Member functions with implementation inside the class definition are treated as inline functions. Inline functions are exempt from one definition rule.
Specifically, when the linker sees two inline functions with the same signature, it treats them as the same function and just picks one of them. If the definitions actually differ, this can lead to really weird, hard-to-detect problems.
Because it is an "inline" function. Inline functions can be included from headers as many times as you like and they don't cause duplicate definition linker errors.
The compiler will also try to bring them inline so, in your example above, the compiler will try and eliminate the call to the constructor completely.