So I have made a complex project and now I have too many include files causing me headaches. How can I best manage these classes? Some classes need to use other classes. I also have a .h file containing a bunch of arrays of int. These stay the same through the application but I get the problem when the compiler complains that I am redefining the array.
Should I make a class library? Namespace? DLL? What is the best practice and where can I find out how to do the right one?
Use include guards in all your headers.
file.h
#ifndef FILE_H_INCLUDED
#define FILE_H_INCLUDED
void foo();
#endif
Avoid global variables when possible. If you must use them, declare global variables using extern and place the definition in a .cpp file instead.
file.h
extern int var[20];
file.cpp
int var[20];
When possible, use forward declarations. You can use forward declarations whenever you use only a reference or a pointer to a class and don't dereference that pointer.
useful.h
class Useful {};
other.h
// Forward-declare instead of #include
class Useful;
class Other
{
Useful* helper;
};
I don't think there really is a best practice; it depends on the situation. Something I might recommend is to group like objects into a namespace, then put all of the definitions in a single .h file. If the implementations are short, put them all in a single .cpp file. Here at my work we have a database access layer like this. There are roughly a couple dozen objects that are populated by stored procs. The code is still a major pain in the ass, but it's better than having two dozen .h and .cpp files that are all less than 500 lines. If you do this, comments that compartmentalize the object definitions become really important. You can easily get files longer than 10,000 lines, so you need something to break them up.
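Of course, use include guards. They stop a header from being processed twice within one source file, although a redefinition reported by the linker also needs the definition moved out of the header.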
You need to know the difference between a definition and a declaration, and which uses of an entity require the entity to be declared. Then you also need to learn the 'one definition rule' (ODR), which tells you when you're not allowed to have more than one definition in the program (and therefore the definition can't go in a header) and what things can be defined more than once as long as the definitions are identical (and therefore the definition can go in a header).
For example, those arrays you're declaring; since these are globally visible arrays the program can only contain one definition, and therefore the definition can't go in a header. Every part of the program that needs to access them simply needs to know their declaration. So instead of putting a definition in a header file and violating the ODR, you should have a C++ file that contains their definition and a header that contains declarations for them.
Code like this:
int foo[100];
both declares and defines the array foo. Put code like this in a C++ file. To only declare this array you do this:
extern int foo[100];
Put code like this in a header.
Class definitions, inline functions, and templates are all things that can be defined multiple times as long as the definitions are identical. You can put these definitions into headers, whereas regular functions and global variables may only be defined once, so you declare them in headers and then define them in implementation files.
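As a rough single-file sketch of that split: the names kPrimes, g_counts and square are made up for illustration, and the comments mark which file each part would live in. It assumes the tables that never change can be made constexpr, which gives them internal linkage, while anything mutable is only declared in the header.

```cpp
// ---- tables.h (hypothetical header) ----
#ifndef TABLES_H_INCLUDED
#define TABLES_H_INCLUDED

// A constant table. const/constexpr objects at namespace scope have
// internal linkage in C++, so every .cpp that includes this header gets
// its own private copy and the linker sees no conflict.
constexpr int kPrimes[5] = {2, 3, 5, 7, 11};

// A mutable global array: the header only declares it.
extern int g_counts[4];

// An inline function may be defined in the header itself.
inline int square(int x) {
    return x * x;
}

#endif

// ---- tables.cpp (hypothetical implementation file) ----
// The one and only definition of the mutable array in the whole program.
int g_counts[4] = {0, 0, 0, 0};
```

The price of the constexpr table is one copy per source file that uses it; if that matters, the extern declaration plus a single definition works for constant data too.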
I have a project with multiple header files and .cpp files.
All of the header files have include guards.
There is a file called Constants.h where I define some constants. Some of these with defines, some as constant variables.
There are more header-.cpp-file pairs with code in them. One of these does contain a class, the others don't.
When I include my files into my main file (an arduino sketch), I get a lot of linker errors, claiming there are multiple definitions of some variables.
I read that this mainly occurs when you include .c or .cpp files, which I don't do. All the .cpp files only include their appropriate header files.
I did manage to find multiple solution proposals:
1) inline:
With functions, inline can be used to get rid of this problem. However, this is not possible with variables.
2) anonymous namespace:
This is one of the solutions I used. I put anonymous namespaces around all the problematic definitions I had. It did work, however I do not understand why this works. Could anyone help me understand it?
3) moving definitions into .cpp files:
This is another approach I used sometimes, but it wasn't always possible since I needed some of my definitions in other code, not belonging to this header file or its code (which I do admit is bad design).
Could anyone explain to me where exactly the problem lies and why these approaches work?
Some of these with defines, some as constant variables.
In C const does not imply the same thing as it does in C++. If you have this:
const int foo = 3;
In a header, then any C++ translation unit that includes the header will have a static variable named foo (the const at namespace scope implies internal linkage). Moreover, foo can even be considered a constant expression by many C++ constructs.
Such is not the case in C. There foo is an object at file scope with external linkage. So you will have multiple definitions from C translation units.
A quick fix would be to alter the definitions into something like this:
static const int foo = 3;
This is redundant in C++ but required in C.
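To make the linkage point concrete, here is a small sketch of what such a constants header could look like. The names kMaxItems, kRetries and g_scratch are invented for the example. It also hints at why the anonymous namespace from the question works: everything inside it has internal linkage, so each translation unit gets its own private copy and the linker never sees two definitions of the same symbol.

```cpp
// ---- Constants.h (hypothetical contents) ----
#ifndef CONSTANTS_H_INCLUDED
#define CONSTANTS_H_INCLUDED

// C++: const at namespace scope implies internal linkage.
// C: this would be an external definition, repeated by every includer.
const int kMaxItems = 3;

// Explicit static: internal linkage in both C and C++.
static const int kRetries = 5;

// Unnamed namespace (C++ only): members have internal linkage,
// even when they are not const.
namespace {
int g_scratch = 0;
}

#endif
```

One caveat with the non-const variable: because every .cpp file gets its own g_scratch, a change made in one file is not visible in another. For a value that really has to be shared, the extern declaration plus one definition in a .cpp file is still the way to go.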
In addition to Story Teller's excellent explanation, to define global variables, use the following:
// module.h
#include "glo.h"
// glo.h
#ifndef EXTERN
# define EXTERN extern
#endif
EXTERN int myvar;
// main.c
#define EXTERN
#include "glo.h"
In main.c all variables will be defined (i.e. space is allocated for them); in all other .c files that include glo.h, the variables are only declared, so they are known to the compiler.
You shouldn't define any objects in header files; the definitions should be moved to .c/.cpp files.
In a header you may:
declare types such as classes, structs, typedefs, etc.
put forward declarations of functions (not classes)
put inline functions (or functions defined inside classes), including their bodies
add extern declarations.
put your macros.
a static definition creates a separate copy in every file that includes the header, so it is not recommended.
I have what seems a relatively simple question, but one that keeps defying my efforts to understand it.
I apologise if it is a simple question, but like many simple questions, I can't seem to find a solid explanation anywhere.
With the below code:
/*foo.c*/
#include "bar.h"
int main() {
return(my_function(1,2));
}
/*bar.h*/
int my_function(int,int);
/*bar.c*/
#include "bar.h" /*is this necessary!?*/
int my_function(int x, int y) {
return(x+y);
}
Simply, is the second inclusion necessary? I don't understand why I keep seeing headers included in both source files. Surely if the function is declared in "foo.c" by including "bar.h," it does not need to be declared a second time in another linked source file (especially the one which actually defines it)??? A friend tried to explain to me that it didn't really matter for functions, but it did for structs, something which still eludes me! Help!
Is it simply for clarity, so that programmers can see which functions are being used externally?
I just don't get it!
Thanks!
In this particular case, it's unnecessary for the reason you described. It might be useful in situations where you have a more complex set of functions that might all depend on each other. If you include the header at the top of the .cpp file, you have effectively forward-declared every single function and so you don't have to worry about making sure your function definitions are in a certain order.
I also find that it clearly shows that these function definitions correspond to those declarations. This makes it easier for the reader to find how translation units depend on each other. Of course, the names of the files might be sufficient, but some more complex projects do not have one-to-one relationship between .cpp files and .h files. Sometimes headers are broken up into multiple parts, or many implementation files will have their external functions declared in a single header (common for large modules).
Really, all inclusions are unnecessary. You can always, after all, just duplicate the declarations (or definitions, in the case of classes) across all of the files that require them. We use the preprocessor to simplify this task and reduce the amount of redundant code. It's easier to stick to a pattern of always including the corresponding header because it will always work, rather than have to check each file every time you edit them and determine if the inclusion is necessary or not.
The way the C language (and C++) is designed is that the compiler processes each .c file in isolation.
You typically launch your compiler (cl.exe or gcc, for example) for one of your c files, and this produces one object file (.o or .obj).
Once all your object files have been generated, you run the linker, passing it all the object files, and it will tie them together into an executable.
That's why every .c file needs to include the headers it depends on. When the compiler is processing it, it knows nothing about which other .c files you may have. All it knows is the contents of the .c file you point it to, as well as the headers it includes.
In your simplified example, including "bar.h" in "bar.c" is not necessary. But in the real world, in most cases it would be. If you have a class declaration in "bar.h", and "bar.c" has functions of this class, the inclusion is needed. If you have any other declaration which is used in "bar.c" (be it a constant, enum, etc.), again the include is needed. Because in the real world it is nearly always needed, the easy rule is: always include the header file in the corresponding source file.
If the header only declares global functions, and the source file only implements them (without calling any of them) then it's not strictly necessary. But that's not usually the case; in a large program, you rarely want global functions.
If the header defines a class, then you'll need to include it in the source file in order to define member functions:
void Thing::function(int x) {
//^^^^^^^ needs class definition
}
If the header declares functions in a namespace, then it's a good idea to put the definitions outside the namespace:
void ns::function(int x) {
//^^^^ needs previous declaration
}
This will give a nice compile-time error if the parameter types don't match a previous declaration - for which you'd need to include the header. Defining the function inside its namespace
namespace ns {
void function(int x) {
// ...
}
}
will silently declare a new overload if you get the parameter types wrong.
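A minimal sketch of the qualified-definition pattern, with a made-up function ns::area standing in for the real one. The comments mark which part would sit in the header and which in the source file.

```cpp
// ---- shapes.h (hypothetical header) ----
namespace ns {
int area(int width, int height);   // declaration only
}

// ---- shapes.cpp (hypothetical source file, would #include "shapes.h") ----
// Qualified definition: the compiler looks for a matching declaration
// inside ns. Changing a parameter type here (say, int to long) would be
// a compile-time error rather than a silently added overload.
int ns::area(int width, int height) {
    return width * height;
}
```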
The simple rule is this (considering foo is a member function of some class):
So, if some header file is declaring a function, say:
//foo.h
void foo (int x);
The compiler needs to see this declaration wherever you define the function (to make sure your definition is in line with the declaration) and wherever you call it (to make sure you have called the function with the correct number and types of arguments).
That means you have to include foo.h everywhere you call that function and where you provide its definition.
Also, if foo is a global function (not inside any namespace), then there is no need to include foo.h in the implementation file.
When we design classes in Java, Vala, or C# we put the definition and declaration in the same source file. But in C++ it is traditionally preferred to separate the definition and declaration in two or more files.
What happens if I just use a header file and put everything into it, like Java?
Is there a performance penalty or something?
The answer depends on what kind of class you're creating.
C++'s compilation model dates back to the days of C, and so its method of importing data from one source file into another is comparatively primitive. The #include directive literally copies the contents of the file you're including into the source file, then treats the result as though it was the file you had written all along. You need to be careful about this because of a C++ policy called the one definition rule (ODR) which states, unsurprisingly, that every function and class should have at most one definition. This means that if you declare a class somewhere, all of that class's member functions should be either not defined at all or defined exactly once in exactly one file. There are some exceptions (I'll get to them in a minute), but for now just treat this rule as if it's a hard-and-fast, no-exceptions rule.
If you take a non-template class and put both the class definition and the implementation into a header file, you might run into trouble with the one definition rule. In particular, suppose that I have two different .cpp files that I compile, both of which #include your header containing both the implementation and the interface. In this case, if I try linking those two files together, the linker will find that each one contains a copy of the implementation code for the class's member functions. At this point, the linker will report an error because you have violated the one definition rule: there are two different implementations of all the class's member functions.
To prevent this, C++ programmers typically split classes up into a header file which contains the class declaration, along with the declarations of its member functions, without the implementations of those functions. The implementations are then put into a separate .cpp file which can be compiled and linked separately. This allows your code to avoid running into trouble with the ODR. Here's how. First, whenever you #include the class header file into multiple different .cpp files, each of them just gets a copy of the declarations of the member functions, not their definitions, and so none of your class's clients will end up with the definitions. This means that any number of clients can #include your header file without running into trouble at link-time. Since your own .cpp file with the implementation is the sole file that contains the implementations of the member functions, at link time you can merge it with any number of other client object files without a hassle. This is the main reason that you split the .h and .cpp files apart.
Of course, the ODR has a few exceptions. The first of these comes up with template functions and classes. The ODR explicitly states that you can have multiple different definitions for the same template class or function, provided that they're all equivalent. This is primarily to make it easier to compile templates - each C++ file can instantiate the same template without colliding with any other files. For this reason, and a few other technical reasons, class templates tend to just have a .h file without a matching .cpp file. Any number of clients can #include the file without trouble.
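As a small illustration of the template exception, here is a sketch of a header that holds a complete function template. The name largest is hypothetical; any number of .cpp files could include a header like this.

```cpp
// ---- largest.h (hypothetical header) ----
#ifndef LARGEST_H_INCLUDED
#define LARGEST_H_INCLUDED

// Function template defined entirely in the header: each .cpp file that
// uses it instantiates the versions it needs, and identical
// instantiations from different files are merged at link time.
template <typename T>
T largest(T a, T b) {
    return (a < b) ? b : a;
}

#endif
```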
The other major exception to the ODR involves inline functions. The spec specifically allows an inline function to be defined in more than one translation unit, provided the definitions are identical, so if you have a header file with an implementation of a class member function that's marked inline, that's perfectly fine. Any number of files can #include this file without breaking the ODR. Interestingly, any member function that's declared and defined in the body of a class is implicitly inline, so if you have a header like this:
#ifndef Include_Guard
#define Include_Guard
class MyClass {
public:
void DoSomething() {
/* ... code goes here ... */
}
};
#endif
Then you're not risking breaking the ODR. If you rewrite this as
#ifndef Include_Guard
#define Include_Guard
class MyClass {
public:
void DoSomething();
};
void MyClass::DoSomething() {
/* ... code goes here ... */
}
#endif
then you would be breaking the ODR, since the member function isn't marked inline and if multiple clients #include this file there will be multiple definitions of MyClass::DoSomething.
So to summarize - you should probably split up your classes into a .h/.cpp pair to avoid breaking the ODR. However, if you're writing a class template, you don't need the .cpp file (and probably shouldn't have one at all), and if you're okay marking every single member function of your class inline you can also avoid the .cpp file.
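For the "mark everything inline" route, a header-only class might look roughly like this. Counter is a made-up class, not the MyClass from the example above; it has one member defined in the class body and one defined outside it.

```cpp
// ---- counter.h (hypothetical header-only class) ----
#ifndef COUNTER_H_INCLUDED
#define COUNTER_H_INCLUDED

class Counter {
public:
    // Defined in the class body: implicitly inline.
    int value() const { return count_; }

    // Only declared here; defined below, still inside the header.
    void increment();

private:
    int count_ = 0;
};

// Defined outside the class body, so the explicit `inline` is what
// allows several .cpp files to include this header without a
// multiple-definition error at link time.
inline void Counter::increment() {
    ++count_;
}

#endif
```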
The drawback of putting definitions in header files is as follows:
Header file A - contains the definition of methodA()
Header file B - includes header file A.
Now let us say you change the definition of methodA. You would need to recompile every source file that includes A or B, because header file A is included in B.
The biggest difference is that every function is declared as an inline function. Generally your compiler will be smart enough that this won't be a problem, but worst case scenario it will cause page faults on a regular basis and make your code embarrassingly slow. Usually the code is separated for design reasons, and not for performance.
In general, it is good practice to separate implementation from headers. However, there are exceptions in cases like templates, where the implementation goes in the header itself.
Two particular problems with putting everything in the header:
Compile times will be increased, sometimes greatly. C++ compile times are long enough that that's not something you want.
If you have circular dependencies in the implementation, keeping everything in headers is difficult to impossible. eg:
header1.h
struct C1
{
void f();
void g();
};
header2.h
struct C2
{
void f();
void g();
};
impl1.cpp
#include "header1.h"
#include "header2.h"
void C1::f()
{
C2 c2;
c2.f();
}
impl2.cpp
#include "header2.h"
#include "header1.h"
void C2::g()
{
C1 c1;
c1.g();
}
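A compact single-file sketch of why the split helps, using the made-up names Ping and Pong. Both classes are declared first, and the member functions are defined only after both class definitions are visible, which is the ordering the separate .cpp files give you for free.

```cpp
// Forward declaration so Ping can mention Pong before Pong is defined.
struct Pong;

struct Ping {
    int bounce(Pong& other, int remaining);
};

struct Pong {
    int bounce(Ping& other, int remaining);
};

// By this point both classes are complete, so each member function can
// call into the other class. Each call counts one bounce and hands the
// ball back until `remaining` reaches zero.
int Ping::bounce(Pong& other, int remaining) {
    return remaining == 0 ? 0 : 1 + other.bounce(*this, remaining - 1);
}

int Pong::bounce(Ping& other, int remaining) {
    return remaining == 0 ? 0 : 1 + other.bounce(*this, remaining - 1);
}
```

If the two bodies lived inside the class definitions instead, Ping::bounce would need the full definition of Pong before Pong exists, and no ordering of the two headers could satisfy both classes.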
http://www.learncpp.com/cpp-tutorial/19-header-files/
It mentions the following as another solution to "forward declaration":
A header file only has to be written once, and it can be included in as many files as needed. This also helps with maintenance by minimizing the number of changes that need to be made if a function prototype ever changes (eg. by adding a new parameter).
But can't this also be done with a "forward declaration"? Since we are defining the function int add(int x, int y), for example, in "add.cpp", and using this function in "main.cpp" by typing:
int add(int x, int y);
?
Thanks.
That is certainly possible. But for a realistically-sized program, there will be a large number of functions that a large number of other files will need to declare. If you put a forward declaration in every file that needs to access another function, you have a multitude of problems:
You've just copy-pasted the same declaration into many different files. If you ever change the function signature, you have to change every place you've pasted its forward declaration.
The forward declaration itself does not naturally tell you what file the actual function is defined in. If you use a sane method of organizing your header files and your source files (for instance, every function defined in a .cpp file is declared in a .h file with the same name), then the place that the function is defined is implied by the place that it is declared.
Your code will be less readable to other programmers, who are very used to using header files for everything (for good reason), even if all you need from a header is one specific function and you could easily forward-declare it yourself.
Header files contain forward declarations - that's what they do. The issue they resolve is when you have a more complex project with multiple source code files.
You could have a library of functions, e.g. matrix.c for matrix operations. Without header files you would have to copy the forward declarations for all the matrix.c functions into all the other source files. You would also have to keep all those copies up to date with any changes to matrix.c.
If you ever change the function in matrix.c, but forget to change its declaration in another file you will not get a compile error. You will probably not get a linker error either. All you will get is a crash or other random behaviour once you run your program.
Having the declarations in a single file, typically matrix.h, that will be used everywhere else removes all these issues.
You can use forward declarations, but that doesn't scale well and it's unwieldy if you're using somebody else's code or library.
In general, the header file defines the interface to the code.
Also, think what happens if the function requires some user-defined type. Are you going to forward declare that too? That type may regularly change its implementation (keeping its public interface the same), which would result in having to regularly change all the forward declarations.
The header file solution is far more maintainable (less error prone) and makes it far easier to determine exactly what code is being used.
In C and C++ one essentially puts all the forward and/or external declarations into the header. This then provides a convenient way of including them in the various source files without having to write them out manually.
In your case, if you have add defined in add.cpp, you can just provide the external declaration in main.cpp and everything is cool. The header file is there to help you when you have a large number of files that need add declared and don't want to do so for each one.
int add(int x, int y); // forward declaration using function prototype
Can you explain "forward declaration" further? What is the problem if we use it in the main() function?
It's the same as #include "add.h". As you may know, the preprocessor expands the file you name in #include into the .cpp file where you write the #include directive. That means that if you write #include "add.h", you get the same thing; it is as if you had written the forward declaration yourself.
I'm assuming that add.h has this line:
int add(int x, int y);
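A tiny sketch of what the compiler ends up seeing either way. The helper sum_three is made up for the example; the first line is what #include "add.h" would paste in, under the assumption above about its contents.

```cpp
// What add.h would contribute after preprocessing: a forward declaration.
int add(int x, int y);

// A caller can use add() as soon as the declaration has been seen,
// even though the body has not appeared yet.
int sum_three(int a, int b, int c) {
    return add(add(a, b), c);
}

// The definition, which in the real project would live in add.cpp.
int add(int x, int y) {
    return x + y;
}
```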
What are forward declarations in C++?
When dividing your code up into multiple files just what exactly should go into an .h file and what should go into a .cpp file?
Header files (.h) are designed to provide the information that will be needed in multiple files. Things like class declarations, function prototypes, and enumerations typically go in header files. In a word, "definitions".
Code files (.cpp) are designed to provide the implementation information that only needs to be known in one file. In general, function bodies, and internal variables that should/will never be accessed by other modules, are what belong in .cpp files. In a word, "implementations".
The simplest question to ask yourself to determine what belongs where is "if I change this, will I have to change code in other files to make things compile again?" If the answer is "yes" it probably belongs in the header file; if the answer is "no" it probably belongs in the code file.
Fact is, in C++, this is somewhat more complicated than the C header/source organization.
What does the compiler see?
The compiler sees one big source (.cpp) file with its headers properly included. The source file is the compilation unit that will be compiled into an object file.
So, why are headers necessary?
Because one compilation unit could need information about an implementation in another compilation unit. So one can write for example the implementation of a function in one source, and write the declaration of this function in another source needing to use it.
In this case, there are two copies of the same information. Which is evil...
The solution is to share some details. While the implementation should remain in the Source, the declaration of shared symbols, like functions, or definition of structures, classes, enums, etc., could need to be shared.
Headers are used to put those shared details.
Move to the header the declarations of what needs to be shared between multiple sources
Nothing more?
In C++, there are some other things that could be put in the header because they, too, need to be shared:
inline code
templates
constants (usually those you want to use inside switches...)
Move to the header EVERYTHING that needs to be shared, including shared implementations
Does it then mean that there could be sources inside the headers?
Yes. In fact, there are a lot of different things that could be inside a "header" (i.e. shared between sources).
Forward declarations
declarations/definition of functions/structs/classes/templates
implementation of inline and templated code
It becomes complicated, and in some cases (circular dependencies between symbols), impossible to keep it in one header.
Headers can be broken down into three parts
This means that, in an extreme case, you could have:
a forward declaration header
a declaration/definition header
an implementation header
an implementation source
Let's imagine we have a templated MyObject. We could have:
// - - - - MyObject_forward.hpp - - - -
// This header is included by the code which need to know MyObject
// does exist, but nothing more.
template<typename T>
class MyObject ;
// - - - - MyObject_declaration.hpp - - - -
// This header is included by the code which need to know how
// MyObject is defined, but nothing more.
#include <MyObject_forward.hpp>
template<typename T>
class MyObject
{
public :
MyObject() ;
// Etc.
} ;
void doSomething() ;
// - - - - MyObject_implementation.hpp - - - -
// This header is included by the code which need to see
// the implementation of the methods/functions of MyObject,
// but nothing more.
#include <MyObject_declaration.hpp>
template<typename T>
MyObject<T>::MyObject()
{
doSomething() ;
}
// etc.
// - - - - MyObject_source.cpp - - - -
// This source will have implementation that does not need to
// be shared, which, for templated code, usually means nothing...
#include <MyObject_implementation.hpp>
void doSomething()
{
// etc.
}
// etc.
Wow!
In "real life", it is usually less complicated. Most code will have only a simple header/source organisation, with some inlined code in the source.
But in other cases (templated objects knowing each other), I had to have separate declaration and implementation headers for each object, with an empty source including those headers just to help me see some compilation errors.
Another reason to break down headers into separate headers could be to speed up the compilation, limiting the quantity of symbols parsed to the strict minimum, and avoiding unnecessary recompilation of a source that cares only about the forward declaration when an inline method implementation changes.
Conclusion
You should make your code organization both as simple as possible, and as modular as possible. Put as much as possible in the source file. Only expose in headers what needs to be shared.
But the day you have circular dependencies between templated objects, don't be surprised if your code organization becomes somewhat more "interesting" than the plain header/source organization...
^_^
in addition to all other answers, i will tell you what you DON'T place in a header file:
using directives and declarations (the most common being using namespace std;) should not appear in a header file, because they pollute the namespace of every source file that includes it.
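A sketch of the alternative: spell out the std:: qualification inside the header. The header name and the function split_words are invented for the example.

```cpp
// ---- words.h (hypothetical header) ----
#ifndef WORDS_H_INCLUDED
#define WORDS_H_INCLUDED

#include <sstream>
#include <string>
#include <vector>

// No `using namespace std;` here: every name is qualified with std::,
// so files that include this header keep control of their own namespace.
inline std::vector<std::string> split_words(const std::string& text) {
    std::vector<std::string> words;
    std::istringstream in(text);
    std::string word;
    while (in >> word) {          // reads whitespace-separated tokens
        words.push_back(word);
    }
    return words;
}

#endif
```

A .cpp file that wants the shorter names can still write its own using declaration after its includes, where it affects only that file.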
What compiles into nothing (zero binary footprint) goes into header file.
Variables do not compile into nothing, but type declarations do (coz they only describe how variables behave).
functions do not, but inline functions do (or macros), because they produce code only where called.
templates are not code, they are only a recipe for creating code. so they also go in h files.
In general, you put declarations in the header file and definitions in the implementation (.cpp) file. The exception to this is templates, where the definition must also go in the header.
This question and ones similar to it have been asked frequently on SO - see Why have header files and .cpp files in C++? and C++ Header Files, Code Separation for example.
Mainly header file contain class skeleton or declaration (does not change frequently)
and cpp file contains class implementation (changes frequently).
Header (.h)
Macros and includes needed for the interfaces (as few as possible)
The declaration of the functions and classes
Documentation of the interface
Definition of inline functions/methods, if any
extern to global variables (if any)
Body (.cpp)
Rest of macros and includes
Include the header of the module
Definition of functions and methods
Global variables (if any)
As a rule of thumb, you put the "shared" part of the module on the .h (the part that other modules needs to be able to see) and the "not shared" part on the .cpp
PS: Yes, I've included global variables. I've used them sometimes, and it's important not to define them in the headers, or you'll get a lot of modules, each defining its own variable.
Your class and function declarations plus the documentation, and the definitions for inline functions/methods (although some prefer to put them in separate .inl files).
the header file (.h) should be for declarations of classes, structs and their methods, prototypes, etc. The implementation of those objects goes in the .cpp.
in .h
class Foo {
int j;
Foo();
Foo(int);
void DoSomething();
};
I'd expect to see:
declarations
comments
definitions marked inline
templates
the real answer though is what not to put in:
definitions (can lead to things being multiply defined)
using declarations/directives (forces them on anyone including your header, can cause nameclashes)
The header defines something but doesn't tell anything about the implementation (excluding templates in this "metaphor").
With that said, you need to divide "definitions" into sub-groups; there are, in this case, two types of definitions.
You define the "layout" of your structure, telling only as much as is needed by the surrounding usage groups.
The definitions of a variable, function and a class.
Now, I am of course talking about the first subgroup.
The header is there to define the layout of your structure in order to help the rest of the software use the implementation. You might want to see it as an "abstraction" of your implementation, which is vaguely put, but I think it suits quite well in this case.
As previous posters have said and shown, you declare private and public usage areas and their headers; this also includes private and public variables. Now, I don't want to go into design of the code here, but you might want to consider what you put in your headers, since that is the layer between the end user and the implementation.
Header files - shouldn't change too often during development -> you should think them through and write them once (in the ideal case)
Source files - changes during implementation