How would I handle having a large amount of header files? - c++

As I have worked on my program, I have gotten to a point where I have about 30+ header files.
When I started, I had an all-in-one header that linked to all the header files I made up to date, as well as the libraries I needed for the project.
With this format, I had to take 5+ minutes in order to recompile the ENTIRE program as a whole just by changing a variable name or something equally as small.
My previous plan was to make a "Tier List" of all my headers, where the most basic ones are at the bottom, the next level headers required these, so on and so forth until I got to the main.cpp file. (Tier 2 needed the highest of Tier 1 and so on). After this idea, I eventually deviated into making folder groups, and having a Folder.hpp file for if I needed everything in that directory.
Now when compiling, I am immediately overwhelmed with syntax errors saying that nothing is properly included, when the include command is sitting right in front of my face.
Here are some of the headers, as per one of the syntax errors:
===Character.hpp===
#include "../../Pre.hpp"
#include "../Assets/Folder.hpp"
#include "../Control/Folder.hpp"
#include "../WorldObj.hpp"
===Control/Folder.hpp===
#include "CommonStat.hpp"
#include "DeathHandler.hpp"
#include "EnergySys.hpp"
#include "FactionHandler.hpp"
#include "Interaction.hpp"
#include "Warper.hpp"
#include "WeaponPos.hpp"
===WeaponPos.hpp===
struct WeaponPos{};
===Build Message===
Build Message: error: 'WeaponPos' does not name a type
===NOTE===
The #include lines are the only (seemingly) important portions of the code. the 'struct WeaponPos' is only to show where it is located in terms of the syntax error.
WeaponPos myWpnPos is a variable of Character Forgot to inlcude that note.
What would be a good way to properly and effectively organize the header files so they are all accountable when compiling a source file?
I am using Code::Blocks and MinGW GCC 4.7 on Windows 7 64-bit.

This by no means solves your entire problem, but it is worth knowing that it exists and it may actually help in your future design decisions.
One of the ways of handling long time to recompile caused by a small change is PIMPL idion - general idea is to have a separate struct with all private members of the class. The struct is forward declared in header file and defined in cpp file of the class. Therefore, when you add/rename a field in your private struct, header doesn't change at all, thus other headers linking to it don't have to be recompiled.
Much better description with examples
Another important point is not to create levels of included headers - one should rather include headers only if they are really needed. Good example of this is forward declaration. If you have Class1 as a pointer-member in Class2, you don't have to actually include Class1.h in Class2.h - it is usually enough to forward declare Class1 and then include the header in the Class2.cpp file - which reduces recompilation greatly.
To sum up:
use PIMPL
forward declare and include in cpp when you can
do not include when you don't have to

Related

How do I prevent include breakage

I have a code that contains huge number of cpp / header files. My problem now is, that because many include each other, I occasionally get into a situation that my code doesn't compile, unless I reorder the #include directives in random files, which is now necessary basically with creation of any other header file.
This is really very annoying; is there any tip how should I write my c++ code in order to prevent complications with #include? I would prefer to split my source code to as many files as possible so that cooperation with other programmers (using git or svn) is easier (more files == lower number of edit conflicts).
One of things that help me now is forward declaration, when I declare the classes from other headers into other files. That helps sometimes, but doesn't resolve all issues; sometimes I just need to change order of #includes in random header files or merge multiple files.
Not a panacea, but the following guideline helps me a lot.
Assuming your code is composed of files like MyClassXyz.cpp with corresponding MyClassXyz.h, one class per source file, every cpp-file should include its corresponding header file first. That is, MyClassXyz.cpp must start with the following line:
// (possibly after comments)
#include "MyClassXyz.h"
This ensures that MyClassXyz.h includes all header files (or forward declarations) necessary for its compilation.
I often see code that uses an opposite convention (#includeing most general header files first), for example, MyClassXyz.cpp starts with
#include <vector>
#include <iosfwd>
#include "blah.h"
#include "mytypes.h"
#include "MyClassXyz.h"
And MyClassXyz.h "goes straight to the point" using stuff defined in the additional headers:
#pragma once
// "#include <vector>" missing - a hidden error!
// "#include <iosfwd>" missing - a hidden error!
class MyClassXyz
{
std::vector<int> v;
friend std::ostream& operator<<(...);
...
}
While this compiles OK, it gives enormous headaches of the type you describe, when trying to use the class MyClassXyz in some other source file.

Best practice for including from include files

I was wondering if there is some pro and contra having include statements directly in the include files as opposed to have them in the source file.
Personally I like to have my includes "clean" so, when I include them in some c/cpp file I don't have to hunt down every possible header required because the include file doesn't take care of it itself. On the other hand, if I have the includes in the include files compile time might get bigger, because even with the include guards, the files have to be parsed first. Is this just a matter of taste, or are there any pros/cons over the other?
What I mean is:
sample.h
#ifdef ...
#include "my_needed_file.h"
#include ...
class myclass
{
}
#endif
sample.c
#include "sample.h"
my code goes here
Versus:
sample.h
#ifdef ...
class myclass
{
}
#endif
sample.c
#include "my_needed_file.h"
#include ...
#include "sample.h"
my code goes here
There's not really any standard best-practice, but for most accounts, you should include what you really need in the header, and forward-declare what you can.
If an implementation file needs something not required by the header explicitly, then that implementation file should include it itself.
The language makes no requirements, but the almost universally
accepted coding rule is that all headers must be self
sufficient; a source file which consists of a single statement
including the include should compile without errors. The usual
way of verifying this is for the implementation file to include
its header before anything else.
And the compiler only has to read each include once. If it
can determine with certainty that it has already read the file,
and on reading it, it detects the include guard pattern, it has
no need to reread the file; it just checks if the controling
preprocessor token is (still) defined. (There are
configurations where it is impossible for the compiler to detect
whether the included file is the same as an earlier included
file. In which case, it does have to read the file again, and
reparse it. Such cases are fairly rare, however.)
A header file is supposed to be treated like an API. Let us say you are writing a library for a client, you will provide them a header file for including in their code, and a compiled binary library for linking.
In such scenario, adding a '#include' directive in your header file will create a lot of problems for your client as well as you, because now you will have to provide unnecessary header files just to get stuff compiling. Forward declaring as much as possible enables cleaner API. It also enables your client to implement their own functions over your header if they want.
If you are sure that your header is never going to be used outside your current project, then either way is not a problem. Compilation time is also not a problem if you are using include guards, which you should have been using anyway.
Having more (unwanted) includes in headers means having more number of (unwanted) symbols visible at the interface level. This may create a hell lot of havocs, might lead to symbol collisions and bloated interface
On the other hand, if I have the includes in the include files compile time might get bigger, because even with the include guards
If your compiler doesn't remember which files have include guards and avoid re-opening and re-tokenising the file then get a better compiler. Most modern compilers have been doing this for many years, so there's no cost to including the same file multiple times (as long as it has include guards). See e.g. http://gcc.gnu.org/onlinedocs/cpp/Once_002dOnly-Headers.html
Headers should be self-sufficient and include/declare what they need. Expecting users of your header to include its dependencies is bad practice and a great way to make users hate you.
If my_needed_file.h is needed before sample.h (because sample.h requires declarations/definitions from it) then it should be included in sample.h, no question. If it's not needed in sample.h and only needed in sample.c then only include it there, and my preference is to include it after sample.h, that way if sample.h is missing any headers it needs then you'll know about it sooner:
// sample.c
#include "sample.h"
#include "my_needed_file.h"
#include ...
#include <std_header>
// ...
If you use this #include order then it forces you to make sample.h self-sufficient, which ensures you don't cause problems and annoyances for other users of the header.
I think second approach is a better one just because of following reason.
when you have a function template in your header file.
class myclass
{
template<class T>
void method(T& a)
{
...
}
}
And you don't want to use it in the source file for myclass.cxx. But you want to use it in xyz.cxx, if you go with your first approach then you will end up in including all files that are required for myclass.cxx, which is of no use for xyz.cxx.
That is all what I think of now the difference. So I would say one should go with second approach as it makes your code each to maintain in future.

C++ class redefinition error - Help me understand headers and linking

I started writing a simple interpreter in C++ with a class structure that I will describe below, but I quit and rewrote the thing in Java because headers were giving me a hard time. Here's the basic structure that is apparently not allowed in C++:
main.cpp contains the main function and includes a header for a class we can call printer.h (whose single void method is implemented in printer.cpp). Now imagine two other classes which are identical. Both want to call Printer::write_something();, so I included printer.h in each. So here's my first question: Why can I #include <iostream> a million times, even one after the other, but I can only include my header once? (Well, I think I could probably do the same thing with mine, as long as it's in the same file. But I may be wrong.) I understand the difference between a declaration and an implementation/definition, but that code gives me a class redefinition error. I don't see why. And here's the thing that blows my mind (and probably shows you why I don't understand any of this): I can't just include printer.h at the top of main.cpp and use the class from my other two classes. I know I can include printer.h in one of the two classes (headers) with no trouble, but I don't see why this is any different than just including it before I include the class in main.cpp (as doing so gives me a class not found error).
When I got fed up, I thought about moving to C since the OOP I was using was quite forced anyway, but I would run into the same problem unless I wrote everything in one file. It's frustrating to know C++ but be unable to use it correctly because of compilation issues.
I would really appreciate it if you could clear this up for me. Thanks!
Why can I #include a million times, even one after the other, but I can only include my header once?
It is probably because your header doesn't have an include guard.
// printer.h file
#ifndef PRINTER_H_
#define PRINTER_H_
// printer.h code goes here
#endif
Note that it is best practice to chose longer names for the include guard defines, to minimise the chance that two different headers might have the same one.
Most header files should be wrapped in an include guard:
#ifndef MY_UNIQUE_INCLUDE_NAME_H
#define MY_UNIQUE_INCLUDE_NAME_H
// All content here.
#endif
This way, the compiler will only see the header's contents once per translation unit.
C/C++ compilation is divided in compilation/translation units to generate object files. (.o, .obj)
see here the definition of translation unit
An #include directive in a C/C++ file results in the direct equivalent of a simple recursive copy-paste in the same file. You can try it as an experiment.
So if the same translation unit includes the same header twice the compiler sees that some entities are being defined multiple times, as it would happen if you write them in the same file. The error output would be exactly the same.
There is no built-in protection in the language that prevents you to do multiple inclusions, so you have to resort to write the include guard or specific #pragma boilerplate for each C/C++ header.

Does it matter whether I use standard library preprocessor includes in my header file vs. the corresponding implementation file?

If I have a project with
main .cpp
Knife .h and .cpp
Cucumber .h and .cpp
and I want to use Knife's members in Cucumber, does it matter whether I use the following:
#include "Knife.h"
#include <iostream>
using namespace std;
in Cucumber.h or Cucumber.cpp (assume that Cucumber.cpp already has an include for Cucumber.h)?
My recommendation is to minimize the number of files included in the header files.
So, if I have the choice, I prefer to include in the source file.
When you modify a header file, all the files that include this header file must be recompiled.
So, if cucumber.h includes knife.h, and main.cpp includes cucumber.h, and you modify knife.h, all the files will be recompiled (cucumber.cpp, knife.cpp and main.cpp).
If cucumber.cpp includes knife.h and main.cpp includes cucumber.h, and you modify knife.h, only cucumber.cpp and knife.cppwill be recompiled, so your compilation time is reduced.
If you need to use knife in cucumber you can proceed like this:
// Cucumber.hpp
#ifndef CUCUMBER_HPP
#define CUCUMBER_HPP
class Knife;
class Cucumber
{
public :
///...
private :
Knife* myKnife
}
#endif
//Cucumber.cpp
#include "Cucumber.hpp"
#include "Knife.hpp
// .. your code here
This "trick" is called "forward declaration". That is a well-known trick of C++ developers, who want to minimize compilation time.
yes, you should put it in the .cpp file.
you will have faster builds, fewer dependencies, and fewer artifacts and noise -- in the case of iostream, GCC declares:
// For construction of filebuffers for cout, cin, cerr, clog et. al.
static ios_base::Init __ioinit;
within namespace std. that declaration is going to produce a ton of (redundant) static data which must be constructed at startup, if Knife.h is included by many files. so there are a lot of wins, the only loss is that you must explicitly include the files you actually need -- every place you need them (which is also a very good thing in some regards).
The advice I've seen advocated was to only include the minimum necessary for a given source file, and to keep as many includes out of headers as possible. It potentially reduces the dependencies when it comes time to compile.
Another thing to note is your namespace usage. You definitely want to be careful about having that sort of thing in a header. It could change namespace usage in files you didn't plan on.
As a general rule, try to add your includes into the implementation file rather than the header for the following reasons:
Reduces potential unnecessary inclusion and keeps it to only where needed.
Reduces compile time (quite significantly in some cases I've seen.) Avoids over exposing implementation details.
Reduces risk of circular dependencies appearing.
Avoids locking users of your headers into having to indirectly include files that they may not want / need to.
If you reference a class by pointer or reference only in your header then it's not necessary to include the appropriate header; a class declaration will suffice.
Probably there are quite a few other reasons - the above are probably the most important / obvious ones.

About headers, forwards and how to organize a lot of includes

I have 3 classes (it could be 300) , each one with its own header and implementation.
I'd like to write an 'elegant' way to organize the way I load of any class needed by every class of the three. Maybe this example helps...
I have : class1 class2 class3
Every header has:
#ifndef CLASS#_H
#define CLASS#_H
#define FORWARD_STYLE
#include "general.h"
#endif
Every implementation has:
#define DIRECT_STYLE
#include "general.h"
OK
I'm going to write a 'general.h' file in which I'd have :
#ifndef DIRECT_STYLE
#ifndef CLASS1_H
#include "class1.h"
#endif
#ifndef CLASS2_H
#include "class2.h"
#endif
#ifndef CLASS3_H
#include "class3.h"
#endif
#endif
#ifndef FORWARD_STYLE
class Class1;
class Class2;
class Class3;
#endif
// a lot of other elements needed
#include <string.h>
#include <stdio.h"
....
#include <vector.h"
( all the class I need now and in the future )
This is a good structure ? Or I'm doing some idiot thing ?
My goal is having one unique 'general.h' file to write all the elemenst I need...
Are this to work fine ?
Thanks
The basic rules to follow are:
Let each of your source file include all the header files it needs for getting compiled in a standalone manner. Avoid letting the header files include in the source file indirectly through other files.
If you have constructs which will be needed across most source files then put them in a common header and include the header in Only in those source files which need it.
Use Forward declarations wherever you can.There are several restrictions of when you can get away using them,read this to know more about those scenarios.
Overall it is a good idea to avoid including unnecessary code in source files through a common header because it just results in code bloat, so try and keep it to a minimum. Including a header just actually copy pastes the entire header to your source file and Including unnecessary files has several disadvantages, namely:
Increase in compilation time
Pollution of global namespace.
Potential clash of preprocessor names.
Increase in Binary size(in some cases though not always)
This might like a fine idea now, but won't scale and should be avoided. Your general.h file will include a vast amount of files, and thus all files that include it will (a) take ages to compile or not compile at all due to memory restrictions and (b) will have to be re-compiled every time anything changes.
Directly include the headers you need in each file, and define a few forward declaration files, and you should be fine.
The #define in a header will probably be ok, but it can propagate through lots of sources and potentially cause problems. More seriously, any time general.h or any of its includes change your entire project rebuilds. For small projects this isn't an issue, for larger projects it will result in unacceptable build times.
Instead, I utilize a few guidelines:
In headers, forward declare what you can, either explicitly or with #include "blah_fwd.h" as seen in the standard library.
All headers should be able to compile on their own and not rely on the source file including something earlier. This can be easily detected by all source files always including their own header first.
In source files, include what you need (usually you can't get away with forward declarations in source files).
Also note to never use using in headers because it will pollute the global namespace.
If this seems like a lot of work, unfortunately that's because it is. This is a system inherited from C and requires some level of programmer maintenance. If you want to be able to decide at a high level what's used by your project and let the compiler/runtime figure it out, perhaps C++ isn't the right language for your project.