Managing forward declarations - C++

It's well known that using forward declarations is preferable to using #includes in header files, but what's the best way to manage forward declarations?
For a while, I was manually adding to each header file the forward declarations that were needed by that header file. However, I ended up with a bunch of header files repeating the same half-dozen or so forward declarations, which seems redundant, and maintaining these repeated lists got to be a bit tedious.
Forward-declaring typedefs (e.g., struct SensorRecordId; typedef std::vector<SensorRecordId> SensorRecordIdList;) is also a bit much to duplicate across multiple header files.
So then I made a ProjectForwards.h file that contains all of my forward declarations and included that wherever it was needed. At first, this seemed like a good idea - much less redundancy, and much easier maintenance of typedefs. But now, as a result of using ProjectForwards.h so heavily, whenever I add a new class to it, I have to rebuild the world, which slows development.
So what's the best way to manage forward declarations? Should I bite the bullet and repeat individual forward declarations across multiple subsystems? Continue with the ProjectForwards.h approach? Try to split ProjectForwards.h into several SubsystemForwards.h files? Some other solution I'm overlooking?

It sounds like these classes are fairly common to much of your project. You might try some of these:
Do your best to break apart ProjectForwards.h into several files as you suggested. Make sure each subsystem only gets the declarations it truly needs. If nothing else, that process will force you to think about the coupling between your subsystems and you might find ways to reduce it. These are all good steps toward avoiding over-compilation.
Mimic <iosfwd>. Have each common class or module provide its own forward-include header that just provides the class names and any convenience typedefs. Then you can #include that everywhere. Yes, you'll repeat the list a lot, but think about it this way: nobody complains about #including <vector>, <string>, and <map> in six different places in their code. A sketch of such a header follows this list.
Use Pimpl more often. This will have a similar effect to my previous suggestion but will require more work on your part. If your interfaces are stable, then you can safely provide the typedefs in those headers and #include them directly.
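Here is a minimal sketch of such a forward-include header, reusing the SensorRecordId names from the question; the file name and guard macro are hypothetical:
// SensorRecordFwd.h - a forward-include header in the style of <iosfwd>
#ifndef SENSOR_RECORD_FWD_H
#define SENSOR_RECORD_FWD_H
#include <vector> // cheap and stable; needed only for the convenience typedef
struct SensorRecordId; // forward declaration; the real definition lives elsewhere
typedef std::vector<SensorRecordId> SensorRecordIdList; // declaring the typedef does not instantiate the vector
#endif
Clients that only pass a SensorRecordIdList around by pointer or reference can include this tiny header instead of the full definition.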

In general:
Have a forwards file for users of your module. This will only declare those classes that appear as part of the API.
If you have commonly used forwards in your implementation you can have an implementation-only based forwards file.
You probably don't need a forward declaration for every class you use.

I've never seen a "header of forward declares" that was actually useful (no one uses it), didn't quickly become stale (full of stuff that no one uses), and wasn't an iteration bottleneck (touched the forward-declare header? recompile everything!). Generally they develop all three problems.
The core of your problem is system design. These subsystems you've mentioned should probably be including the header files that define the types they need to take as input or output. By breaking types that are being used by multiple subsystems into their own header file you'll strike a nice balance between isolation and efficient interop between subsystems.

Having done a lot of brownfield maintenance, I've never been fond of headers that do nothing but include other files or hold forward declarations. I prefer to just have them in the header file. You can reduce the typing with the use of editor templates, if your tools support them.
You could write a template that expands into your desired text. I would probably include something to make it stand out like
///Begin Forwarding
...
///End Forwarding
That would make it easy to grab and replace if you change the template. If you're more comfortable with tools like grep you could automate the updating from a command line. It would probably be simple to write a script that would update all files, or only the files passed in on the command line. Just a thought.

I don't think there is a single "best" solution, each has its own advantages and drawbacks. Even though it's more work, I personally favor the "each header file has its own forward declarations" approach, for the following reasons:
It's as lean as it can get: No additional files that need to be found and parsed.
No obfuscation: Just by looking at the header file you see exactly which types it needs.
No unnecessary namespace pollution. If you collect forward declarations in a ProjectForwards.h file, that file will contain the sum of all declarations needed by all of its consumers. So if only a single consumer needs a certain declaration, all the others will inherit it, too.
If these arguments are not convincing, maybe because they are too puristic :-), then I would suggest following the middle way of splitting ProjectForwards.h.
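To illustrate the per-header style, here is a minimal sketch (all names hypothetical) of a header that carries exactly the forward declarations it needs and nothing more:
// WidgetController.h
#ifndef WIDGET_CONTROLLER_H
#define WIDGET_CONTROLLER_H
class Widget;     // used only by reference below, so no #include needed
class EventQueue; // likewise, used only by pointer
class WidgetController
{
public:
    void attach(Widget& widget);
    void post(EventQueue* queue);
};
#endif
Anyone reading this header sees at a glance which types it depends on, and no consumer inherits declarations it doesn't use.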

Here's what I generally do:
It's well known that using forward declarations is preferable to using #includes in header files, but what's the best way to manage forward declarations?
Library: Provide a dedicated client forward header (e.g. #include "MONThread/include.fwd.hpp"). Keep libraries focused (small-ish), and make implementations private where possible.
Executable: Forward declare on demand, unless it comes from a library -- always use the library's forward include. Recognize what should be a library (logical or physical) -- many forwards suggest this, as patterns will emerge. Also try to isolate what can be hidden in the process. With libraries and executables, there should be some use of package private types -- these types do not belong in the client's forward headers.
So then I made a ProjectForwards.h file that contains all of my forward declarations and included that wherever it was needed. At first, this seemed like a good idea - much less redundancy, and much easier maintenance of typedefs. But now, as a result of using ProjectForwards.h so heavily, whenever I add a new class to it, I have to rebuild the world, which slows development.
Usually, that means too many large libraries are visible at high levels of the include graph. An ideal include graph (of a large system) is much wider than it is tall -- including what it needs, with minimal excess. If every TU pulls in a few hundred thousand lines, you have a serious problem -- start removing large libraries from the high levels.
If that really sounds unsatisfactory, analyze your program's dependencies.
Many people make the mistake (in larger projects) of including a ton of large libraries for convenience (e.g. in the pch), which results in recompiling the world (and the pch).
Evaluate your dependencies from time to time -- set some soft sensible limits for line count of preprocessor output.
The forward headers replace local forward declarations. They do not (generally) belong in the pch.

I personally only include in the global ProjectForwards.h the declarations that are truly global to all, or mostly all, the program. It could also include other files that are almost always needed, for example:
#include <string>
#include <vector>
#include <boost/shared_ptr.hpp>
std::string get_installation_dir();
//...
That way this file rarely changes, and there is no need for frequent rebuilds.
Also, if this file includes a bunch of standard headers, it would be a perfect candidate to be a pre-compiled header!

I was manually adding to each header file the forward declarations that were needed by that header file.
This is the only good way.
Also, if you have a typedef somewhere, it is often better to mask it behind a struct. For example, instead of using a typedef like this:
typedef std::vector< MyClass > MyClassArray;
do this instead:
struct MyClassArray
{
    std::vector< MyClass > t;
};
The bad thing is that you will not be able to use the underlying type's operators directly, so this will not always work. For example, if you have
typedef std::string MyString;
then it is better to go with typedef.
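A short sketch of that drawback, assuming MyClass is default-constructible: the struct wrapper does not inherit the vector's interface, so every use must go through the t member.
MyClassArray arr;
arr.t.push_back(MyClass());   // was: arr.push_back(MyClass());
std::size_t n = arr.t.size(); // was: arr.size();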
So then I made a ProjectForwards.h file that contains all of my forward declarations and included that wherever it was needed.
As you discovered, this is a very bad idea. Whenever you modify this header, you'll trigger the recompilation of all files that include it (directly or indirectly).

There is no escaping forward declarations where they are needed.
In your model, if objects of one type communicate with objects of another type using interfaces only, then you will minimize the number of forward declarations to interfaces only.
If you use templates, then you can put typedefs for them in the precompiled header file.

Related

Is it possible to write a header file without an include guard, and without multiple definition errors?

Just out of curiosity I wanted to know if is there a way to achieve this.
In C++ we learn that we should avoid using macros. But when we use include guards, we do use at least one macro. So I was wondering if there is a way to write a macro-free program.
It's definitely possible, though it's unimaginably bad practice not to have include guards. It's important to understand what the #include statement actually does: the contents of another file are pasted directly into your source file before it's compiled. An include guard prevents the same code from being pasted again.
Including a file only causes an error if it would be incorrect to type the contents of that file at the position you included it. As an example, you can declare (note: declare, not define) the same function (or class) multiple times in a single compilation unit. If your header file consists only of declarations, you don't need to specify an include guard.
IncludedFile.h
class SomeClassSomewhere;
void SomeExternalFunction(int x, char y);
Main.cpp
#include "IncludedFile.h"
#include "IncludedFile.h"
#include "IncludedFile.h"
int main(int argc, char **argv)
{
return 0;
}
While declaring a function (or class) multiple times is fine, it isn't okay to define the same function (or class) more than once. If there are two or more definitions for a function, the linker doesn't know which one to choose and gives up with a "multiply defined symbols" error.
In C++, it's very common for header files to include class definitions. An include guard prevents the #included file from being pasted into your source file a second time, which means your definitions will only appear once in the compiled code, and the linker won't be confused.
Rather than trying to figure out when you need to use them and when you don't, just always use include guards. Avoiding macros most of the time is a good idea; this is one situation where they aren't evil, and using them here isn't dangerous.
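For completeness, here is the same IncludedFile.h from above with the classic guard added; the macro name is an arbitrary, project-unique choice:
// IncludedFile.h
#ifndef INCLUDED_FILE_H
#define INCLUDED_FILE_H
class SomeClassSomewhere;
void SomeExternalFunction(int x, char y);
#endif // INCLUDED_FILE_H
With the guard in place, the second and third #include in Main.cpp paste nothing, so the header stays safe to include even once it grows definitions.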
It is definitely doable, and I have used some early C++ libraries which followed an already misguided approach from C that essentially required the user of a header to include certain other headers before it. This is based on thoroughly understanding what creates a dependency on what else, and on using declarations rather than definitions wherever possible:
Declarations can be repeated multiple times, although they are obviously required to be consistent, and some entities can't be declared (e.g. an enum could historically only be defined; since C++11 it is possible to also declare enums).
Definitions can't be repeated, but are only needed when the definition is really used. For example, using a pointer or a reference to a class doesn't need its definition but only its declaration.
The approach to writing headers would, thus, essentially consist of avoiding definitions as much as possible and using only declarations as far as possible: these can be repeated in a header file, or the corresponding headers can even be included multiple times. The primary need for definitions comes in when you need to derive from a base class: this can't be avoided and essentially means that the user would have to include the header for the base class before using any of the derived classes. The same is true for members defined directly in the class, but using the pimpl idiom the need for member definitions can be pushed to the implementation file.
Although there are a few advantages to this approach, it also has a few severe drawbacks. The primary advantage is that it kind of enforces a very thorough separation and dependency management. On the other hand, overly aggressive separation, e.g. using the pimpl idiom for everything, also has a negative performance impact. The biggest drawback is that a lot of the implementation details are implicitly visible to the user of a header, because the respective headers this one depends on need to be included first explicitly. At least the compiler enforces that you get the order of include files right.
From a usability and dependency point of view I think there is a general consensus that headers are best self-contained and that the use of include guards is the lesser evil.
It is possible to do so if you ensure the same header file is not being included in the same translation unit multiple times.
Also, you could use:
#pragma once
if portability is not your concern.
However, you should prefer the classic include guards over #pragma once because:
It is not standard & hence non portable.
It is less intuitive and not all users might know of it.
It provides no big advantage over the classic and very well known Include Guards.
In short, yes, even without pragmas. Only if you can guarantee that every header file is included only once. However, given how code tends to grow, it becomes increasingly difficult to honour that guarantee as the number of header files increase. This is why not using header guards is considered bad practice.
Pre-processor macros are frowned upon, yes. However, header include guards are a necessary evil because the alternative is so much worse (#pragma once will only work if your compiler supports it, so you lose portability)
With regard to pre-processor macros, use this rule:
If you can come up with an elegant solution that does not involve a macro, then avoid them.
Does the non-portable, non-standard
#pragma once
work sufficiently well for you? Personally, I'd rather use macros for preventing reinclusion, but that's your decision.

Why is including a header file such an evil thing?

I have seen many explanations on when to use forward declarations over including header files, but few of them go into why it is important to do so. Some of the reasons I have seen include the following:
compilation speed
reducing complexity of header file management
removing cyclic dependencies
Coming from a .NET background I find header management frustrating. I have this feeling I need to master forward declarations, but I have been scraping by on includes so far.
Why cannot the compiler work for me and figure out my dependencies using one mechanism (includes)?
How do forward declarations speed up compilations since at some point the object referenced will need to be compiled?
I can buy the argument for reduced complexity, but what would a practical example of this be?
"to master forward declarations" is not a requirement, it's a useful guideline where possible.
When a header is included, and it pulls in more headers, and yet more, the compiler has to do a lot of work processing a single translation module.
You can see how much, for example, with gcc -E:
A single #include <iostream> gives my g++ 4.5.2 an additional 18,560 lines of code to process.
A #include <boost/asio.hpp> adds another 74,906 lines.
A #include <boost/spirit/include/qi.hpp> adds 154,024 lines, that's over 5 MB of code.
This adds up, especially if carelessly included in some file that's included in every file of your project.
Sometimes going over old code and pruning unnecessary includes improves the compilation dramatically just because of that. Replacing includes with forward declarations in the translation modules where only references or pointers to some class are used, improves this even further.
Why cannot the compiler work for me and figure out my dependencies using one mechanism (includes)?
It cannot because, unlike some other languages, C++ has an ambiguous grammar:
int f(X);
Is it a function declaration or a variable definition? To answer this question the compiler must know what X means, so X must be declared before that line.
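A small sketch of the two readings side by side (hypothetical names); both snippets are valid C++:
class X;    // with a type X in scope...
int f(X);   // ...this declares a function f taking an X and returning int
int x = 42; // but with a variable instead...
int g(x);   // ...the same shape defines an int named g, direct-initialized from x (same as int g = x;)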
Because when you're doing something like this:
bar.h:
class Foo; // a declaration of Foo is still required; its definition is not
class Bar {
    int foo(Foo &);
};
Then the compiler does not need to know how the Foo struct/class is defined; so importing the header that defines Foo is useless. Moreover, importing the header that defines Foo might also require importing the header that defines some other class that Foo uses; and this might mean importing the header that defines some other class, etc.... turtles all the way.
In the end, the file that the compiler is working against is almost like the result of copy-pasting all the headers; so it will get big for no good reason, and when someone makes a typo in a header file that you don't need (or import, or something like that), then compiling your class starts to take waaay too much time (or fail for no obvious reason).
So it's a good thing to give as little info as needed to the compiler.
How do forward declarations speed up compilations since at some point the object referenced will need to be compiled?
1) reduced disk i/o (fewer files to open, fewer times)
2) reduced memory/cpu usage
most translations need only a name. if you use/allocate the object, you'll need its definition.
this is probably where it will click for you: each file you compile compiles what is visible in its translation.
a poorly maintained system will end up including a ton of stuff it does not need - then this gets compiled for every file it sees. by using forwards where possible, you can bypass that, and significantly reduce the number of times a public interface (and all of its included dependencies) must be compiled.
that is to say: the content of the header won't be compiled once. it will be compiled over and over. everything in this translation must be parsed, checked that it's a valid program, checked for warnings, optimized, etc. many, many times.
including lazily only adds significant disk/cpu/memory overhead, which turns into intolerable build times for you, while introducing significant dependencies (in non-trivial projects).
I can buy the argument for reduced complexity, but what would a practical example of this be?
unnecessary includes introduce dependencies as side effects. when you edit an include (necessary or not), then every file which includes it must be recompiled (not trivial when hundreds of thousands of files must be unnecessarily opened and compiled).
Lakos wrote a good book which covers this in detail:
http://www.amazon.com/Large-Scale-Software-Design-John-Lakos/dp/0201633620/ref=sr_1_1?ie=UTF8&s=books&qid=1304529571&sr=8-1
Header file inclusion rules specified in this article will help reduce the effort in managing header files.
I used forward declarations simply to reduce the amount of navigation between source files. e.g. if module X calls some glue or interface function F in module Y, then using a forward declaration means writing the function and the call can be done by visiting only 2 places, X.c and Y.c. Not so much of an issue when a good IDE helps you navigate, but I tend to prefer coding bottom-up, creating working code and then figuring out how to wrap it, rather than top-down interface specification. As the interfaces themselves evolve, it's handy to not have to write them out in full.
In C (or C++ minus classes) it's possible to truly keep structure details private by only defining them in the source files that use them, and only exposing forward declarations to the outside world - a level of black-boxing that requires performance-destroying virtuals in the C++/classes way of doing things. It's also possible to avoid needing to prototype things (visiting the header) by ordering definitions bottom-up within the source files (good old static keyword).
The pain of managing headers can sometimes expose how modular your program is or isn't - if it's truly modular, the number of headers you have to visit and the amount of code and data structures declared within them should be minimized.
Working on a big project with 'everything included everywhere' through precompiled headers won't encourage this real modularity.
Module dependencies can correlate with data flow relating to performance issues, i.e. both i-cache and d-cache issues. If a program involves many modules that call each other and modify data at many random places, it's likely to have poor cache coherency - the process of optimizing such a program will often involve breaking up passes and adding intermediate data, often playing havoc with many 'class diagrams'/'frameworks' (or at least requiring the creation of many intermediate data structures). Heavy template use often means complex pointer-chasing, cache-destroying data structures. In its optimized state, dependencies and pointer chasing will be reduced.
I believe forward declarations speed up compilation because the header file is ONLY included where it is actually used. This reduces the number of times the file must be opened and parsed. You are correct that at some point the object referenced will need to be compiled, but if I am only using a pointer to that object in my other .h file, why actually include it? If I tell the compiler I am using a pointer to a class, that's all it needs (as long as I am not calling any methods on that class).
This is not the end of it. Those .h files include other .h files... So, for a large project, opening, reading, and closing all the .h files which are included repetitively can become a significant overhead. Even with #if include guards, you still have to open and close them a lot.
We practice this at my place of employment. My boss explained this in a similar way, but I'm sure his explanation was clearer.
How do forward declarations speed up compilations since at some point the object referenced will need to be compiled?
Because include is a preprocessor thing, which means it is done via brute force when parsing the file. Your object will be compiled once (compiler) then linked (linker) as appropriate later.
In C/C++, when you compile, you've got to remember there is a whole chain of tools involved (preprocessor, compiler, linker plus build management tools like make or Visual Studio, etc...)
Good and evil. The battle continues, but now on the battlefield of header files. Header files are a necessity and a feature of the language, but they can create a lot of unnecessary overhead if used in a non-optimal way, e.g. by not using forward declarations.
How do forward declarations speed up compilations since at some point the object referenced will need to be compiled?
I can buy the argument for reduced complexity, but what would a practical example of this be?
Forward declarations are badass. My experience is that a lot of C++ programmers are not aware of the fact that you don't have to include a header file unless you actually want to use some type, i.e. you need to have the type defined so the compiler understands what you want to do. It's important to try to refrain from including header files in other header files.
Just passing around a pointer from one function to another, only requires a forward declaration:
// someFile.h
class CSomeClass;
void SomeFunctionUsingSomeClass(CSomeClass* foo);
Including someFile.h does not require you to include the header file of CSomeClass, since you are merely passing a pointer to it, not using the class. This means that the compiler only needs to parse one line (class CSomeClass;) instead of an entire header file (that might be chained to other header files etc etc).
This reduces both compile time and link time, and we are talking big optimizations here if you have many headers and many classes.

Is it worth forward-declaring library classes?

I've just started learning Qt, using their tutorial. I'm currently on tutorial 7, where we've made a new LCDRange class. The implementation of LCDRange (the .cpp file) uses the Qt QSlider class, so in the .cpp file is
#include <QSlider>
but in the header is a forward declaration:
class QSlider;
According to Qt,
This is another classic trick, but one that's much less often used. Because we don't need QSlider in the interface of the class, only in the implementation, we use a forward declaration of the class in the header file and include the header file for QSlider in the .cpp file.
This makes the compilation of big projects much faster, because the compiler usually spends most of its time parsing header files, not the actual source code. This trick alone can often speed up compilations by a factor of two or more.
Is this worth doing? It seems to make sense, but it's one more thing to keep track of - I feel it would be much simpler just to include everything in the header file.
Absolutely. The C/C++ build model is ...ahem... an anachronism (to say the least). For large projects it becomes a serious PITA.
As Neil notes correctly, this should not be the default approach for your class design, don't go out of your way unless you really need to.
Breaking circular include references is the one case where you have to use forward declarations.
// a.h
#include "b.h"
struct A { B * b; };
// b.h
#include "a.h" // circular include reference
struct B { A * a; };
// Solution: break the circular reference with a forward declaration of B or A:
// b.h
struct A; // forward declaration instead of #include "a.h"
struct B { A * a; };
Reducing rebuild time - Imagine the following code
// foo.h
#include <QSlider>
class Foo
{
    QSlider * someSlider;
};
now every .cpp file that directly or indirectly pulls in Foo.h also pulls in QSlider.h and all of its dependencies. That may be hundreds of .cpp files! (Precompiled headers help a bit - and sometimes a lot - but they turn disk/CPU pressure into memory/disk pressure, and thus soon hit the "next" limit.)
If the header needs only a forward declaration, this dependency can often be limited to a few files, e.g. foo.cpp.
Reducing incremental build time - The effect is even more pronounced, when dealing with your own (rather than stable library) headers. Imagine you have
// bar.h
#include "foo.h"
class Bar
{
    Foo * kungFoo;
    // ...
};
Now if most of your .cpp's need to pull in bar.h, they also indirectly pull in foo.h. Thus, every change of foo.h triggers build of all these .cpp files (which might not even need to know Foo!). If bar.h uses a forward declaration for Foo instead, the dependency on foo.h is limited to bar.cpp:
// bar.h
class Foo;
class Bar
{
    Foo * kungFoo;
    // ...
};
// bar.cpp
#include "bar.h"
#include "foo.h"
// ...
It is so common that it is a pattern - the PIMPL pattern. Its use is two-fold: first it provides true interface/implementation isolation; the other is reducing build dependencies. In practice, I'd weigh their usefulness 50:50.
You need a pointer or reference in the header; you can't have a direct instantiation of the dependent type. This limits the cases where forward declarations can be applied. If you do it explicitly, it is common to use a utility class (such as boost::scoped_ptr) for that.
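A minimal Pimpl sketch (hypothetical names), showing how the header sheds its implementation dependencies; std::unique_ptr stands in here for the boost::scoped_ptr mentioned above:
// engine.h
#include <memory>
class Engine
{
public:
    Engine();
    ~Engine(); // must be defined in engine.cpp, where Impl is complete
    void run();
private:
    struct Impl;                // defined only in engine.cpp
    std::unique_ptr<Impl> impl; // the header needs no implementation includes
};
// engine.cpp
#include "engine.h"
#include "heavy_library.h" // hypothetical implementation-only dependency
struct Engine::Impl { /* heavy members live here */ };
Engine::Engine() : impl(new Impl) {}
Engine::~Engine() = default; // Impl is complete here, so unique_ptr can delete it
void Engine::run() { /* ... */ }
Changes to heavy_library.h now rebuild only engine.cpp, not every client of engine.h.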
Is build time worth it? Definitely, I'd say. In the worst case, build time grows polynomially with the number of files in the project. Other techniques - like faster machines and parallel builds - can provide only percentage gains.
The faster the build, the more often developers test what they did, the more often unit tests run, the faster build breaks can be found and fixed, and the less often developers end up procrastinating.
In practice, managing your build time, while essential on a large project (say, hundreds of source files), still makes a "comfort difference" on small projects. Also, adding improvements after the fact is often an exercise in patience, as a single fix might shave only seconds (or less) off a 40-minute build.
I use it all the time. My rule is: if it doesn't need the header, then I put a forward declaration ("use headers if you must, use forward declarations if you can"). The only thing that sucks is that I need to know how the class was declared (struct/class, and if it is a template I need its parameters, ...). But in the vast majority of cases, it just comes down to "class Slider;" or something along those lines. If something requires more hassle to be declared, one can always provide a dedicated forward-declaration header, like the Standard does with <iosfwd>.
Not including the header file will not only reduce compile time but also will avoid polluting the namespace. Files including the header will thank you for including as little as possible so they can keep using a clean environment.
This is the rough plan:
/* --- --- --- Y.hpp */
class X;
class Y {
    X *x;
};
/* --- --- --- Y.cpp */
#include "x.hpp"
#include "y.hpp"
...
There are smart pointers that are specifically designed to work with pointers to incomplete types. One very well known one is boost::shared_ptr.
Yes, it sure does help. Another thing to add to your repertoire is precompiled headers if you are worried about compilation time.
Look up FAQ 39.12 and 39.13
The standard library does this for some of the iostream classes in the standard header <iosfwd>. However, it is not a generally applicable technique - notice there are no such headers for the other standard library types, and it should not (IMHO) be your default approach to designing class hierarchies.
Although this seems to be a favourite "optimisation" for programmers, I suspect that, like most optimisations, few of them have actually timed the build of their projects both with and without such declarations. My limited experiments in this area indicate that the use of pre-compiled headers in modern compilers makes it unnecessary.
There is a HUGE difference in compile times for larger projects, even ones with carefully managed dependencies. You'd better get into the habit of forward declaring and keeping as much as possible out of header files, because a lot of software shops which use C++ require it. The reason why you don't see it all that much in the standard header files is that those make heavy use of templates, at which point forward declaring becomes hard. For MSVC you can use /P to take a look at how the preprocessed file looks before actual compilation. If you haven't done any forward declaration in your project, it would probably be an interesting experience to see how much extra processing needs to be done.
In general, no.
I used to forward declare as much as I could, but no longer.
As far as Qt is concerned, you may notice that there is a <QtGui> include file that will pull in all the GUI Widgets. Also, there is a <QtCore>, <QtWebKit>, <QtNetwork> etc. There's a header file for each module. It seems the Qt team believes this is the preferred method also. They say so in their module documentation.
True, the compilation time may be increased. But in my experience it's just not that much. And if it were, using precompiled headers would be the next step.
When you write ...
include "foo.h"
... you thereby instruct a conventional build system "Any time there is any change whatsover in the library file foo.h, discard this compilation unit and rebuild it, even if all that happened to foo.h was the addition of a comment, or the addition of a comment to some file which foo.h includes; even if all that happened was some ultra-fastidious colleague re-balanced the curly braces; even if nothing happened other than a pressured colleague checked in foo.h unchanged and inadvertently changed its timestamp."
Why would you want to issue such a command? Library headers, because in general they have more human readers than application headers, have a special vulnerability to changes that have no impact on the binary, such as improved documentation of functions and arguments or the bump of a version number or copyright date.
The C++ rules allow namespace to be re-opened at any point in a compilation unit (unlike a struct or class) in order to support forward declaration.
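A tiny sketch of what that permits (the project/Bar names echo the longer example later in this thread):
namespace project { class Bar; } // namespace re-opened just for the declaration
namespace project
{
    class Foo
    {
        Bar* bar; // a pointer member needs only the declaration above
    };
}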
Forward declarations are very useful for breaking the circular dependencies, and sometimes may be ok to use with your own code, but using them with library code may break the program on another platform or with other versions of the library (this will happen even with your code if you're not careful enough). IMHO not worth it.

Is it a good practice to place C++ definitions in header files?

My personal style with C++ has always to put class declarations in an include file, and definitions in a .cpp file, very much like stipulated in Loki's answer to C++ Header Files, Code Separation. Admittedly, part of the reason I like this style probably has to do with all the years I spent coding Modula-2 and Ada, both of which have a similar scheme with specification files and body files.
I have a coworker, much more knowledgeable in C++ than I, who is insisting that all C++ declarations should, where possible, include the definitions right there in the header file. He's not saying this is a valid alternate style, or even a slightly better style, but rather this is the new universally-accepted style that everyone is now using for C++.
I'm not as limber as I used to be, so I'm not really anxious to scrabble up onto this bandwagon of his until I see a few more people up there with him. So how common is this idiom really?
Just to give some structure to the answers: Is it now The Way™, very common, somewhat common, uncommon, or bug-out crazy?
Your coworker is wrong, the common way is and always has been to put code in .cpp files (or whatever extension you like) and declarations in headers.
There is occasionally some merit to putting code in the header, this can allow more clever inlining by the compiler. But at the same time, it can destroy your compile times since all code has to be processed every time it is included by the compiler.
Finally, it is often annoying to have circular object relationships (sometimes desired) when all the code is in the headers.
Bottom line, you were right, he is wrong.
EDIT: I have been thinking about your question. There is one case where what he says is true: templates. Many newer "modern" libraries such as boost make heavy use of templates and often are "header only." However, this should only be done when dealing with templates, as it is the only way to do it when dealing with them.
EDIT: Some people would like a little more clarification, here's some thoughts on the downsides to writing "header only" code:
If you search around, you will see quite a lot of people trying to find a way to reduce compile times when dealing with boost. For example: How to reduce compilation times with Boost Asio, which is seeing a 14s compile of a single 1K file with boost included. 14s may not seem to be "exploding", but it is certainly a lot longer than typical and can add up quite quickly when dealing with a large project. Header only libraries do affect compile times in a quite measurable way. We just tolerate it because boost is so useful.
Additionally, there are many things which cannot be done in headers only (even boost has libraries you need to link to for certain parts such as threads, filesystem, etc). A Primary example is that you cannot have simple global objects in header only libs (unless you resort to the abomination that is a singleton) as you will run into multiple definition errors. NOTE: C++17's inline variables will make this particular example doable in the future.
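A minimal sketch of that C++17 feature (hypothetical names): an inline variable at namespace scope may be defined in a header included everywhere, and the linker folds all copies into one object.
// config.h
#ifndef CONFIG_H
#define CONFIG_H
#include <string>
inline std::string g_install_dir = "/opt/app"; // one object across all translation units
#endif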
As a final point, when using boost as an example of header only code, a huge detail often gets missed.
Boost is a library, not user-level code, so it doesn't change that often. In user code, if you put everything in headers, every little change will cause you to have to recompile the entire project. That's a monumental waste of time (and is not the case for libraries that don't change from compile to compile). When you split things between header/source and, better yet, use forward declarations to reduce includes, you can save hours of recompiling when added up across a day.
The day C++ coders agree on The Way, lambs will lie down with lions, Palestinians will embrace Israelis, and cats and dogs will be allowed to marry.
The separation between .h and .cpp files is mostly arbitrary at this point, a vestige of compiler optimizations long past. To my eye, declarations belong in the header and definitions belong in the implementation file. But, that's just habit, not religion.
Code in headers is generally a bad idea since it forces recompilation of all files that includes the header when you change the actual code rather than the declarations. It will also slow down compilation since you'll need to parse the code in every file that includes the header.
A reason to have code in header files is that it's generally needed for the keyword inline to work properly, and when using templates that are instantiated in other cpp files.
What might be informing your coworker is a notion that most C++ code should be templated to allow for maximum usability. And if it's templated, then everything will need to be in a header file, so that client code can see it and instantiate it. If it's good enough for Boost and the STL, it's good enough for us.
I don't agree with this point of view, but it may be where it's coming from.
I think your co-worker is smart and you are also correct.
The useful things I found that putting everything into the headers is that:
No need to write and keep in sync both headers and sources.
The structure is plain, and the absence of circular dependencies forces the coder to make a "better" structure.
Portable, and easy to embed in a new project.
I do agree with the compiling time problem, but I think we should notice that:
A change to a source file is very likely to change the header files too, which leads to the whole project being recompiled anyway.
Compiling is much faster than it used to be. And if you have a project that takes a long time to build and is built frequently, that may indicate design flaws. Separating the tasks into different projects and modules can avoid this problem.
Lastly, I just want to support your co-worker; this is just my personal view.
Often I'll put trivial member functions into the header file, to allow them to be inlined. But to put the entire body of code there, just to be consistent with templates? That's plain nuts.
Remember: A foolish consistency is the hobgoblin of little minds.
As Tuomas said, your header should be minimal. To be complete I will expand a bit.
I personally use 4 types of files in my C++ projects:
Public:
Forwarding header: in the case of templates etc., this file gets the forward declarations that will appear in the header.
Header: this file includes the forwarding header, if any, and declares everything that I wish to be public (and defines the classes...).
Private:
Private header: this file is a header reserved for implementation, it includes the header and declares the helper functions / structures (for Pimpl for example or predicates). Skip if unnecessary.
Source file: it includes the private header (or header if no private header) and defines everything (non-template...)
Furthermore, I couple this with another rule: Do not define what you can forward declare. Though of course I am reasonable there (using Pimpl everywhere is quite a hassle).
It means that I prefer a forward declaration over an #include directive in my headers whenever I can get away with them.
Finally, I also use a visibility rule: I limit the scopes of my symbols as much as possible so that they do not pollute the outer scopes.
Putting it altogether:
// example_fwd.hpp
// Here necessary to forward declare the template class,
// you don't want people to declare them in case you wish to add
// another template symbol (with a default) later on
class MyClass;
template <class T> class MyClassT;
// example.hpp
#include "project/example_fwd.hpp"
// Those can't really be skipped
#include <string>
#include <vector>
#include "project/pimpl.hpp"
// Those can be forward declared easily
#include "project/foo_fwd.hpp"
namespace project { class Bar; }
namespace project
{
class MyClass
{
public:
struct Color // Limiting scope of enum
{
enum type { Red, Orange, Green };
};
typedef Color::type Color_t;
public:
MyClass(); // because of pimpl, I need to define the constructor
private:
struct Impl;
pimpl<Impl> mImpl; // I won't describe pimpl here :p
};
template <class T> class MyClassT: public MyClass {};
} // namespace project
// example_impl.hpp (not visible to clients)
#include "project/example.hpp"
#include "project/bar.hpp"
template <class T> void check(MyClassT<T> const& c) { }
// example.cpp
#include "example_impl.hpp"
// MyClass definition
The lifesaver here is that most of the time the forward header is useless: it's only necessary in the case of typedefs or templates, and so is the implementation header ;)
To add more fun you can add .ipp files which contain the template implementation (that is being included in .hpp), while .hpp contains the interface.
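A minimal sketch of that split (hypothetical names):
// stack.hpp - interface only
#ifndef STACK_HPP
#define STACK_HPP
template <class T>
class Stack
{
public:
    void push(T const& value);
};
#include "stack.ipp" // pull the implementation back in at the bottom
#endif
// stack.ipp - template implementation, separated for readability
template <class T>
void Stack<T>::push(T const& value)
{
    // ...
}
The compiler still sees everything, but readers of stack.hpp see only the interface.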
Apart from templatized code (depending on the project this can be the majority or minority of files), there is normal code, and here it is better to separate the declarations and definitions. Provide forward declarations where needed as well - this may have an effect on the compilation time.
Generally, when writing a new class, I will put all the code in the class, so I don't have to look in another file for it. After everything is working, I break the bodies of the methods out into the cpp file, leaving the prototypes in the hpp file.
I personally do this in my header files:
// class-declaration
// inline-method-declarations
I don't like mixing the code for the methods in with the class as I find it a pain to look things up quickly.
I would not put ALL of the methods in the header file. The compiler will (normally) not be able to inline virtual methods and will (likely) only inline small methods without loops (totally depends on the compiler).
Doing the methods in the class is valid... but from a readability point of view I don't like it. Putting the methods in the header does mean that, when possible, they will get inlined.
I think that it's absolutely absurd to put ALL of your function definitions into the header file. Why? Because the header file is used as the PUBLIC interface to your class. It's the outside of the "black box".
When you need to look at a class to reference how to use it, you should look at the header file. The header file should give a list of what it can do (commented to describe the details of how to use each function), and it should include a list of the member variables. It SHOULD NOT include HOW each individual function is implemented, because that's a boat load of unnecessary information and only clutters the header file.
If this new way is really The Way, we might have been running into different direction in our projects.
Because we try to avoid all unnecessary things in headers. That includes avoiding header cascade. Code in headers will probably need some other header to be included, which will need another header, and so on. If we are forced to use templates, we try to avoid littering headers with template stuff too much.
Also we use "opaque pointer"-pattern when applicable.
With these practices we can do faster builds than most of our peers. And yes... changing code or class members will not cause huge rebuilds.
I put all the implementation out of the class definition. I want to have the doxygen comments out of the class definition.
IMHO, He has merit ONLY if he's doing templates and/or metaprogramming. There's plenty of reasons already mentioned that you limit header files to just declarations. They're just that... headers. If you want to include code, you compile it as a library and link it up.
Doesn't that really depend on the complexity of the system, and the in-house conventions?
At the moment I am working on a neural network simulator that is incredibly complex, and the accepted style that I am expected to use is:
Class definitions in classname.h
Class code in classnameCode.h
executable code in classname.cpp
This splits up the user-built simulations from the developer-built base classes, and works best in the situation.
However, I'd be surprised to see people do this in, say, a graphics application, or any other application whose purpose is not to provide users with a code base.
Template code should be in headers only. Apart from that, all definitions except inlines should be in .cpp. The best argument for this would be the standard library implementations, which follow the same rule. You would hardly argue that the standard library developers are wrong about this.
I think your co-worker is right as long as he does not go so far as to put executable code in the header.
The right balance, I think, is to follow the path indicated by GNAT Ada, where the .ads file gives a perfectly adequate interface definition of the package for its users and for its children.
By the way, Ted, have you had a look on this forum at the recent question on the Ada binding to the CLIPS library you wrote several years ago, which is no longer available (the relevant Web pages are now closed)? Even if made for an old CLIPS version, this binding could be a good starting example for somebody willing to use the CLIPS inference engine within an Ada 2012 program.

Your preferred C/C++ header policy for big projects? [closed]

When working on a big C/C++ project, do you have some specific rules regarding the #include within source or header files?
For instance, we can imagine following one of these two extreme rules:
#include is forbidden in .h files; it is up to each .c file to include all the headers it needs
Each .h file should include all its dependencies, i.e. it should be able to compile alone without any error.
I suppose there is trade-off in between for any project, but what is yours? Do you have more specific rules? Or any link that argues for any of the solutions?
If you include H-files exclusively from C-files, then including a H-file into a C-file might cause compilation to fail. It might fail because you may have to include 20 other H-files upfront, and even worse, you have to include them in the right order. With a great many H-files, this system ends up being an administrative nightmare in the long run. All you wanted to do was include one H-file, and you end up spending two hours finding out which other H-files, in which order, you also need to include.
If a H-file can only be successfully included into a C-file when another H-file is included first, then the first H-file should include the second one, and so on. That way you can simply include every H-file into every C-file you like without having to fear that this may break compilation. That way you only specify your direct dependencies, and if these dependencies themselves also have dependencies, it's up to them to specify those.
On the other hand, don't include H-files into H-files if that isn't necessary. hashtable.h should only include other header files that are required to use your hashtable implementation. If the implementation itself needs hashing.h, then include it in hashtable.c, not in hashtable.h, as only the implementation needs it, not the code that only would like to use the final hashtable.
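A sketch of that rule applied to the hashtable example (everything beyond the hashtable.h/hashing.h names from the text is hypothetical):
// hashtable.h - only what users of the hashtable need
#ifndef HASHTABLE_H
#define HASHTABLE_H
#include <stddef.h> // size_t appears in the interface, so this include belongs here
typedef struct HashTable HashTable; // opaque type: users only ever hold a pointer
HashTable* ht_create(size_t buckets);
void ht_destroy(HashTable* table);
#endif
// hashtable.c pulls in its own dependencies:
//   #include "hashtable.h"
//   #include "hashing.h"  <- needed only by the implementation, so it stays out of hashtable.h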
I think both suggested rules are bad. For my part, I always apply:
Include only the header files required to compile a file that uses only what is defined in this header. This means:
All objects present only as references or pointers should be forward-declared.
Include all headers defining functions or objects used in the header itself.
I would use rule 2:
All Headers should be self-sufficient, be it by:
not using anything defined elsewhere
forward declaring symbols defined elsewhere
including the headers defining the symbols that can't be forward-declared.
Thus, if you have an empty C/C++ source file, including a header should compile correctly.
Then, in the C/C++ source file, include only what is necessary: if HeaderA forward-declared a symbol defined in HeaderB, and you use this symbol, you'll have to include both... The good news is that if you don't use the forward-declared symbol, then you'll be able to include only HeaderA and avoid including HeaderB.
Note that playing with templates makes this verification "empty source including your header should compile" somewhat more complicated (and amusing...)
The first rule will fail as soon as there are circular dependencies. So it cannot be applied strictly.
(This can still be made to work but this shifts a whole lot of work from the programmer to the consumer of these libraries which is obviously wrong.)
I'm all in favour of rule 2 (although it might be good to include “forward declaration headers” instead of the real deal, as in <iosfwd> because this reduces compile time). Generally, I believe it's a kind of self-documentation if a header file “declares” what dependencies it has – and what better way to do this than to include the required files?
EDIT:
In the comments, I've been challenged that circular dependencies between headers are a sign of bad design and should be avoided.
That's not correct. In fact, circular dependencies between classes may be unavoidable and aren't a sign of bad design at all. Examples are abundant, let me just mention the Observer pattern which has a circular reference between the observer and the subject.
To resolve the circularity between classes, you have to employ forward declaration because the order of declaration matters in C++. Now, it is completely acceptable to handle this forward declaration in a circular manner to reduce the number of overall files and to centralize code. Admittedly, the following case doesn't merit from this scenario because there's only a single forward declaration. However, I've worked on a library where this has been much more.
// observer.hpp
struct Observer; // Forward declaration (struct, to match the definition below).
#ifndef MYLIB_OBSERVER_HPP
#define MYLIB_OBSERVER_HPP
#include "subject.hpp"
struct Observer {
virtual ~Observer() = 0;
virtual void Update(Subject* subject) = 0;
};
#endif
// subject.hpp
#include <list>
struct Subject; // Forward declaration.
#ifndef MYLIB_SUBJECT_HPP
#define MYLIB_SUBJECT_HPP
#include "observer.hpp"
struct Subject {
virtual ~Subject() = 0;
void Attach(Observer* observer);
void Detach(Observer* observer);
void Notify();
private:
std::list<Observer*> m_Observers;
};
#endif
A minimal version of 2: .h files include only the header files they specifically require to compile, using forward declarations and pimpl as much as is practical.
Always have some sort of header guard.
Do not pollute the user's global namespace by putting any using namespace statements in a header.
I'd recommend going with the second option. You often end up in the situation where you want to add something to a header file that suddenly requires another header file. And with the first option, you would have to go through and update lots of C files, sometimes not even under your control. With the second option, you simply update the header file, and users who don't even need the new functionality you just added needn't even know you did it.
The first alternative (no #includes in headers) is a major no-no for me. I want to freely #include whatever I might need without worrying about manually #includeing its dependencies as well. So, in general, I follow the second rule.
Regarding cyclic dependencies, my personal solution is to structure my projects in terms of modules rather than in terms of classes. Inside a module, all types and functions may have arbitrary dependencies on one another. Across module boundaries, there may not be circular dependencies between modules. For each module, there is a single *.hpp file and a single *.cpp file. This ensures that any forward declarations (necessary for circular dependencies, which can only happen inside a module) in a header are ultimately always resolved inside the same header. There is no need for forward-declaration-only headers whatsoever.
Pt. 1 fails when you would like to have precompiled headers through a certain header; e.g. this is what stdafx.h is for in Visual Studio: you put all common headers there...
This comes down to interface design:
Always pass by reference or pointer. If you aren't going to check the pointer, pass by reference.
Forward declare as much as possible.
Never use new in a class - create factories to do that for you and pass them to the class.
Never use pre-compiled headers.
In Windows my stdafx only ever includes afx___.h headers - no string, vector or boost libraries.
Rule nr. 1 would require you to list your header files in a very specific order (include files of base classes must go before include files of derived classes, etc), which would easily lead to compilation errors if you get the order wrong.
The trick is, as several others have mentioned, use forward declarations as much as possible, i.e. if references or pointers are used. To minimize build dependencies in this way the pimpl idiom can be useful.
I agree with Mecki; to put it more briefly:
for every header foo.h in your project, include in it only those headers that are required to make
// foo.c
#include "foo.h"
// end of foo.c
compile on its own.
(When using precompiled headers, they are allowed, of course - e.g. the #include "stdafx.h" in MSVC)
Personally I do it this way:
1 Prefer forward declarations to including other .h files in a .h file. If something is used only as a pointer/reference in that header or class, a forward declaration works without a compile error. This gives headers fewer include dependencies (and may save compile time).
2 Make .h files simple and specific. E.g., it is bad to define all constants in one file called CONST.h; it is better to divide them into several files like CONST_NETWORK.h and CONST_DB.h. Then, to use one DB constant, you needn't pull in information about networking.
3 Don't put implementation in headers. Headers are for quickly reviewing the public interface; don't pollute the declarations with implementation detail.