I am used to putting header guards around my objects like:
#ifndef SOMETHING_H
#define SOMETHING_H
class Something {
...
};
#endif
but I have been given code where they also do:
#ifndef SOMETHING_H
#include "something.h"
#endif
for every include. Supposedly, this is better. Why? Is this redundant with guards around the object?
This is discussed in pretty good detail here:
http://c2.com/cgi/wiki?RedundantIncludeGuards
Here are the highlights:
Yes, this is redundant, but for some compilers it may be faster, because the compiler can avoid opening the header file at all when it doesn't need to.
"Good compilers make this idiom unnecessary. They notice the header is using the include-guard idiom (that is, that all non-comment code in the file is bracketed with the #ifndef). They store an internal table of header files and guard macros. Before opening any file they check the current value of the guard and then skip the entire file."
"Redundant guards have several drawbacks. They make include sections significantly harder to read. They are, well, redundant. They leak the guard name, which should be a secret implementation detail of the header. If, for example, someone renames the guard they might forget to update all the places where the guard name is assumed. Finally, they go wrong if anyone adds code outside of the guard. And of course, they are just a compile-time efficiency hack. Use only when all else fails."
The thinking behind it is the preprocessor will not need to open the header file and read the contents to determine that that header has been previously included, thus saving some time during compilation. However, most compilers these days are already smart enough to spot multiple inclusions of the same file and ignore subsequent occurrences.
It's good to have this in header and class-definition files, so that at compile time, if a file is reachable by more than one path (a.cpp includes a.h and b.h, and b.h also includes a.h), a.h will not be read again, and similarly in other such cases.
The case that worries me most about your question is the same guard name being defined in different files, which can prevent the compiler from seeing necessary constants, classes, types, etc., because it "believes" the file was "already read".
Long story short, put different #ifndef constants in different files to prevent confusion.
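For example (hypothetical file and guard names); if both headers reused the same guard name, whichever was included second would silently disappear:
// vector2.h
#ifndef MYPROJECT_VECTOR2_H
#define MYPROJECT_VECTOR2_H
struct Vector2 { float x, y; };
#endif // MYPROJECT_VECTOR2_H

// matrix2.h -- a different guard, so a file including both headers sees both
#ifndef MYPROJECT_MATRIX2_H
#define MYPROJECT_MATRIX2_H
struct Matrix2 { float m[2][2]; };
#endif // MYPROJECT_MATRIX2_H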
The purpose of doing this is to save on compile time. When the compiler sees #include "something.h", it has to go out and fetch the file. If it does that ten times and the last nine all basically amount to:
#if 0
...
#endif
then you're paying the cost of finding the file and fetching it from disk nine times for no real benefit. (Technically speaking, the compiler can pull tricks to try and reduce or eliminate this cost, but that's the idea behind it.)
For small programs, the savings probably aren't very significant, and there isn't much benefit to doing it. For large programs consisting of thousands of files, it isn't uncommon for compilation to take hours, and this trick can shave off a substantial amount of time. Personally, it's not something I would do until compilation time starts becoming a real issue, and like any optimization I would look carefully at where the real costs are before running around making a bunch of changes.
Related
Just out of curiosity, I wanted to know if there is a way to achieve this.
In C++ we learn that we should avoid using macros. But when we use include guards, we do use at least one macro. So I was wondering if there is a way to write a macro-free program.
It's definitely possible, though it's unimaginably bad practice not to have include guards. It's important to understand what the #include directive actually does: the contents of another file are pasted directly into your source file before it's compiled. An include guard prevents the same code from being pasted in again.
Including a file only causes an error if it would be incorrect to type the contents of that file at the position you included it. As an example, you can declare (note: declare, not define) the same function (or class) multiple times in a single compilation unit. If your header file consists only of declarations, you don't need to specify an include guard.
IncludedFile.h
class SomeClassSomewhere;
void SomeExternalFunction(int x, char y);
Main.cpp
#include "IncludedFile.h"
#include "IncludedFile.h"
#include "IncludedFile.h"
int main(int argc, char **argv)
{
return 0;
}
While declaring a function (or class) multiple times is fine, it isn't okay to define the same function (or class) more than once. If there are two or more definitions for a function, the linker doesn't know which one to choose and gives up with a "multiply defined symbols" error.
In C++, it's very common for header files to include class definitions. An include guard prevents the #included file from being pasted into your source file a second time, which means your definitions will only appear once in the compiled code, and the linker won't be confused.
Rather than trying to figure out when you need to use them and when you don't, just always use include guards. Avoiding macros most of the time is a good idea; this is one situation where they aren't evil, and using them here isn't dangerous.
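For instance, here is the IncludedFile.h sketch from above with a guard added (the guard name is my own choice); it is harmless now and stays safe if definitions are ever added later:
// IncludedFile.h
#ifndef INCLUDED_FILE_H
#define INCLUDED_FILE_H

class SomeClassSomewhere;
void SomeExternalFunction(int x, char y);

#endif // INCLUDED_FILE_H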
It is definitely doable, and I have used some early C++ libraries which followed an already misguided approach from C that essentially required the user of a header to include certain other headers before it. This approach is based on thoroughly understanding what creates a dependency on what else, and on using declarations rather than definitions wherever possible:
Declarations can be repeated multiple times, although they are obviously required to be consistent, and some entities can't be declared separately (e.g. an enum could historically only be defined; since C++11 it is possible to also declare enums).
Definitions can't be repeated, but they are only needed when the definition is really used. For example, using a pointer or a reference to a class doesn't need its definition, only its declaration.
The approach to writing headers would thus essentially consist of avoiding definitions as much as possible and using declarations as far as possible: these can be repeated in a header file, or the corresponding headers can even be included multiple times. The primary need for definitions comes in when you need to derive from a base class: this can't be avoided, and it essentially means that the user has to include the header for the base class before using any of the derived classes. The same is true for members defined directly in the class, but using the pimpl idiom the need for member definitions can be pushed into the implementation file.
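A minimal sketch of that pimpl approach, with hypothetical names; the header exposes only a pointer, so the members can change without every client recompiling:
// widget.h -- no member definitions leak into the header
class WidgetImpl;              // declaration only; defined in widget.cpp

class Widget {
public:
    Widget();
    ~Widget();                 // defined where WidgetImpl is complete
    void draw();
private:
    WidgetImpl* impl_;         // a pointer member needs no definition here
};

// widget.cpp would be the only file needing WidgetImpl's definition:
//   #include "widget.h"
//   class WidgetImpl { /* heavy members and their headers go here */ };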
Although there are a few advantages to this approach, it also has a few severe drawbacks. The primary advantage is that it kind of enforces a very thorough separation and dependency management. On the other hand, overly aggressive separation, e.g. using the pimpl idiom for everything, also has a negative performance impact. The biggest drawback is that a lot of the implementation details are implicitly visible to the user of a header, because the respective headers this one depends on need to be explicitly included first. At least the compiler enforces that you get the order of include files right.
From a usability and dependency point of view I think there is a general consensus that headers are best self-contained and that the use of include guards is the lesser evil.
It is possible to do so if you ensure the same header file is not being included in the same translation unit multiple times.
Also, you could use:
#pragma once
if portability is not your concern.
However, you should avoid using #pragma once over include guards because:
It is not standard and hence not portable.
It is less intuitive, and not all users might know of it.
It provides no big advantage over the classic and very well known include guards.
In short, yes, even without pragmas, but only if you can guarantee that every header file is included only once. However, given how code tends to grow, it becomes increasingly difficult to honour that guarantee as the number of header files increases. This is why not using header guards is considered bad practice.
Pre-processor macros are frowned upon, yes. However, header include guards are a necessary evil, because the alternative is so much worse (#pragma once will only work if your compiler supports it, so you lose portability).
With regard to pre-processor macros, use this rule:
If you can come up with an elegant solution that does not involve a macro, then avoid them.
Does the non-portable, non-standard
#pragma once
work sufficiently well for you? Personally, I'd rather use macros for preventing reinclusion, but that's your decision.
I've been wondering if the msvc++ 2008 compiler takes care of multiple header includes of the same file, considering this example:
main.cpp
#include "header.h"
#include "header.h"
Will the compiler include this file multiple times or just one? (I'm aware I can use the #ifndef "trick" to prevent this from happening)
Also, if I include "header.h" which contains 10 functions, but I only call or use 2, will it still include all 10 or just the 2 I need and all of their needs?
#include is basically a synonym for "copy-and-paste". If you do identical #includes, the contents of that header file will be copy-and-pasted twice, sequentially.
As to your second question, it doesn't really make sense. #includes are executed by the preprocessor, which runs before the compiler and the linker. The preprocessor doesn't know or care what the content of the header file is, it simply copy-and-pastes it in. The linker may be able to eliminate unnecessary functions, but that's completely independent of the preprocessor.
No, the compiler (or, more accurately, the pre-processor) doesn't take care of this "automatically". Not in Visual C++ 2008, or in any other version. And you really wouldn't want it to.
There are two standard ways of going about this. You should choose one of them.
The first is known as include guards. That's the "#ifndef trick" you mentioned in your question. But it's certainly not a "trick". It's the standard idiom for handling this situation when writing C++ code, and any other programmer who looks at your source file will almost certainly expect to see include guards in there somewhere.
The other takes advantage of a VC++ feature (one that's also found its way into several other C++ toolkits) to do essentially the same thing in a way that's somewhat easier to type. By including the line #pragma once at the top of your header file, you instruct the pre-processor to only include the header file once per translation unit. This has some other advantages over include guards, but they're not particularly relevant here.
As for your second question, the linker will take care of "optimizing" out functions that you never call in your code. But this is the last phase of compilation, and has nothing to do with #include, which is handled by the pre-processor, as I mentioned above.
The MSVC 20xx preprocessor (not the compiler -- the compiler never sees preprocessor directives) does not in any sense "take care of" multiple #includes of the same file. If a file is #included twice, the preprocessor obeys the #includes and includes the file two times. (Just imagine the chaos if the preprocessor even thought about trying to correct your source file's "bad" #include behavior.)
Because the preprocessor is so meticulous and careful about following your instructions, each #included file must protect itself from being #included twice. That protection is what we see when we find lines like these at the top of a header file:
#ifndef I_WAS_ALREADY_INCLUDED // if not defined, continue with include
#define I_WAS_ALREADY_INCLUDED // but make sure I'm not included again
[ header-file real contents ]
#endif // I_WAS_ALREADY_INCLUDED
When you write a header file, you must always be sure to protect it in this way.
Why do you care? It doesn't really add much of a burden on the compiler, because the compiler conditionally excludes code it doesn't need to compile (with #ifdefs, for example).
The preprocessor will include these headers twice. That's why guards in header files are required.
As far as I know, the linker in most cases will remove code (functions) that is never used, to reduce executable file size.
I have seen many explanations on when to use forward declarations over including header files, but few of them go into why it is important to do so. Some of the reasons I have seen include the following:
compilation speed
reducing complexity of header file management
removing cyclic dependencies
Coming from a .NET background I find header management frustrating. I have this feeling I need to master forward declarations, but I have been scraping by on includes so far.
Why cannot the compiler work for me and figure out my dependencies using one mechanism (includes)?
How do forward declarations speed up compilations since at some point the object referenced will need to be compiled?
I can buy the argument for reduced complexity, but what would a practical example of this be?
"to master forward declarations" is not a requirement, it's a useful guideline where possible.
When a header is included, and it pulls in more headers, and yet more, the compiler has to do a lot of work processing a single translation module.
You can see how much, for example, with gcc -E:
A single #include <iostream> gives my g++ 4.5.2 additional 18,560 lines of code to process.
A #include <boost/asio.hpp> adds another 74,906 lines.
A #include <boost/spirit/include/qi.hpp> adds 154,024 lines, that's over 5 MB of code.
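You can reproduce this kind of measurement yourself; a minimal sketch (the file name is hypothetical, and the counts vary by toolchain and version):
// minimal.cpp -- preprocess only, then count the resulting lines:
//   g++ -E minimal.cpp | wc -l
#include <iostream>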
This adds up, especially if carelessly included in some file that's included in every file of your project.
Sometimes going over old code and pruning unnecessary includes improves the compilation dramatically just because of that. Replacing includes with forward declarations in the translation modules where only references or pointers to some class are used, improves this even further.
Why cannot the compiler work for me and figure out my dependencies using one mechanism (includes)?
It cannot because, unlike some other languages, C++ has an ambiguous grammar:
int f(X);
Is it a function declaration or a variable definition? To answer this question, the compiler must know what X means, so X must be declared before that line.
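A contrived sketch of the two readings; only one can be in effect at a time, so the second is shown in comments:
// With X declared as a type, the tokens declare a function:
struct X {};
int f(X);          // a function taking an X, returning int

// Had X been a variable instead, the very same tokens would define
// an int named f, direct-initialized from X:
//   int X = 0;
//   int f(X);      // equivalent to: int f = X;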
Because when you're doing something like this:
bar.h:
class Foo;   // a declaration is all Bar's interface needs

class Bar {
    int foo(Foo &);
};
Then the compiler does not need to know how the Foo struct/class is defined, so including the header that defines Foo is useless. Moreover, including the header that defines Foo might also require including the header that defines some other class that Foo uses, and that might mean including the header that defines yet another class, and so on. Turtles all the way down.
In the end, the file that the compiler works on is almost like the result of copy-pasting all the headers together; it gets big for no good reason, and when someone makes a typo in a header file that you don't actually need, compiling your class starts to take way too much time (or fails for no obvious reason).
So it's a good thing to give as little info as needed to the compiler.
How do forward declarations speed up compilations since at some point the object referenced will need to be compiled?
1) Reduced disk I/O (fewer files to open, fewer times).
2) Reduced memory/CPU usage.
Most translation units need only a name. If you use or allocate the object, you'll need its definition.
This is probably where it will click for you: each file you compile compiles whatever is visible in its translation unit.
A poorly maintained system will end up including a ton of stuff it does not need; this then gets compiled for every file that sees it. By using forward declarations where possible, you can bypass that and significantly reduce the number of times a public interface (and all of its included dependencies) must be compiled.
That is to say: the content of the header won't be compiled once; it will be compiled over and over. Everything in that translation unit must be parsed, checked to be a valid program, checked for warnings, optimized, etc., many, many times.
Including lazily adds significant disk/CPU/memory load, which turns into intolerable build times for you, while introducing significant dependencies (in non-trivial projects).
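A small before/after sketch of the point, with hypothetical names; only the implementation file that actually calls through the pointer needs the full header:
// logger.h -- BEFORE: every includer also processes network_connection.h
//   #include "network_connection.h"
// AFTER: a forward declaration is enough for a pointer member
class NetworkConnection;

class Logger {
public:
    void attach(NetworkConnection* conn);  // pointer parameter: no definition needed
private:
    NetworkConnection* conn_;
};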
I can buy the argument for reduced complexity, but what would a practical example of this be?
Unnecessary includes introduce dependencies as side effects. When you edit an include (necessary or not), every file which includes it must be recompiled (not trivial when hundreds of thousands of files must be unnecessarily opened and compiled).
Lakos wrote a good book which covers this in detail:
http://www.amazon.com/Large-Scale-Software-Design-John-Lakos/dp/0201633620/ref=sr_1_1?ie=UTF8&s=books&qid=1304529571&sr=8-1
Header file inclusion rules specified in this article will help reduce the effort in managing header files.
I used forward declarations simply to reduce the amount of navigation between source files. E.g., if module X calls some glue or interface function F in module Y, then using a forward declaration means writing the function and the call can be done by visiting only two places, X.c and Y.c. This is not so much of an issue when a good IDE helps you navigate, but I tend to prefer coding bottom-up, creating working code and then figuring out how to wrap it, rather than going through top-down interface specification. As the interfaces themselves evolve, it's handy not to have to write them out in full.
In C (or C++ minus classes) it's possible to keep structure details truly private by defining them only in the source files that use them, and exposing only forward declarations to the outside world: a level of black-boxing that requires performance-destroying virtuals in the C++/classes way of doing things. It's also possible to avoid needing to prototype things (visiting the header) by listing 'bottom-up' within the source files (good old static keyword).
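A minimal sketch of that C-style black-boxing, with hypothetical names; callers manipulate Engine only through pointers, so the layout stays private:
// engine.h -- the public interface exposes only an opaque handle
struct Engine;                       // details never leave engine.cpp
Engine* engine_create();
void    engine_step(Engine* e);
void    engine_destroy(Engine* e);

// engine.cpp -- the only translation unit defining the struct:
//   #include "engine.h"
//   struct Engine { int tick; /* private details */ };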
The pain of managing headers can sometimes expose how modular your program is or isn't: if it's truly modular, the number of headers you have to visit and the amount of code and data structures declared within them should be minimized.
Working on a big project with 'everything included everywhere' through precompiled headers won't encourage this real modularity.
Module dependencies can correlate with data flow related to performance issues, i.e. both i-cache and d-cache issues. If a program involves many modules that call each other and modify data in many random places, it's likely to have poor cache coherency; optimizing such a program will often involve breaking up passes and adding intermediate data, often playing havoc with many 'class diagrams'/'frameworks' (or at least requiring the creation of many intermediate data structures). Heavy template use often means complex, pointer-chasing, cache-destroying data structures. In the optimized state, dependencies and pointer chasing will be reduced.
I believe forward declarations speed up compilation because the header file is ONLY included where it is actually used. This avoids opening and closing the file unnecessarily. You are correct that at some point the object referenced will need to be compiled, but if I am only using a pointer to that object in my other .h file, why actually include it? If I tell the compiler I am using a pointer to a class, that's all it needs (as long as I am not calling any methods on that class).
This is not the end of it. Those .h files include other .h files... So, for a large project, opening, reading, and closing all the .h files which are included repetitively can become a significant overhead. Even with #ifndef checks, you still have to open and close them a lot.
We practice this at my place of employment. My boss explained it in a similar way, but I'm sure his explanation was clearer.
How do forward declarations speed up compilations since at some point the object referenced will need to be compiled?
Because #include is a preprocessor thing, which means it is done by brute force when the file is parsed. Your object will be compiled once (by the compiler) and then linked (by the linker) as appropriate later.
In C/C++, when you compile, you've got to remember there is a whole chain of tools involved (preprocessor, compiler, linker plus build management tools like make or Visual Studio, etc...)
Good and evil: the battle continues, but now on the battlefield of header files. Header files are a necessity and a feature of the language, but they can create a lot of unnecessary overhead if used in a non-optimal way, e.g. by not using forward declarations.
How do forward declarations speed up compilations since at some point the object referenced will need to be compiled?
I can buy the argument for reduced complexity, but what would a practical example of this be?
Forward declarations are badass. My experience is that a lot of C++ programmers are not aware of the fact that you don't have to include a header file unless you actually want to use some type, i.e. you need the type defined so the compiler understands what you want to do. It's important to try to refrain from including header files in other header files.
Just passing around a pointer from one function to another, only requires a forward declaration:
// someFile.h
class CSomeClass;
void SomeFunctionUsingSomeClass(CSomeClass* foo);
Including someFile.h does not require you to include the header file of CSomeClass, since you are merely passing a pointer to it, not using the class. This means that the compiler only needs to parse one line (class CSomeClass;) instead of an entire header file (which might be chained to other header files, and so on).
This reduces both compile time and link time, and we are talking big optimizations here if you have many headers and many classes.
I've been fighting with my compiler for too long.
Problems with circular includes, redefinitions, "missing ';' before *" and so on.
This seems like the place to get a good answer.
How do I include everything into everything else, and never have to worry about the subtleties of includes ever, ever again?
What combination of #define, #pragma, #include, or whatever else do I need to use to ensure that data types in the murky depths of my project hierarchy will have no difficulty knowing what everything else is?
This is not a troll post, in case such a concept is entirely unthinkable, nor is it posted in the middle of being angry.
I'm simply curious as to whether or not such a possibility exists. Dealing with spaghetti includes is probably the biggest headache I have to deal with in C++, and getting rid of it would improve my workflow significantly.
Cheers,
Brian
Forward declarations in the headers, and inclusions in the implementation files (.c, .cpp, etc.).
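A minimal sketch of that split, with hypothetical names:
// renderer.h -- forward-declare; do not include scene.h here
class Scene;

class Renderer {
public:
    void render(const Scene& scene);   // a reference needs no definition
};

// renderer.cpp -- include the full definition only where it is used:
//   #include "renderer.h"
//   #include "scene.h"
//   void Renderer::render(const Scene& scene) { /* uses Scene's members */ }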
Good question. I would like to know how to do this, too. Here are some of my tricks:
Figure out the file hierarchy (which file uses which file) and draw a rough concept graph.
Use the following code structure to prevent redefinition. Details can be found here.
#ifndef MY_CLASS
#define MY_CLASS

// ... the header's actual contents go here ...

#endif
It means if the file is already included, it will not be included again.
At the beginning of every header file, put
#ifndef MYHEADER_H
#define MYHEADER_H
and at the end,
#endif /* MYHEADER_H */
This avoids including a header file repeatedly. (Names like __MYHEADER_H__, with double underscores, are reserved for the implementation, so it's safer to leave them out of your own guards.)
If you use Visual Studio, you can just put
#pragma once
at the beginning of the header file.
By the way, you can use a static analysis tool, such as lint, to find these kinds of issues.
I have a macro definition in header file like this:
// header.h
#define ARRAY_SZ(a) ((int) (sizeof(a) / sizeof((a)[0])))
This is defined in some header file, which includes some more header files.
Now, I need to use this macro in some source file that has no other reason to include header.h or any of the other headers included by header.h. Should I redefine the macro in my source file, or simply include header.h?
Will the latter approach affect the code size/compile time (I think yes), or runtime (I think no)?
Your advice on this!
Include the header file or break it out into a smaller unit and include that in the original header and in your code.
As for code size, unless your headers do something incredibly ill-advised, like declaring variables or defining functions, they should not affect the memory footprint much, if at all. They will affect your compile time to a certain degree as well as polluting your name space.
Including the header in the source file might affect compile time slightly unless you are using pre-compiled headers. It shouldn't affect the code size though. Redefining the macro shouldn't have any effect on compile time or size. It is more of a maintenance and consistency issue though.
should I redefine the macro in my source file or simply include the header file header.h?
Neither. Instead you should clean up the code and break header.h so that one can use ARRAY_SZ() without also getting unrelated stuff.
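A sketch of that split (the file and guard names are my own invention); header.h would then include array_sz.h, and your source file can do the same without dragging in the rest of header.h:
// array_sz.h -- nothing but the macro, safe to include from anywhere
#ifndef ARRAY_SZ_H
#define ARRAY_SZ_H

#define ARRAY_SZ(a) ((int) (sizeof(a) / sizeof((a)[0])))

#endif // ARRAY_SZ_H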
You ask:
Will the latter approach affect the code size/compile time (I think yes)
In the case of this specific macro, the answer is "no" to the size, because the sizeof expression can be evaluated at compile time, and "yes" to the compile time, since there is more text for the preprocessor to read. Neither is likely to be remotely significant.
Unless you're running this on a really limited bit of hardware, or this is called billions and billions of times, you won't notice any difference between the two at either compile time or run time.
Go for whatever seems more readable / maintainable.
Personally, I'd suggest there are better ways of achieving what you're doing there without involving macros (namely inline functions and/or function templates). You have to be careful using your solution because there are a few gotchas you need to keep an eye on.
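For example, a function-template alternative (a sketch, assuming C++11 is available):
#include <cstddef>

// Deduces the array length at compile time; unlike the macro, passing
// a pointer fails to compile instead of silently computing nonsense.
template <typename T, std::size_t N>
constexpr std::size_t array_size(T (&)[N]) noexcept {
    return N;
}

// Usage:
//   int values[42];
//   static_assert(array_size(values) == 42, "known at compile time");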
Including that header, and all the other headers included by it, will increase the compile time. It can affect runtime if there are other definitions that change how your code compiles: if your code compiles differently because of those defines, it will of course run differently. The latter is not usual, but be careful.