I am reading the book Applied C++.
Include guards will prevent a header file from being included more
than once during the compilation of a source file. Your symbol names
should be unique, and we recommend choosing the name based on the name
of the file. For example, our file, cache.h, contains this include
guard.
#ifndef _cache_h_
#define _cache_h_
...
#endif // _cache_h_
Lakos describes using redundant include guards to speed up
compilation. See [Lakos96]. For large projects, it takes time to open
each file, only to find that the include guard symbol is already
defined (i.e., the file has already been included). The effects on
compilation time can be dramatic, and Lakos shows a possible 20x
increase in compilation times when only standard include guards are
used.
[Lakos96]: Large-Scale C++ Software Design.
I don't have the Lakos96 reference book to check the concept, so I'm asking for help here.
My questions on the above text are:
What does author mean by " For large projects, it takes times to open each file, only to find that the include guard symbol is already defined" ?
What does author mean by "when standard include guards are used" ?
Thanks for your time and help.
From C++ Coding Standards (Sutter, Alexandrescu)
Many modern C++ compilers recognize header guards automatically (see
Item 24) and don't even open the same header twice. Some also offer
precompiled headers, which help to ensure that often-used,
seldom-changed headers will not be parsed often
So, I would consider those suggestions outdated (unless you are still using some very dated compiler).
As for your questions:
It means that opening a file which is not actually needed (because it has already been included, which you can tell since its include guard symbol is already defined) is costly, and this can become an issue if it happens many times (which it easily does when you have hundreds of files in your project).
As opposed to using only the standard (internal) include guards, without the redundant external ones.
What is a redundant include guard?
A naive compiler will reload the file every time it's included. To avoid that, put redundant include guards around the include:
header.h
#ifndef HEADER_H_
#define HEADER_H_
// declarations
#endif
foo.c
#ifndef HEADER_H_
#include "header.h"
#endif
Your reference claims that by doing so you can be as much as 20x faster during compilation than you would be if foo.c were only doing
#include "header.h"
I don't know what Lakos96 says, but I'm going to guess anyway...
A standard include guard is like:
foo.h
#ifndef FOO_H_INCLUDED
#define FOO_H_INCLUDED
....
#endif
A redundant include guard is using the macro when including the file:
bar.c
#ifndef FOO_H_INCLUDED
#include "foo.h"
#endif
That way, the second time foo.h is included, the compiler will not even search for it on disk. Hence the speedup: imagine a large project where a single compilation unit includes foo.h 100 times; only the first inclusion is parsed. Without the redundant guards, the other 99 times the file would still be searched for, opened, tokenized, discarded by the preprocessor, and closed.
But note that that was in 1996. Today, GCC, to give a well known example, has specific optimizations that recognize the include guard pattern and make the redundant include guard, well..., redundant.
Lakos' book is old. It may have been true once, but you should time things on your machine. Many people now disagree with him, e.g.
http://www.allankelly.net/static/writing/overload/IncludeFiles/AnExchangeWithHerbSutter.pdf
or http://c2.com/cgi/wiki?RedundantIncludeGuards
or http://gamearchitect.net/Articles/ExperimentsWithIncludes.html
Herb Sutter, C++ guru and current chair of the ISO C++ standards
committee, argues against external include guards:
"Incidentally, I strongly disagree with Lakos' external include guards
on two grounds:
1. There's no benefit on most compilers. I admit that I haven't done measurements, as Lakos seems to have done back then, but as far as I know today's compilers already have smarts to avoid the build time reread overhead -- even MSVC does this optimization (although it requires you to say "#pragma once"), and it's the weakest compiler in many ways.
2. External include guards violate encapsulation because they require many/all callers to know about the internals of the header -- in particular, the special #define name used as a guard. They're also fragile -- what if you get the name wrong? what if the name changes?"
I think what it refers to is to replicate the include guard outside of the header file, e.g.
#ifndef _cache_h_
#include <cache.h>
#endif
However, if you do this, you'll have to consider that header guard names sometimes change within a file. And you certainly won't see a 20x improvement on a modern system - unless all your files are on a very remote network drive, perhaps - but then you'd get a much bigger improvement from copying the project files to your local drive!
There was a similar question a while back, regarding "including redundant files" (that is, including the same header files multiple times). I built a smallish system with 30 source files, each of which included <iostream> "unnecessarily", and the overall difference in compile time between including and not including <iostream> was 0.3%. I believe this finding shows the improvement in GCC that automatically recognises files that produce nothing outside of their include guards.
In a large project, there may be many headers - perhaps 100s or even 1000s of files. In the normal case, where include guards are inside each header, the compiler has to check (but see below) the contents of the file to see if it's already been included.
These guards, inside the header, are "standard".
Lakos recommends (for large projects) putting the guards around the #include directive, meaning the header won't even need to be opened if it's already been included.
As far as I know, however, all modern C++ compilers support the #pragma once directive, which coupled with pre-compiled headers means the problem is no longer an issue in most cases.
In larger projects with more people there may be, for example, one module dealing with time transformation whose author chose to use TIME as a guard. Then you have another module, dealing with precise timing, whose author, unaware of the first one, chose TIME too. Now you have a conflict (sketched below). If they had used TIME_TRANSFORMATION and PRECISE_TIMING_MODULE, they would be fine.
Don't know. I would guess it could mean "when you do it every time, consistently, it becomes your coding standard".
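Purely for illustration (the module names, guard name, and functions below are invented), here is how such a guard collision plays out: whichever header is included second in a translation unit is skipped entirely, producing confusing "identifier not declared" errors.
time_transformation.h
#ifndef TIME                 // too generic a guard name
#define TIME
void transform_time();
#endif
precise_timing.h
#ifndef TIME                 // TIME was already defined by the first header...
#define TIME
void start_precise_timer();  // ...so this declaration silently disappears
#endif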
Related
So I recently had a discussion where I work, in which I was questioning the use of a double include guard over a single guard. What I mean by double guard is as follows:
Header file, "header_a.hpp":
#ifndef __HEADER_A_HPP__
#define __HEADER_A_HPP__
...
...
#endif
When including the header file anywhere, either in a header or source file:
#ifndef __HEADER_A_HPP__
#include "header_a.hpp"
#endif
Now I understand that the use of the guard in header files is to prevent multiple inclusion of an already defined header file; it's common and well documented. If the macro is already defined, the entire header file is seen as 'blank' by the compiler and the double inclusion is prevented. Simple enough.
The issue I don't understand is using #ifndef __HEADER_A_HPP__ and #endif around the #include "header_a.hpp". I'm told by the coworker that this adds a second layer of protection to inclusions but I fail to see how that second layer is even useful if the first layer absolutely does the job (or does it?).
The only benefit I can come up with is that it outright stops the linker from bothering to find the file. Is this meant to improve compilation time (which was not mentioned as a benefit), or is there something else at work here that I am not seeing?
I am pretty sure that it is a bad practice to add another include guard like:
#ifndef __HEADER_A_HPP__
#include "header_a.hpp"
#endif
Here are some reasons why:
To avoid double inclusion it is enough to add the usual include guard inside the header file itself. It does the job well. Another include guard at the place of inclusion just clutters the code and reduces readability.
It adds unnecessary dependencies. If you change the include guard inside the header file, you have to change it in every place where the header is included.
Opening an already-included header is definitely not the most expensive operation compared to the whole compilation/linkage process, so skipping it can hardly reduce the total build time.
Any compiler worth anything already optimizes file-wide include-guards.
The reason for putting include guards in the header file is to prevent the contents of the header from being pulled into a translation unit more than once. That's normal, long-established practice.
The reason for putting redundant include guards in a source file is to avoid having to open the header file that's being included, and back in the olden days that could significantly speed up compilation. These days, opening a file is much faster than it used to be; further, compilers are pretty smart about remembering which files they've already seen, and they understand the include guard idiom, so can figure out on their own that they don't need to open the file again. That's a bit of hand-waving, but the bottom line is that this extra layer isn't needed any more.
EDIT: another factor here is that compiling C++ is far more complicated than compiling C, so it takes far longer, making the time spent opening include files a smaller, less significant part of the time it takes to compile a translation unit.
The only benefit I can come up with is that it outright stops the linker from bothering to find the file.
The linker will not be affected in any way.
It could prevent the pre-processor from bothering to find the file, but if the guard is defined, that means that it has already found the file. I suspect that if the pre-process time is reduced at all, the effect would be quite minimal except in the most pathologically recursively included monstrosity.
It has a downside that if the guard is ever changed (for example due to conflict with another guard), all the conditionals before the include directives must be changed in order for them to work. And if something else uses the previous guard, then the conditionals must be changed for the include directive itself to work correctly.
P.S. __HEADER_A_HPP__ is a symbol that is reserved to the implementation, so it is not something that you may define. Use another name for the guard.
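For example, a guard spelled without the reserved patterns (no leading underscore followed by a capital, no double underscore) might look like this; the exact name is only a suggestion:
#ifndef HEADER_A_HPP
#define HEADER_A_HPP
// ...
#endif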
Older compilers on more traditional (mainframe) platforms (we're talking mid-2000s here) did not have the optimisation described in other answers, so it really did significantly slow down preprocessing to have to re-read header files that had already been included (bearing in mind that in a big, monolithic, enterprise-y project you're going to be including a LOT of header files). As an example, I've seen data indicating a 26-fold speedup for a file with 256 header files, each including the same 256 header files, on the VisualAge C++ 6 for AIX compiler (which dates from the mid-2000s). This is a rather extreme example, but this sort of speed-up does add up.
However, all modern compilers, even on mainframe platforms such as AIX and Solaris, perform enough optimisation for header inclusion that the difference these days really is negligible. Therefore there is no good reason to have these any more.
This does, however, explain why some companies still hang on to the practice, because relatively recently (at least in C/C++ codebase age terms) it was still worthwhile for very large monolithic projects.
Although there are people arguing against it, in practice '#pragma once' works perfectly and the main compilers (gcc/g++, vc++) support it.
So whatever purist arguments people are spreading, it works a lot better:
Fast
No maintenance, no trouble with mysterious non-inclusion because you copied an old guard macro
Single line with obvious meaning versus cryptic lines spread in file
So simply put:
#pragma once
at the start of the file, and that's it. Optimized, maintainable, and ready to go.
I recently started working on a project where I came across this:
#include <string.h> // includes before include guards
#include "whatever.h"
#ifndef CLASSNAME_H // header guards
#define CLASSNAME_H
// The code
#endif
My question: Considering all (included) header files were written in that same style: Could this lead to problems (cyclic reference, etc.). And: Is there any (good) reason to do this?
Potentially, having #include outside the include guards could lead to circular references, etc. If the other files are properly protected, there isn't an issue. If the other files are written like this one, there could be problems.
No, there isn't a good reason that I know of to write the code with the #include lines outside the include guards.
The include guards should be around the whole contents of the header; I can't think of an exception to this (when header guards are appropriate in the first place — the C header <assert.h> is one which does not have header guards for a good reason).
As long as you don't have circular includes (whatever1.h includes whatever2.h which includes whatever1.h) this should not be a problem, as the code itself is still protected against multiple inclusion.
It will however almost certainly impact compile time (how much depends on the project size) for two reasons:
Modern compilers usually detect "classical" include guards and just ignore any further #includes of that file (just like #pragma once). The structure you are showing prevents that optimization.
Each compilation unit becomes much larger as each file will be included much more often - right before the preprocessor then deletes all the inactive blocks again.
In any case, I can't think of any benefit such a structure would have. Maybe it is the result of some strange historical reasons, like some obscure analysis tool that was used on your codebase in pre-standard C++ time.
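For comparison, a minimal sketch of the conventional layout, reusing the header names from the question, with the #includes inside the guard so a repeated inclusion costs essentially nothing:
classname.h
#ifndef CLASSNAME_H
#define CLASSNAME_H

#include <string.h>
#include "whatever.h"

// The code

#endif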
I used to use the following code to make sure that the include file is not loaded more than once.
#ifndef _STRING_
#include <string>
#endif
// use std::string here
std::string str;
...
This trick is illustrated in the book "API Design for C++".
Now my co-worker told me that this is not necessary in Visual Studio: because the implementation's header file for <string> contains #pragma once, the external include guard is not required to improve compilation speed.
Is that correct?
Quote from original book:
7.2.3 Redundant #include Guards
Another way to reduce the overhead of parsing too many include files is to add redundant preprocessor
guards at the point of inclusion. For example, if you have an include file, bigfile.h, that looks
like this
#ifndef BIGFILE_H
#define BIGFILE_H
// lots and lots of code
#endif
then you might include this file from another header by doing the following:
#ifndef BIGFILE_H
#include "bigfile.h"
#endif
This saves the cost of pointlessly opening and parsing the entire include file if you’ve already
included it.
Usually the term 'include guard' means that this #ifndef/#define/#endif sequence is put around the contents of a particular header file, inside that file.
A number of C++ compilers provide the #pragma once directive, which guarantees the same behavior without an explicit guard macro. But I would discourage using it for the sake of portable C/C++ code.
UPDATE (according to the OP's edit)
Additionally, putting the #ifndef/#endif around the #include statement in another file might prevent the preprocessor from opening the include file itself (and thus reduce compile time and memory usage slightly). I'd expect #pragma once to do this automatically, but I can't tell for sure (this might be implementation specific).
You don't ever need to do that because any header file written by a competent developer will have its own guard. You can assume the standard library headers were written by competent engineers, and if you ever find yourself using a third party header without include guards... well, that third party is now highly suspect...
As for writing your own headers, you can use the standard:
#ifndef MY_HEADER_H
#define MY_HEADER_H
// ...code
#endif
Or just use:
#pragma once
Note that this is not standard C or C++, it is a compiler extension. It won't work on every compiler out there, but using it is your decision and depends on your expected use.
Redundant include guards are, by definition "redundant". They do not affect the binaries created through compilation. However, they do have a benefit. Redundant include guards can reduce compile times.
Who cares about compile times? I care. I am just one developer in a project of hundreds of developers with millions of lines of source code in thousands of source files. A complete rebuild of the project takes me 45 minutes. Incremental builds from revision control pulls take me 20+ minutes. As my work depends on this big project, I cannot perform any testing while waiting on this prolonged build. If that build time were cut to under 5 minutes, our company would benefit greatly. Suppose the build time saving were 20 minutes: 100 developers * 1 build/day * 1/3 hour/build * 250 days/year * $50/hr = $416,667 in savings per year. Someone should care about that.
For Ed S, I have been using redundant include guards for 10 years. Occasionally you will find someone who uses the technique, but most shy away from it because it can make for ugly-looking code. "#pragma once" surely looks a lot cleaner. Percentage-wise, very few developers continually try to improve their talent by continuing their education and techniques. The redundant #include guards technique is a bit obscure, and its benefits are only realized when someone bothers to do an analysis on large-scale projects. How many developers do you know who go out of their way to buy C++ books on advanced techniques?
Back to the original question about redundant include guards vs #pragma once in Visual Studio... According to the Wikipedia page on #pragma once, compilers which support "#pragma once" can potentially be more efficient than #include guards, because they can analyze file names and paths to avoid reloading files which were already loaded. Three compilers were mentioned by name as having this optimization. Conspicuously absent from this list is Visual Studio. So we are still left wondering whether, in Visual Studio, redundant #include guards should be used, or #pragma once.
For small to medium sized projects, #pragma once is certainly convenient. For large projects where compile time becomes a factor during development, redundant #include guards give a developer greater control over the compilation process. Anyone who is managing or architecting large-scale projects should have Large-Scale C++ Software Design in their library -- it discusses and recommends redundant #include guards.
Possibly of greater benefit than redundant include guards is smart usage of #includes. With C++ templates and the STL becoming more popular, method implementations are migrating from .cpp files to .h files. Any header dependencies the .cpp implementation would have had now necessarily migrate to the .h file as well. This increases compilation time. I have often seen developers stack lots of unnecessary #includes into their header files so they won't have to bother identifying the headers they actually need. This also increases compile time.
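As a hedged sketch of that last point (the class and file names here are invented, not from the answer): when a header only uses a type through pointers or references, a forward declaration can replace the #include, so the heavy header is pulled in by just one .cpp instead of by everything that includes widget.h.
widget.h
#ifndef WIDGET_H
#define WIDGET_H

class Renderer;               // forward declaration: no #include "renderer.h" needed here

class Widget {
public:
    explicit Widget(Renderer& r);
    void draw() const;
private:
    Renderer* renderer_;      // pointer/reference members only need the declaration
};

#endif

widget.cpp
#include "widget.h"
#include "renderer.h"         // the full definition is needed only in this one .cpp

Widget::Widget(Renderer& r) : renderer_(&r) {}

void Widget::draw() const {
    // use renderer_ here, where Renderer is a complete type
}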
#pragma once is a nicer form of include guard. If you use it, you don't need the include guard based on #define.
In general, this is a better approach, since it prevents name clashes from being able to break an include guard.
That being said, the include guard should be in the header file, not wrapping the include. Wrapping the include should be completely unnecessary (and will likely confuse other people down the road).
Edit:
I think we are talking about two different things. My question is whether we should use an include guard when we include a pre-existing header file that already has either #pragma once or #ifndef xxx
In that case, no. If the header has a proper guard, there is no reason to try to avoid including it. This just adds confusion and complexity.
That's not how include guards are used. You don't wrap your #includes in an include guard. A header file should wrap its own contents in an include guard. Whenever you write a file that will likely be included in others, you should do:
#ifndef SOME_GUARD_H
#define SOME_GUARD_H
// Content here
#endif
With Visual Studio's implementation of the C++ library, that might be done by the string header having #pragma once or by checking #ifndef _STRING_.
Does including the same header files multiple times increase the compilation time?
For example, suppose every file in my project uses <iostream> <string> <vector> and <algorithm>. And if I include a lot of files in my source code, then does that increase the compile time?
I always thought that header guards served the important purpose of avoiding double definitions, and as a by-product also eliminate duplicated code.
Actually, someone I know proposed some ideas to remove such multiple inclusions. However, I consider them to be completely against good design practice in C++. But I was still wondering what his reasons for suggesting the changes might be?
Most of these answers are wrong... For modern compilers, there is zero overhead for including the same file multiple times, assuming the header uses the usual "include guard" idiom.
The GCC preprocessor, for example, has special code to recognize the include guard idiom. It will not even open the header file (never mind reading it) for the second and subsequent #include directives.
I am not sure about other compilers, but I would be very surprised if most of them did not implement the same optimization.
Another technique besides precompiled headers is the compiler firewall idiom, explained here:
http://www.gotw.ca/publications/mill04.htm
http://www.gotw.ca/publications/mill05.htm
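A minimal sketch of that idiom, using made-up names rather than anything from the articles: the header exposes only a pointer to an incomplete implementation type, so the implementation details (and their #includes) stay out of the header and clients don't recompile when they change.
parser.h
#ifndef PARSER_H
#define PARSER_H
#include <memory>

class Parser {
public:
    Parser();
    ~Parser();                    // defined in the .cpp, where Impl is complete
    void parse(const char* text);
private:
    struct Impl;                  // implementation details hidden from clients
    std::unique_ptr<Impl> impl_;
};

#endif

parser.cpp
#include "parser.h"
#include <string>                 // heavy dependencies stay out of the header

struct Parser::Impl {
    std::string buffer;
};

Parser::Parser() : impl_(new Impl) {}
Parser::~Parser() = default;
void Parser::parse(const char* text) { impl_->buffer = text; }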
Every time #include <something.h> occurs in your source file, something.h has to be found along the include path and read. But because of the #ifndef _SOMETHING_H_ check inside it, the contents of something.h will not be compiled again.
Thus there is some overhead, but it is really small.
If compile times were an issue, people used to use the optimisation recommended by Praetorian, originally recommended in Large-Scale C++ Software Design. However, most modern compilers automatically optimise for this case; see the GCC documentation, for example.
The best option is to use precompiled headers. I do not know which compiler you are using, but most of them have this feature. I suggest you refer to your compiler's manual on how to set it up.
It basically parses the collected header files once and stores the result in a precompiled form that the compiler can reuse in subsequent compilations. That speeds up compiling very much.
Minor drawback:
You need to have one "uberheader" which is included in every compilation unit (.cpp).
In that uberheader, only include stable headers from libraries, not your own. Then the compiler does not need to regenerate it very often.
It helps especially when using header-only libraries such as Boost, glm, Eigen, etc.
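A minimal sketch of such an uberheader (the file name and contents are just an example; how you ask the compiler to precompile it is compiler-specific, e.g. /Yc and /Yu with MSVC, or compiling the header to a .gch with GCC):
precompiled.h
#ifndef PRECOMPILED_H
#define PRECOMPILED_H

// Only stable, rarely-changing library headers belong here.
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>
// #include <boost/shared_ptr.hpp>   // heavy third-party headers benefit the most

#endif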
HTH
Yes, including the same header multiple times means that the file needs to be opened before the preprocessor guards kick in and prevent multiple definitions. The Mozilla source code uses the following trick to prevent this:
Foo.h
#ifndef FOO_H
#define FOO_H
// whatever
#endif /* FOO_H */
In all files that need to include foo.h
#ifndef FOO_H
#include "foo.h"
#endif
This prevents foo.h from having to be opened multiple times. Of course, this depends on everyone following a particular naming convention for their preprocessor guards.
You can't do this with standard headers, since there is no common naming convention for their preprocessor guards.
EDIT:
After reading your question again, I think you're asking about the same header being included in different source files. What I talked about above does not help with that. Each header file will still have to be opened and included at least once in every translation unit. The only way I know of to prevent this is to use precompiled headers, as #scorcher24 mentioned in his answer. But I'd stay away from this solution, because there is no standard way of generating precompiled headers across compilers, unless the compile times are absolutely prohibitive.
Some compilers, most notably Microsoft's, have a #pragma once directive that you can use to automatically skip an include file once it's already been included. This removes any performance penalty.
http://en.wikipedia.org/wiki/Pragma_once
It can be an issue. As others have said, most modern compilers
handle the case intelligently, and will only re-open the file in
degenerate cases. Most is not all, however, and one of the major
exceptions is Microsoft, which a lot of people do have to support. The
surest solution (if this is really a problem in your environment) is to
use the Lakos convention, putting the include guards around the
#include as well as in the header. This means, of course, a standard
convention for generating the guard names. (For external includes, wrap
them in your own header, which respects your local convention.)
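For instance (a sketch with made-up file and guard names following such a local convention), an external header like <windows.h> gets a thin wrapper whose guard name everyone on the project knows:
my_windows.h
#ifndef MYPROJECT_MY_WINDOWS_H
#define MYPROJECT_MY_WINDOWS_H
#include <windows.h>
#endif

some_file.cpp
#ifndef MYPROJECT_MY_WINDOWS_H
#include "my_windows.h"
#endif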
Alternatively, you can use both the guards and #pragma once. The
guards will always work, and most compilers will avoid the extra opens,
and #pragma once will usually avoid the extra opens with Microsoft.
(#pragma once cannot be implemented reliably in complex networked
situations, but as long as all of your files are on your local drive,
it's quite reliable.)
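A short sketch of that belt-and-braces approach (the guard name is invented for the example):
myheader.h
#pragma once                      // non-standard, but compilers that honor it skip reopening the file
#ifndef MYPROJECT_MYHEADER_H
#define MYPROJECT_MYHEADER_H
// declarations...
#endif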
I have inherited a C/C++ code base, and in a number of .cpp files the #include directives are wrapped in #ifndefs that test the header's own internal include-guard #define.
for example
#ifndef _INC_WINDOWS
#include <windows.h>
#endif
and windows.h looks like
#ifndef _INC_WINDOWS
#define _INC_WINDOWS
...header file stuff....
#endif // _INC_WINDOWS
I assume this was done to speed up the compile/preprocess of the code.
I think it's ugly and a premature optimisation, but as the project has a 5 minute build time from clean, I don't want to make things worse.
So does the practice add any value or speed things up lots? Is it OK to clean them up?
Update: compiler is MSVC (VS2005) and platform is Win32/WinCE
It's worth knowing that some implementations have #pragma once and/or a header-include-guard detection optimisation, and that in both cases the preprocessor will automatically skip opening, reading, or processing a header file which it has included before.
So on those compilers, including MSVC and GCC, this "optimisation" is pointless, and it should be the header file's responsibility to handle multiple inclusion. However, it's possible that this is an optimisation for compilers where #include is very inefficient. Is the code pathologically portable, and does <windows.h> refer not to the well-known Win32 header file, but to some user-defined header file of the same name?
It's also possible that the header files don't have multiple-include guards, and that this check is actually essential. In which case I'd suggest changing the headers. The whole point of headers is as a substitute for copy-and-pasting code about the place: it shouldn't take three lines to include a header.
Edit:
Since you say you only care about MSVC, I would either:
Do a mass edit and time the build, just to make sure the previous programmer doesn't know something I don't. Maybe add #pragma once if it helps. Use precompiled headers if all this really is slowing things down.
Or ignore it, but don't use the guards for new files or for new #includes added to old files.
Depending on whether I had more important things to worry about. This is a classic Friday-afternoon job, I wouldn't spend potentially-productive time on it ;-)
If a file is included, then that whole file has to be read, and even the overhead of opening/closing the file might be significant. By putting the guarding directives around the include statement, the file never has to be opened again once it has been included. As always with these questions, the correct answer is: try taking out the ifndef/endif guards around the include directives and get your stopwatch...