I recently started working on a project where I came across this:
#include <string.h> // includes before include guards
#include "whatever.h"
#ifndef CLASSNAME_H // header guards
#define CLASSNAME_H
// The code
#endif
My question: Considering all (included) header files were written in that same style, could this lead to problems (cyclic references, etc.)? And is there any (good) reason to do this?
Potentially, having #include outside the include guards could lead to circular references, etc. If the other files are properly protected, there isn't an issue. If the other files are written like this one, there could be problems.
No, there isn't a good reason that I know of to write the code with the #include lines outside the include guards.
The include guards should be around the whole contents of the header; I can't think of an exception to this (when header guards are appropriate in the first place — the C header <assert.h> is one which does not have header guards for a good reason).
As long as you don't have circular includes (whatever1.h includes whatever2.h which includes whatever1.h) this should not be a problem, as the code itself is still protected against multiple inclusion.
It will however almost certainly impact compile time (how much depends on the project size) for two reasons:
Modern compilers usually detect "classical" include guards and just ignore any further #includes of that file (just like #pragma once). The structure you are showing prevents that optimization.
Each compilation unit becomes much larger as each file will be included much more often - right before the preprocessor then deletes all the inactive blocks again.
In any case, I can't think of any benefit such a structure would have. Maybe it is the result of some strange historical reasons, like some obscure analysis tool that was used on your codebase in pre-standard C++ time.
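For comparison, here is a minimal sketch of the conventional layout, with the #include lines inside the guard. The header text is pasted twice into one file to mimic a double inclusion; the class body and the check_classname helper are hypothetical and not taken from the project in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// ---- classname.h, first inclusion (pasted inline for this sketch) ----
#ifndef CLASSNAME_H
#define CLASSNAME_H
#include <cstring>   // includes sit inside the guard
#include <string>

class ClassName {    // hypothetical class body
public:
    explicit ClassName(std::string name) : name_(std::move(name)) {}
    std::size_t length() const { return std::strlen(name_.c_str()); }
private:
    std::string name_;
};
#endif // CLASSNAME_H

// ---- classname.h, second inclusion ----
#ifndef CLASSNAME_H  // already defined: everything up to #endif is skipped
#define CLASSNAME_H
#include <cstring>
#include <string>
class ClassName {};  // would be a redefinition error if it were not skipped
#endif // CLASSNAME_H

// Hypothetical helper that uses the class from the "header".
inline std::size_t name_length(const char* s) { return ClassName(s).length(); }

// Small self-check; call it from main() to try the sketch.
inline void check_classname() { assert(name_length("guard") == 5); }
```

With this layout the second pass skips the #include lines together with the class, which is what lets a compiler treat the whole file as guarded.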
The question is about including an unnecessary header to avoid calling it multiple times in sub-files.
Here's the scenario, I have a few files:
srlogger.h
srinterface.h
srinterface.cc
#include <srinterface.h>
#include "srlogger.h"
srbinhttp.h
#include "srinterface.h"
srbinhttp.cc
#include <srbinhttp.h>
#include "srlogger.h"
srsocket.h
#include "srinterface.h"
srsocket.cc
#include <srsocket.h>
#include "srlogger.h"
srhttp.h
#include "srinterface.h"
srhttp.cc
#include <srhttp.h>
#include "srlogger.h"
Now, what I want to do is remove the #include "srlogger.h" from all the .cc files shown, and instead include it in the srinterface.h file:
srinterface.h
#include "srlogger.h"
Since the header of each .cc file includes srinterface.h, srlogger.h would be covered.
Now, why would this be bad?
Please do not say that you should only include the necessary headers to compile and nothing extra, this is not enough explanation.
I want to know in real examples why this would be bad.
Oh if someone removes the #include "srlogger.h" from the srinterface.h it would break the other file, this is a weak explanation. A comment after the include could easily warn other people.
What interests me the most is if it will affect the compilation in a bad way, will the size of the objects or executable files change because of that, does it affect performance.
Or you have a really good explanation why this is bad.
PS.: If you are curious why I would want to do that: I was mapping the dependencies between the files, and by doing this I can create a graphical visualization of all the dependencies, making it easier to understand how the pieces of the puzzle fit together. Moving sub-headers into a common header higher in the hierarchy creates a more organized structure among all the files.
The potential negative effect is one of compile time. If someone includes your header who doesn't need the header it drags in, the compile time of that compilation unit will increase for no good reason.
For toy projects or small projects (of a few hundred files) that compile in a few seconds, this makes no real difference.
But when you work on projects that are millions of lines of code spread across hundreds of thousands of files, that already take a significant fraction of an hour to compile, and you add an include to a header that's included by 12000 other files because you could not be bothered to explicitly add it to the 120 files that actually needed it (but just happened to include the common header), then you are not going to be popular, since you just increased everyone's average build time by several minutes.
There is also the risk (in bad code bases) that the header you (unnecessarily) drag into other files may redefine things and break a source file that didn't even need that header in the first place.
For the above reasons, I believe that headers should only include what they really need themselves and cannot forward declare. Implementation files should only include headers they really need (and include their own headers first to make sure they are self contained).
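As a rough illustration of "cannot forward declare", the sketch below keeps the three files in one block. It assumes a hypothetical SrLogger class and SrInterface members, since the question does not show what the real headers contain. The interface "header" only needs a forward declaration because it stores a pointer; the full logger definition is needed only in the implementation part.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// ---- srinterface.h (sketch): forward declaration, no #include "srlogger.h" ----
class SrLogger;  // hypothetical class name

class SrInterface {
public:
    explicit SrInterface(SrLogger& logger) : logger_(&logger) {}
    void send(const std::string& msg);  // defined in the .cc part below
private:
    SrLogger* logger_;  // a pointer to an incomplete type is fine here
};

// ---- srlogger.h (sketch) ----
class SrLogger {
public:
    void log(const std::string& line) { lines_.push_back(line); }
    std::size_t count() const { return lines_.size(); }
private:
    std::vector<std::string> lines_;
};

// ---- srinterface.cc (sketch): the full SrLogger definition is visible here ----
void SrInterface::send(const std::string& msg) { logger_->log("send: " + msg); }

// Small self-check; call it from main() to try the sketch.
inline void check_forward_declaration() {
    SrLogger logger;
    SrInterface iface(logger);
    iface.send("hello");
    assert(logger.count() == 1);
}
```

Files that include only the interface header never see the logger's definition, so a change to the logger header does not force them to recompile.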
Hope that answers your question.
"The question is about including an unnecessary header to avoid calling it multiple times in sub-files."
Include guards solve the part of this problem that can be solved: the same header being pulled into one file several times. They cut down the unnecessary includes to a certain extent. See the link below:
C++ #include guards
An include guard is made by adding the following to your header file:
//at the very top of the header
#ifndef NAMEOFHEADER_H
#define NAMEOFHEADER_H
// header info
//at the very last line of the header
#endif
This will keep you from accumulating the same header file multiple times in another .h or .cpp file.
And as was stated in the comment below, even if every header has include guards, the preprocessor can still pull declarations that your file never needs into the translation unit. This is bound to happen with chains of includes across multiple files.
I am reading a book on Applied C++.
Include guards will prevent a header file from being included more
than once during the compilation of source file. Your symbol names
should be unique, and we recommend choosing the name based on the name
of the file. For example, our file, cache.h contains this include
guard.
#ifndef _cache_h_
#define _cache_h_
...
#endif // _cache_h_
Lakos describes using redundant include guards to speed up
compilation. See [Lakos96]. For large projects, it takes times to open
each file, only to find that the include guard symbol is already
defined (i.e., the file has already been included). The effects on
compilation time can be dramatic, and Lakos shows a possible 20x
increase in compilation times when only standard include guards are
used.
[Lakos96]: LargeScale C++ software design.
I don't have the Lakos96 book to look the concept up, so I am asking for help here.
My questions on the above text are:
What does author mean by " For large projects, it takes times to open each file, only to find that the include guard symbol is already defined" ?
What does author mean by "when standard include guards are used" ?
Thanks for your time and help.
From C++ Coding Standards (Sutter, Alexandrescu)
Many modern C++ compilers recognize header guards automatically (see
Item 24) and don't even open the same header twice. Some also offer
precompiled headers, which help to ensure that often-used,
seldom-changed headers will not be parsed often
So, I would consider those suggestions outdated (unless you are still using some very dated compiler).
As for your questions:
It means that opening a file which is not needed (it has already been included, which you know because the include guard is already defined) is costly, and this might be an issue if you do it many times (which can happen if you have hundreds of files in your project).
It means standard (non-redundant) include guards, as opposed to redundant include guards.
What is a redundant include guard?
A naive compiler will reload the file every time it's included. To
avoid that, put RedundantIncludeGuards around the include: header.h
#ifndef HEADER_H_
#define HEADER_H_
// declarations
#endif
foo.c
#ifndef HEADER_H_
#include "header.h"
#endif
Read more here. Your reference claims that by doing so compilation can be as much as 20x faster than it would be if foo.c were only doing
#include "header.h"
I don't know what Lakos96 says, but I'm going to guess anyway...
A standard include guard is like:
foo.h
#ifndef FOO_H_INCLUDED
#define FOO_H_INCLUDED
....
#endif
A redundant include guard is using the macro when including the file:
bar.c
#ifndef FOO_H_INCLUDED
#include "foo.h"
#endif
That way, the second time foo.h is included, the compiler will not even search for it on disk. Hence the speedup: imagine a large project where a single compilation unit includes foo.h 100 times, but only the first inclusion is parsed. Without the redundant guard, the other 99 times the file is searched for, opened, tokenized, discarded by the preprocessor and closed.
But note that that was in 1996. Today GCC, to give a well-known example, has specific optimizations that recognize the include guard pattern and make the redundant include guard, well..., redundant.
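A small sketch of why the redundant guard avoids the file lookup altogether. It assumes foo.h has already been included (its body is pasted inline here), and then wraps an #include of a deliberately nonexistent file, foo_does_not_exist.h, in the same guard. The function foo and both file names are made up for illustration.

```cpp
#include <cassert>

// ---- foo.h, already included once (pasted inline for this sketch) ----
#ifndef FOO_H_INCLUDED
#define FOO_H_INCLUDED
inline int foo() { return 42; }  // hypothetical header content
#endif

// ---- bar.c: redundant guard around a later #include ----
// FOO_H_INCLUDED is defined, so the preprocessor skips this group without
// acting on the #include directive. The named file does not exist, yet the
// translation unit still compiles, because the file is never searched for.
#ifndef FOO_H_INCLUDED
#include "foo_does_not_exist.h"
#endif

// Small self-check; call it from main() to try the sketch.
inline void check_redundant_guard() { assert(foo() == 42); }
```

Removing the #ifndef/#endif pair around the #include turns this into a "file not found" error, which shows that without the redundant guard the preprocessor has to locate the file, unless the compiler has already remembered that it is fully guarded.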
Lakos' book is old. It may have been true once, but you should time things on your machine. Many people now disagree with him, e.g.
http://www.allankelly.net/static/writing/overload/IncludeFiles/AnExchangeWithHerbSutter.pdf
or http://c2.com/cgi/wiki?RedundantIncludeGuards
or http://gamearchitect.net/Articles/ExperimentsWithIncludes.html
Herb Sutter, C++ guru and current chair of the ISO C++ standards
committee, argues against external include guards:
"Incidentally, I strongly disagree with Lakos' external include guards
on two grounds:
There's no benefit on most compilers. I admit that I haven't done measurements, as Lakos seems to have done back then, but as far as I
know today's compilers already have smarts to avoid the build time
reread overhead--even MSVC does this optimization (although it
requires you to say "#pragma once"), and it's the weakest compiler in
many ways.
External include guards violate encapsulation because they require many/all callers to know about the internals of the header -- in
particular, the special #define name used as a guard. They're also
fragile--what if you get the name wrong? what if the name changes?"
I think what it refers to is to replicate the include guard outside of the header file, e.g.
#ifndef _cache_h_
#include <cache.h>
#endif
However, if you do this, you'll have to consider that a header's guard name sometimes changes. And you certainly won't see a 20x improvement on a modern system, unless perhaps all your files are on a very remote network drive, but then you'll get a much bigger improvement from copying the project files to your local drive!
There was a similar question a while back, regarding "including redundant files" (referring to including header files multiple times), and I built a smallish system with 30 source files, which included <iostream> "unnecessarily", and the overall difference in compile time was 0.3% between including and not including <iostream>. I believe this finding shows the improvement in GCC that "automatically recognises files that produce nothing outside of include guards".
In a large project, there may be many headers - perhaps 100s or even 1000s of files. In the normal case, where include guards are inside each header, the compiler has to check (but see below) the contents of the file to see if it's already been included.
These guards, inside the header, are "standard".
Lakos recommends (for large projects) putting the guards around the #include directive, meaning the header won't even need to be opened if it's already been included.
As far as I know, however, all modern C++ compilers support the #pragma once directive, which coupled with pre-compiled headers means the problem is no longer an issue in most cases.
In larger projects with more people, there may be, for example, one module dealing with time transformation, whose author could choose to use TIME as a guard. Then you'll have another one, dealing with precise timing, whose author, unaware of the first one, may choose TIME too. Now you have a conflict. If they used TIME_TRANSFORMATION and PRECISE_TIMING_MODULE, they'd be OK.
Don't know. I would guess it could mean "when you do it every time, consistently, it becomes your coding standard".
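The TIME collision described above can be sketched in a single file by pasting two hypothetical headers one after the other. The function names and the HAVE_PRECISE_TIMING marker macro are invented for the illustration.

```cpp
#include <cassert>

// ---- time_transformation.h (sketch) ----
#ifndef TIME
#define TIME
inline int to_seconds(int minutes) { return minutes * 60; }
#endif

// ---- precise_timing.h (sketch): accidentally uses the same guard name ----
#ifndef TIME  // already defined by the other header, so this body is dropped
#define TIME
#define HAVE_PRECISE_TIMING 1
inline long ticks_per_ms() { return 1000; }
#endif

// Record whether the second header's contents survived preprocessing.
constexpr bool precise_timing_visible =
#ifdef HAVE_PRECISE_TIMING
    true;
#else
    false;
#endif

// Small self-check; call it from main() to try the sketch.
inline void check_guard_collision() {
    assert(to_seconds(2) == 120);
    assert(!precise_timing_visible);  // the second header was silently skipped
}
```

The failure is silent at the point of inclusion. It only shows up later, as an "undeclared identifier" error wherever ticks_per_ms is used, which is why guard names derived from the full file or module name are safer.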
I used to use the following code to make sure that the include file is not loaded more than once.
#ifndef _STRING_
#include <string>
#endif
// use std::string here
std::string str;
...
This trick is illustrated in the book "API Design for C++".
Now my co-worker told me that this is not necessary in Visual Studio, because if the implementation's string header file contains #pragma once, the include guard is not required to improve the compilation speed.
Is that correct?
Quote from original book:
7.2.3 Redundant #include Guards
Another way to reduce the overhead of parsing too many include files is to add redundant preprocessor
guards at the point of inclusion. For example, if you have an include file, bigfile.h, that looks
like this
#ifndef BIGFILE_H
#define BIGFILE_H
// lots and lots of code
#endif
then you might include this file from another header by doing the following:
#ifndef BIGFILE_H
#include "bigfile.h"
#endif
This saves the cost of pointlessly opening and parsing the entire include file if you’ve already
included it.
Usually the term 'include guard' means that this #ifndef, #define, #endif sequence is put around the contents of a particular header file, inside this file.
A number of C++ compilers provide the #pragma once statement that guarantees the same behavior externally. But I would discourage using it for the sake of portable C/C++ code.
UPDATE (according to the OP's edit)
Additionally putting the #ifndef, #endif around the #include statement in another file might prevent the preprocessor from opening the include file itself (and thus reduce compile time and memory usage slightly). I'd expect #pragma once would do this automatically, but can't tell for sure (this might be implementation specific).
You don't ever need to do that because any header file written by a competent developer will have its own guard. You can assume the standard library headers were written by competent engineers, and if you ever find yourself using a third party header without include guards... well, that third party is now highly suspect...
As for writing your own headers, you can use the standard:
#ifndef MY_HEADER_H
#define MY_HEADER_H
// ...code
#endif
Or just use:
#pragma once
Note that this is not standard C or C++, it is a compiler extension. It won't work on every compiler out there, but using it is your decision and depends on your expected use.
Redundant include guards are, by definition "redundant". They do not affect the binaries created through compilation. However, they do have a benefit. Redundant include guards can reduce compile times.
Who cares about compile times? I care. I am just one developer in a project of hundreds of developers with millions of lines of source code in thousands of source files. A complete rebuild of the project takes me 45 minutes. Incremental builds from revision control pulls take me 20+ minutes. As my work depends on this big project, I cannot perform any testing while waiting on this prolonged build. If that build time were cut to under 5 minutes, our company would benefit greatly. Suppose the build time saving was 20 minutes. 1 year * 100 developers * 1 build/day * 1/3 hour/build * 250 days/year * $50/hr = $416,667 savings per year. Someone should care about that.
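The estimate above can be reproduced with a few lines of code. This is only a sketch of the arithmetic; the function and parameter names are made up, and the inputs are the figures from the paragraph.

```cpp
#include <cassert>
#include <cmath>

// Dollars saved per year, given the hours saved on each build.
constexpr double yearly_savings(int developers, double builds_per_day,
                                double hours_saved_per_build,
                                int work_days_per_year, double hourly_rate) {
    return developers * builds_per_day * hours_saved_per_build *
           work_days_per_year * hourly_rate;
}

// The figures from the post: 100 developers, 1 build/day, 20 minutes saved,
// 250 working days, $50/hr, rounded to the nearest dollar.
inline long long savings_from_post() {
    return std::llround(yearly_savings(100, 1.0, 1.0 / 3.0, 250, 50.0));
}

// Small self-check; call it from main() to try the sketch.
inline void check_savings() { assert(savings_from_post() == 416667); }
```

The result scales linearly with every factor, so halving the assumed time saved per build halves the yearly figure.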
For Ed S: I have been using redundant include guards for 10 years. Occasionally you will find someone who uses the technique, but most shy away from it because it can make for ugly-looking code. "#pragma once" surely looks a lot cleaner. Percentage-wise, very few developers continually try to improve their skills by continuing their education. The redundant #include guards technique is a bit obscure, and its benefits are only realized when someone bothers to do an analysis on large-scale projects. How many developers do you know who go out of their way to buy C++ books on advanced techniques?
Back to the original question about redundant include guards vs #pragma once in Visual Studio... According to the Wiki page on #pragma once, compilers which support "#pragma once" can potentially be more efficient than with #include guards, as they can analyze file names and paths to prevent loading of files which were already loaded. Three compilers were mentioned by name as having this optimization. Conspicuously absent from this list is Visual Studio. So we are still left wondering whether, in Visual Studio, redundant #include guards or #pragma once should be used.
For small to medium sized projects, #pragma once is certainly convenient. For large projects where compile time becomes a factor during development, redundant #include guards give a developer greater control over the compilation process. Anyone who is managing or architecting large-scale projects should have Large Scale C++ Design in their library; it discusses and recommends redundant #include guards.
Possibly of greater benefit than redundant include guards is smart usage of #includes. With C++ templates and the STL becoming more popular, method implementations are migrating from .cpp files to .h files. Any header dependencies the .cpp implementation would have had must now migrate to the .h file as well. This increases compilation time. I have often seen developers stack lots of unnecessary #includes into their header files so they won't have to bother identifying the headers they actually need. This also increases compile time.
The #pragma once is a nicer form of an include guard. If you use it, you don't need the include guard based on #define.
In general, this is a better approach, since it prevents name clashes from being able to break an include guard.
That being said, the include guard should be in the header file, not wrapping the include. Wrapping the include should be completely unnecessary (and will likely confuse other people down the road).
Edit:
I think we are talking about two different things. My question is whether we should use an include guard when we use a pre-existing header file that has either #pragma once or #ifndef xxx
In that case, no. If the header has a proper guard, there is no reason to try to avoid including it. This just adds confusion and complexity.
That's not how include guards are used. You don't wrap your #includes in an include guard. A header file should wrap its own contents in an include guard. Whenever you write a file that will likely be included in others, you should do:
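// Names that start with an underscore and a capital letter are reserved
// for the implementation, so user headers should avoid them.
#ifndef SOME_GUARD_H
#define SOME_GUARD_H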
// Content here
#endif
With Visual Studio's implementation of the C++ library, that might be done by the string header having #pragma once or by checking #ifndef _STRING_.
Does including the same header files multiple times increase the compilation time?
For example, suppose every file in my project uses <iostream> <string> <vector> and <algorithm>. And if I include a lot of files in my source code, then does that increase the compile time?
I always thought that header guards served the important purpose of avoiding double definitions, and as a by-product also eliminated duplicated code.
Actually, someone I know proposed some ideas to remove such multiple inclusions. However, I consider them to be completely against good design practices in C++. Still, I was wondering what his reasons for suggesting the changes might be.
Most of these answers are wrong... For modern compilers, there is zero overhead for including the same file multiple times, assuming the header uses the usual "include guard" idiom.
The GCC preprocessor, for example, has special code to recognize the include guard idiom. It will not even open the header file (never mind reading it) for the second and subsequent #include directives.
I am not sure about other compilers, but I would be very surprised if most of them did not implement the same optimization.
Another technique besides precompiled headers is the compiler firewall idiom, explained here:
http://www.gotw.ca/publications/mill04.htm
http://www.gotw.ca/publications/mill05.htm
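For readers who have not met the compiler firewall (pimpl) idiom, here is a minimal sketch with both files pasted into one block. The Widget class and its members are hypothetical. The point is that the "header" part only includes <memory>, while the heavier dependency (<vector> here) is needed only by the implementation part.

```cpp
#include <cassert>
#include <memory>

// ---- widget.h (sketch): no implementation details, no <vector> needed ----
class Widget {
public:
    Widget();
    ~Widget();            // defined in the .cc part, where Impl is complete
    void add(int value);
    int total() const;
private:
    struct Impl;          // forward declaration only
    std::unique_ptr<Impl> impl_;
};

// ---- widget.cc (sketch): the only place that sees the real data members ----
#include <vector>

struct Widget::Impl {
    std::vector<int> values;
};

Widget::Widget() : impl_(new Impl) {}
Widget::~Widget() = default;

void Widget::add(int value) { impl_->values.push_back(value); }

int Widget::total() const {
    int sum = 0;
    for (int v : impl_->values) sum += v;
    return sum;
}

// Small self-check; call it from main() to try the sketch.
inline void check_pimpl() {
    Widget w;
    w.add(2);
    w.add(3);
    assert(w.total() == 5);
}
```

Changing the members of Impl then only recompiles the implementation file, not every file that includes the header, at the cost of one extra allocation and an indirection per call.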
Every time #include <something.h> occurs in your source file, 'something.h' has to be found along the include path and read. But there is an #ifndef _SOMETHING_H_ check, so the content of something.h is not compiled again.
Thus there is some overhead, but it is really small.
If compile times were an issue, people used to use the optimisation recommended by Praetorian, originally recommended in Large Scale Software Design. However, most modern compilers automatically optimise for this case. For example, see the help from gcc
The best is to use precompiled headers. I do not know which compiler you are using, but most of them have this feature. I suggest you to refer to your compiler-manual on how to achieve this.
It basically collects all the header files and compiles them once into a precompiled form that the compiler can then load instead of parsing the headers again. That speeds up compiling very much.
Minor Drawback:
You need to have 1 "uberheader" which is included in every compilation-unit (.cpp).
In that uberheader, only include static headers from libraries, not your own. Then the compiler does not need to recompile it very often.
It helps esp. when using header-only libraries such as boost or glm, eigen etc.
HTH
Yes, including the same header multiple times means that the file needs to be opened before the preprocessor guards kick in and prevent multiple definitions. The Mozilla source code uses the following trick to prevent this:
Foo.h
#ifndef FOO_H
#define FOO_H
// whatever
#endif /* FOO_H */
In all files that need to include foo.h
#ifndef FOO_H
#include "foo.h"
#endif
This prevents foo.h from having to be opened multiple times. Of course, this depends on everyone following a particular naming convention for their preprocessor guards.
You can't do this with standard headers, since there is no common naming convention for their preprocessor guards.
EDIT:
After reading your question again, I think you're asking about the same header being included in different source files. What I talked about above does not help with that. Each header file will still have to be opened and included at least once in every translation unit. The only way I know of to prevent this is to use precompiled headers, as @scorcher24 mentioned in his answer. But unless the compile times are absolutely prohibitive, I'd stay away from this solution, because there is no standard way of generating precompiled headers across compilers.
Some compilers, most notably Microsoft's, have a #pragma once directive that you can use to automatically skip an include file once it's already been included. This removes any performance penalty.
http://en.wikipedia.org/wiki/Pragma_once
It can be an issue. As others have said, most modern compilers
handle the case intelligently, and will only re-open the file in
degenerate cases. Most is not all, however, and one of the major
exceptions is Microsoft, which a lot of people do have to support. The
surest solution (if this is really a problem in your environment) is to
use the Lakos convention, putting the include guards around the
#include as well as in the header. This means, of course, a standard
convention for generating the guard names. (For external includes, wrap
them in your own header, which respects your local convention.)
Alternatively, you can use both the guards and #pragma once. The
guards will always work, and most compilers will avoid the extra opens,
and #pragma once will usually avoid the extra opens with Microsoft.
(#pragma once cannot be implemented reliably in complex networked
situation, but as long as all of your files are on your local drive,
it's quite reliable.)
I have inherited a C/C++ code base, and in a number of .cpp files the #include directives are wrapped in #ifndefs that test the header's internal single-include #define.
for example
#ifndef _INC_WINDOWS
#include <windows.h>
#endif
and windows.h looks like
#ifndef _INC_WINDOWS
#define _INC_WINDOWS
...header file stuff....
#endif // _INC_WINDOWS
I assume this was done to speed up the compile/preprocess of the code.
I think it's ugly and a premature optimisation, but as the project has a 5 minute build time from clean, I don't want to make things worse.
So does the practice add any value or speed things up lots? Is it OK to clean them up?
Update: compiler is MSVC (VS2005) and platform is Win32/WinCE
It's worth knowing that some implementations have #pragma once and/or a header-include-guard detection optimisation, and that in both cases the preprocessor will automatically skip opening, reading, or processing a header file which it has included before.
So on those compilers, including MSVC and GCC, this "optimisation" is pointless, and it should be the header file's responsibility to handle multiple inclusion. However, it's possible that this is an optimisation for compilers where #include is very inefficient. Is the code pathologically portable, and <windows.h> refers not to the well-known Win32 header file, but to some user-defined header file of the same name?
It's also possible that the header files don't have multiple-include guards, and that this check is actually essential. In which case I'd suggest changing the headers. The whole point of headers is as a substitute for copy-and-pasting code about the place: it shouldn't take three lines to include a header.
Edit:
Since you say you only care about MSVC, I would either:
do a mass edit, time the build just to make sure the previous programmer doesn't know something I don't. Maybe add #pragma once if it helps. Use precompiled headers if all this really is slowing things down.
Ignore it, but don't use the guards for new files or for new #includes added to old files.
Depending on whether I had more important things to worry about. This is a classic Friday-afternoon job, I wouldn't spend potentially-productive time on it ;-)
If a file is included, then that whole file has to be read, and even the overhead of opening/closing the file might be significant. By putting the guarding directives around the include statement, it never has to be opened. As always with these questions, the correct answer is: try taking out the ifndef/endif guards around the include directives and get your stopwatch...