Why would using a precompiled header cause a build to be slower?

Our solution contains over 100 projects, over 8000 cpp files and over 10'000 header files.
I'm trying to improve our build times.
One of the projects in the solution contains just 5 cpp files, and takes about 10 seconds to compile. The header files were initially included in the cpp files, but in preparation for switching on precompiled headers, I moved the includes into a single pch.h file.
Each cpp file now includes the pch.h file.
This in itself has not made any discernible change to the compile time - it's still about 10 seconds.
Now when I tell the project to actually use the pch file as a precompiled header, it takes 17 seconds to compile the project.
Why would precompiling the included headers make the project take longer to build than when the file is just #included by each individual cpp file?
More info.
We use a technique called "lumping" - (individual cpp files are not compiled individually - they are each #included into a single project-wide cpp file, and that is the only cpp file which is compiled).
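For reference, a lumped ("unity") build file looks something like this (file names invented for illustration):

project_lump.cpp (the only file the compiler actually sees):
#include "feature_a.cpp"   // each project cpp file is textually pulled in...
#include "feature_b.cpp"   // ...so the whole project becomes one big translation unit
#include "feature_c.cpp"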
For what it's worth, thanks to spaghetti code, according to "Show Includes" the dozen or so included files in the pch file cause around 3000(!) files to be included. Obviously, this needs fixing!
The precompiled header file is about 130 MB when compiled.
If we switch off lumping, the single project build (not the whole solution) takes 45 seconds. If we then switch on precompiled headers, the build time improves.
I'm probably missing the obvious, but why when lumping is switched on, does switching on precompiled headers slow the build down?

What a PCH does is perform a "preprocess"/"precompile" stage on the common headers included by multiple source files. This helps because the repeated "preprocessing"/"precompiling" is avoided: the compiler simply loads its previously saved state for each source file.
If you have just one big source file, this "preprocessing"/"precompiling" also needs to happen, but in total only once. So saving and loading the PCH file introduces overhead without taking any away (there is no repetition to eliminate): with lumping you have a single translation unit, so your 130 MB PCH is written once and read once, which is pure extra work.
I use the terms "preprocess"/"precompile" here because PCH is implemented wildly differently depending on your compiler, and might lean more towards one or the other.
Now, unless you make use of a heavy template library like Boost throughout much of your code, it is often enough to clean up include dependencies to speed up compile time, frequently by a significant factor. But that requires maintenance.

Related

Single header file with all the necessary #include statements

I am currently working on a program with a lot of source files. Sometimes it is difficult to keep track of which libraries I have already #included. Theoretically, I could make a single header file called Headers.h that just contains all the #include statements I need, then make all other header files #include "Headers.h".
Why is this a good/bad idea?
Pros:
Slightly less maintenance, as you don't have to keep track of which of your files are including headers from which libraries or other components.
Cons:
Definitions in included files might conflict with each other, especially in C, where you don't have namespaces (you tagged the question with both C and C++).
Macros in particular can cause hard to debug problems, where a macro definition unexpectedly conflicts with some name in your file or one of the other included files
Depending on which compiler you use, compilation times might blow out. If using a compiler that pre-compiles headers it might actually reduce compilation time, but if not the opposite will happen
You will often unnecessarily trigger rebuilds of files. If you have your build system set up correctly, then each source file will get rebuilt if any of the included files gets modified. If you always include all headers in your project, then a change to any of your headers will force recompilation of all your source files. Not likely to be an issue for system headers but it will be if you include your own headers in the master file as well.
On the whole I would not recommend that approach. The last con listed above is particularly important.
Best practice would be to include only headers that are needed for the code in each file.
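A small contrast sketch (file names invented for illustration):

Headers.h (the all-in-one approach from the question):
#include <vector>
#include <map>
#include "database.h"
#include "network.h"

parser.cpp (the recommended approach - include only what this file actually uses):
#include <vector>
#include "parser.h"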
To complement harmic's answer: indeed, the main issue is the build system (most build tools work on file timestamps, not on file contents; omake is a notable exception).
Notice that if dependency tracking is your main concern, GNU make can be used with automatically generated dependencies, together with the -M* options passed to GCC (i.e. to g++, and actually to the preprocessor), as sketched below.
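A minimal sketch of such a Makefile (GNU make with GCC's -MMD/-MP flags; target and file names are invented, and recipe lines must be indented with a tab):

OBJS := main.o widget.o
CXXFLAGS += -MMD -MP          # write a .d dependency fragment next to each .o

app: $(OBJS)
	$(CXX) -o $@ $(OBJS)

-include $(OBJS:.o=.d)        # reuse the generated header dependencies if present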
However, many libraries offer their users a single header to include (e.g. <gtk/gtk.h>).
Also, a single header file is friendlier to precompiled-header technology. In particular, GCC wants a single header for precompilation, as in the sketch below.
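A minimal sketch of GCC's workflow (assuming a project-wide header named yourproject.h; the generated .gch file is picked up automatically wherever that header is #included):

g++ -x c++-header yourproject.h -o yourproject.h.gch
g++ -c foo.cc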
See also ccache.
Tracking all the required includes would be more difficult, as they are abstracted away from their C source files, and it does not really support modularisation; plus all the cons from harmic's answer.

Including all header files in application

I was recently looking through the source code of a C++ application and saw that each class did not #include its needed components, but instead #include'd a "Precompiled.h" header. In this Precompiled header was an inclusion of almost every header in the application (not all of them, it was clear that the length and order of the list was deliberate). Essentially, this would mean that every class had an inclusion of every other class in the application.
Is this wise? Why or why not?
Usually, if you write an application, you should only include the header files which are really needed in the cpp files. If you have a really big application, you should use forward declarations in the header and include the necessary files in the cpp file. That way, changes in code affect only a minimal number of cpp files, so the compiler only has to compile what has really changed.
The situation can totally flip when it comes to libraries or code which does not change very often. The filename "Precompiled.h" is already a hint. The compiler can precompile the headers into a special object file, often called a PCH file. With that, the compiler does not have to resolve every include at every compilation. With heavily nested includes, this has a big impact on compile speed, because instead of many files to load and parse, there is only one pre-parsed file. To achieve that, you have to declare one or more headers as a kind of central file for building a precompiled header. How you do that differs between compilers.
For example, Visual Studio uses the header file "stdafx.h" as the center of precompilation. Because of that, only header files which are not altered very often should be included there. The file also has to be included first in every cpp file. That is because the compiler can no longer detect whether an include file that appears before it might influence the precompiled state. To avoid that, includes before the precompiled include are not allowed. A sketch of the layout follows below.
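A small sketch of that layout (file names invented for illustration):

stdafx.h (only stable, rarely changing headers belong here):
#include <windows.h>
#include <vector>
#include <string>

widget.cpp:
#include "stdafx.h"   // must come first; under /Yu anything above it is skipped
#include "widget.h"   // project headers that change often stay out of the PCH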
Back to your question. Including every file in one header file to use it as a precompiled header makes no sense at all, as it counteracts the purpose of a precompiled header file.
It is a very bad idea.
For a .cpp file only include the minimum number of #include files.
That way, when one of them changes, make (or its moral equivalent) will not require the whole lot to be recompiled.
Saves lots of time during development.
PS Use forward declarations in preference to #include
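A quick sketch of that (class names invented for illustration):

widget.h:
class Gadget;                // forward declaration instead of #include "gadget.h"
class Widget {
public:
    void attach(Gadget& g);  // references and pointers only need the declaration
private:
    Gadget* gadget_ = nullptr;
};

widget.cpp:
#include "widget.h"
#include "gadget.h"          // the full definition is needed only here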

How does precompiled header reduce compile time

I've been using precompiled header for a while and been told (and saw) how they can reduce compile time. But I would really like to know what is going on (under the hood) so it can make my compilation faster.
Because, from what I know, adding an unused include to a .cpp can slow down your compile time, and a header file can contain a lot of headers that are unused by a given .cpp.
So how does a precompiled header make my compilation faster?
From http://gamesfromwithin.com/the-care-and-feeding-of-pre-compiled-headers (thank you, Pablo):
A C++ compiler operates on one compilation unit (cpp file) at a time. For each file, it applies the preprocessor (which takes care of doing all the includes and "baking" them into the cpp file itself), and then it compiles the module itself. Move on to the next cpp file, rinse and repeat. Clearly, if several files include the same set of expensive header files (large and/or including many other header files in turn), the compiler will be doing a lot of duplicated effort.
The simplest way to think of pre-compiled headers is as a cache for header files. The compiler can analyze a set of headers once, compile them, and then have the results ready for any module that needs them.
Basically, a header file is compiled once for every translation unit (.cpp file) that includes it. Using a pre-compiled header saves the time spent compiling the same include file over and over again. This is really beneficial when the header file to be pre-compiled is very large (or indirectly includes many other header files).
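To put rough (purely illustrative) numbers on it: if an expensive header takes 2 seconds to parse and 50 .cpp files include it, that is about 100 seconds of repeated work on every full build; with a pre-compiled header, the 2 seconds are paid once, and every translation unit afterwards just loads the saved compiler state.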
Many years ago I had access to a C compiler that printed out the number of lines it processed (Watcom C version 6 or so). Compiling files with less than 100 lines of C code would display counts of 5,000 or even 10,000 lines, all of which were #included. In other words, #included code completely dominates compilation time. So anything you can do to reduce that is going to be beneficial. You can see for yourself with compilers that allow you to disable preprocessing: compare the times for complete system builds with and without it.
I think the word "precompiled" itself says something about how it makes compilation faster. You can read about the basic concept here:
http://en.wikipedia.org/wiki/Precompiled_header

C++ Single Header File Structure

I want to speed up the build time of my c++ project, and I am wondering if my current structure may cause unnecessary recompilations.
I have *.cc and corresponding *.h files, but all my *.cc files include a single header file which is main.h.
In main.h, I include everything necessary and extern global variables and declare the functions I use. Basically, I'm not using any namespaces.
Is this a bad design that could cause unnecessary recompiles and slow builds?
It depends. If main.h is seldom modified, you could use precompiled headers, which will greatly improve compilation time.
On the other hand, if main.h is regularly modified, it's probably not a good design.
An additional problem introduced by putting everything in one include file is that it doesn't really promote structure in your application. In well-designed applications you often have a layered structure. By putting everything in one include file, you obfuscate the structure in your application. This may work for a small application, but if your application grows, you will end up one day with a complete spaghetti, where everything depends on everything else.
Try to split the include file in multiple parts. Typically you will have one .cpp and one .h file per class. Try to use forward declarations as much as possible in your include file, and only include (in .h and .cpp) what's really needed.
That design will definitely lead to slow build times. What makefiles and IDEs do when you start a build is check which source (.cc) files have been modified since the last compile. They also check whether any files that a source file depends on have been modified. A source file depends on all the header files it includes, all the header files those header files include, and so on. If any modification is detected, that source file is recompiled.
Since your setup means that each source file includes every single header file, any time you modify even a single header file you need to recompile every source file.
You'll definitely want to try and separate things a bit more and get rid of your main.h file. Usually people try and minimize the number of header files included in a header file and prefer to keep the includes in source files, by the way.

Is there a way to use pre-compiled headers in VC++ without requiring stdafx.h?

I've got a bunch of legacy code that I need to write unit tests for. It uses pre-compiled headers everywhere so almost all .cpp files have a dependecy on stdafx.h which is making it difficult to break dependencies in order to write tests.
My first instinct is to remove all these stdafx.h files which, for the most part, contain #include directives and place those #includes directly in the source files as needed.
This would make it necessary to turn off pre-compiled headers since they are dependent on having a file like stdafx.h to determine where the pre-compiled headers stop.
Is there a way to keep pre-compiled headers without the stdafx.h dependencies? Is there a better way to approach this problem?
Yes, there is a better way.
The problem, IMHO, with the 'wizard style' of precompiled headers is that they encourage unrequired coupling and make reusing code harder than it should be. Also, code that's been written with the 'just stick everything in stdafx.h' style is prone to be a pain to maintain as changing anything in any header file is likely to cause the whole codebase to recompile every time. This can make simple refactoring take forever as each change and recompile cycle takes far longer than it should.
A better way, again IMHO, is to use #pragma hdrstop and /Yc and /Yu. This enables you to easily set up build configurations that DO use precompiled headers and also build configurations that do not. The files that use precompiled headers don't have a direct dependency on the precompiled header itself in the source file, which enables them to be built with or without it.
The project file determines which source file builds the precompiled header, and the #pragma hdrstop line in each source file determines which includes are taken from the precompiled header (if used) and which are taken directly from the source file. This means that when doing maintenance you would use the configuration that doesn't use precompiled headers, and only the code that you need to rebuild after a header file change will rebuild. When doing full builds you can use the precompiled header configurations to speed up the compilation process.
Another good thing about having the non-precompiled-header build option is that it makes sure that your cpp files include only what they need, and everything that they need (something that is hard to ensure if you use the 'wizard style' of precompiled headers). A sketch of the layout follows below.
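Roughly, a source file set up this way looks like the following (header names invented; /Yc on one designated source creates the PCH, /Yu on the others consumes it):

connection.cpp:
#include "stable_external_headers.h"  // satisfied from the PCH in /Yu builds
#pragma hdrstop                       // the precompiled portion ends here
#include "connection.h"               // always compiled directly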
I've written a bit about how this works here: http://www.lenholgate.com/blog/2004/07/fi-stlport-precompiled-headers-warning-level-4-and-pragma-hdrstop.html (ignore the stuff about /FI) and I have some example projects that build with the #pragma hdrstop and /Yc /Yu method here: http://www.lenholgate.com/blog/2008/04/practical-testing-16---fixing-a-timeout-bug.html .
Of course, getting from the 'wizard style' precompiled header usage to a more controlled style is often non-trivial...
When you normally use precompiled headers, "stdafx.h" serves two purposes. It defines a set of stable, common include files. And in each .cpp file, it serves as a marker for where the precompiled headers end.
Sounds like what you want to do is:
Leave precompiled header turned on.
Leave the "stdafx.h" include in each .cpp file.
Empty out the includes from "stdafx.h".
For each .cpp file, figure out which includes it needed from the old "stdafx.h". Add these after the #include "stdafx.h" line in each .cpp file (under /Yu, everything up to that line is skipped, so includes placed above it would be ignored).
So now you have the minimal set of dependencies, and you are still using precompiled headers. The loss is that your common set of headers is no longer precompiled just once. This would be a big hit for a full rebuild. For development, where you are only recompiling a few files at a time, it would be less of a hit.
No, there is probably NOT a better way.
However, for a given individual .cpp file, you might decide that you don't need the precompiled header. You could modify the settings for that one .cpp file and remove the stdafx.h line.
(Actually, though, I don't see how the pre-compiled header scheme is interfering with the writing of your unit tests.)
No. Pre-compiled headers rely on a single header included by all sources compiled this way.
You can specify that a single source file (or all of them) should not use pre-compiled headers at all, but that's not what you want.
In the past, the Borland C++ compiler did pre-compilation without a specific header. However, if two source files included the same headers but in a different order, they were compiled separately, since, indeed, the order of header files in C++ can matter...
Thus the Borland pre-compiled headers saved time only if you very rigidly included headers in the same order, or had a single include file included (first) by all other files... sounds familiar?!
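To illustrate the ordering problem (made-up file names):

a.cpp:
#include "x.h"
#include "y.h"

b.cpp:
#include "y.h"   // same headers, different order: under Borland's scheme these
#include "x.h"   // two files could not share one precompiled state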
Yes. The "stdafx.h"/"stdafx.pch" name is just convention. You can give each .cpp its own precompiled header. This would probably be easiest to achieve with a small script that edits the XML in your .vcproj. Downside: you end up with a large stack of precompiled headers, and they're not shared between TUs.
Possible, but smart? I can't say for sure.
My advice is - don't remove precompiled headers unless you want to make your builds painfully slow. You basically have three options here:
Get rid of precompiled headers (not recommended)
Create a separate library for the legacy code; that way you can build it separately.
Use multiple precompiled headers within a single project. You can select individual C++ files in your Solution Explorer and tell them which precompiled header to use. You would also need to set up your OtherStdAfx.h/cpp to generate a precompiled header.
Pre-compiled headers are predicated on the idea that everything will include the same set of stuff. If you want to make use of pre-compiled headers then you have to live with the dependencies that this implies. It comes down to a trade-off of the dependencies vs the build speed. If you can build in a reasonable time with the pre-compiled headers turned off then by all means do it.
Another thing to consider is that you can have one pch per library. So you may be able to split up your code into smaller libraries and have each of them have a tighter set of dependencies.
I only use pre-compiled headers for the code that needs to include the afx___ stuff - usually just UI, which I don't unit-test. UI code handles UI and calls functions that do have unit-tests (though most don't currently due to the app being legacy).
For the bulk of the code I don't use pre-compiled headers.
Precompiled headers can save a lot of time when rebuilding a project, but if a precompiled header changes, every source file depending on it will be recompiled, whether the change affects it or not. Fortunately, precompiled headers are a compile-time mechanism, not a link-time one; not every source file has to use the same pre-compiled header.
pch1.h:
#include <bigHeader1.h>
#include ...
pch1.cpp:
#include "pch1.h"
source1.cpp:
#include "pch1.h"
[code]
pch2.h:
#include <bigHeader2.h>
#include ...
pch2.cpp:
#include "pch2.h"
source2.cpp:
#include "pch2.h"
[code]
Select pch1.cpp, right click, Properties, Configuration Properties, C/C++, Precompiled Headers.
Precompiled Header : Create(/Yc)
Precompiled Header File: pch1.h
Precompiled Header Output File: $(intDir)pch1.pch
Select source1.cpp
Precompiled Header : Use(/Yu)
Precompiled Header File: pch1.h
Precompiled Header Output File: $(intDir)pch1.pch (I don't think this matters for /Yu)
Do the same thing for pch2.cpp and source2.cpp, except set the Header File and Header Output File to pch2.h and pch2.pch. That works for me.