Pre-Compiled Header Design Question - c++

I have code that uses a pre-compiled header. (previously done by someone else)
In it, they are including several .h files.
If I have classes that use common .h files that are not currently in the existing pre-compiled header, would tossing them in there be of any real benefit? Maybe compilation speed, but I was thinking it would clean up the classes/headers a bit too?
What are the do's and don'ts with pre-compiled headers?

DO NOT rely on headers being included by your precompiled header for "code cleanup" by removing those headers from your other source files. This creates a nightmare if you ever want to stop using PCH. You always want your dependencies to be explicit in every source file. Just include them in both places -- there is no harm in it (assuming you have appropriate include guards in place).
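As a rough sketch of "include them in both places" (file names are hypothetical, and stdafx.h is assumed to be the Visual Studio style PCH):

// widget.h -- has its own include guard, so being pulled in a second time via the PCH costs nothing
#ifndef WIDGET_H
#define WIDGET_H
#include <string>
struct Widget { std::string name; };
#endif // WIDGET_H

// widget.cpp -- the PCH comes first, but the direct dependency is still spelled out
#include "stdafx.h"   // may already pull in widget.h and <string>
#include "widget.h"   // explicit, so this file still builds if the PCH shrinks or goes away
Widget make_widget() { return Widget{"example"}; }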
A header file that is included by multiple source files is a good candidate for inclusion in the PCH (particularly if it is lengthy). I find that I don't take too seriously the advice to only put headers that rarely change into the PCH, but this depends on your overall project structure. If you frequently do full builds, definitely avoid this advice. If you want to minimize the work in incremental rebuilds, then it's a consideration. In my experience, rebuilding the PCH is relatively fast, and that cost is far outweighed by the overall speedup of compilation in general (in most cases). I'm not sure whether all PCH systems are smart enough to figure out that every source file does not need to be rebuilt when a header included in the PCH changes (VC++ is), but explicitly #include-ing everything you need in every translation unit will certainly facilitate this (another reason you should not rely on what is included by your PCH).
If your compiler supports an option to show the #include tree for each file during compilation, this can be a great help to identify headers that should be included in the PCH (the ones that show up the most). I recently went through this on a project I'm working on (which was already using PCH, but not optimally) and sped up the build of 750K lines of C++ from roughly 1.5 hours to 15 minutes.
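If you are not sure which switch that is, the usual ones are /showIncludes for VC++ and -H for GCC/Clang (the file name below is hypothetical):

cl /showIncludes my_source_file.cpp
g++ -H -c my_source_file.cpp

Both print every header as it is opened, indented by include depth, so the headers worth precompiling stand out quickly.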

Put non-changing system includes into the precompiled header. That will speed up compilation. Don't put any of your own header files that you might change into the precompiled header, because each time you change them you will have to rebuild the entire precompiled header.
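A sketch of what such a header might contain (assuming a Windows project; stdafx.h is just the Visual Studio convention, the name doesn't matter):

// stdafx.h -- only stable system/library headers, no project headers
#pragma once
#include <windows.h>   // large and effectively never changes
#include <string>
#include <vector>
#include <map>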

It is a trade-off: system/library headers definitely go in the PCH; for headers in your own project, it depends.
Our project has a large amount of generated code that changes much less frequently than other parts of the project. These headers go in the PCH because they take a lot of time to process in each individual file. If you change them it is expensive, but you have to weigh that cost against the more frequent, smaller savings you get in every file that includes them.

Related

Every C++ header in a project as a precompiled header

The usual approach is to have one precompiled header in a project that contains the most common includes.
The problem is that it is either too small or too big. When it is too small, it doesn't cover all the headers that are used, so these have to be processed over and over in every module. When it is too large, it slows down the compilation too much, for two reasons:
The project needs to be recompiled too often when you change something in a header contained in the precompiled header.
The precompiled header is too large, so including it in every file actually slows down compilation.
What if I made all of the header files in a project precompiled? This would add some additional compiler work to precompile them, but then it would work very nicely, as no header would have to be processed twice (even preparing a precompiled header would use precompiled headers recursively), no extra stuff would have to be put into modules, and only modules that actually need to be recompiled would be recompiled. In other words, for O(N) extra work I would (theoretically) optimise away the O(N^2) complexity of C++ includes. The preprocessing would drop to O(N); the processing of the precompiled data would still be O(N^2), but at least it would be minimised.
Has anyone tried this? Can it boost compile times in real-life scenarios?
With GCC, the reliable way to use precompiled headers is to have one single (big) header (which #include-s many standard headers ...), and perhaps include some small header after the precompiled one.
See this answer for a more detailed explanation (for GCC specifically).
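For reference, a minimal sketch of that workflow with GCC (header and file names are hypothetical; the compile flags must match between the two steps):

g++ -x c++-header -O2 big.h          # produces big.h.gch next to big.h
g++ -O2 -c my_source_file.cpp        # an '#include "big.h"' at the very top now picks up big.h.gch automatically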
My own experience with GCC and Clang is that you can only use a single pre-compiled header per compilation. See also the GCC documentation, which I quote:
A precompiled header file can be used only when these conditions apply:
Only one precompiled header can be used in a particular compilation.
...
In practice, it's possible to compile every header to a precompiled header. (Recommended if you want to verify that each header includes everything it needs; not recommended if you want to speed up compilation.)
Based on your code, you can decide to use a different precompiled header depending on the code that needs to be compiled. However, in general, it's a balancing act between the compile time of the headers, the compile time of the .cpp files, and maintenance.
Adding a simple precompiled header that already contains several standard headers like string, vector, map, utility ... can already speed up your compilation by a remarkable percentage. (A long time ago, I noticed a 15-20% improvement on a small project.)
The main gains you get from a precompiled header are that the compiler:
only has to read one file instead of many, which improves disk access
reads a binary format that is optimised for reading, instead of plain text
doesn't need to redo all of the error checking, as this was already done when the header was created
Even if you add a few headers that you don't use everywhere, it can still be much faster.
Lately, I also found the Clang build analyzer. It isn't ideal for big projects (see the issue on GitHub), though it can give you some insights into where the time is being spent and what can be improved (or what you can improve in the codebase).
In all fairness, I don't use precompiled headers at this point in time. However, I do want to see it enabled on the project I'm working on.
Some other interesting reads:
https://medium.com/@unicorn_dev/speeding-up-the-build-of-c-and-c-projects-453ce85dd0e1
https://llunak.blogspot.com/2019/05/why-precompiled-headers-do-not-improve.html
https://www.bitsnbites.eu/faster-c-builds/

C/C++ - precompiled headers - encapsulation, how to, and why is config required?

I understand the idea that precompiling headers can speed up build times, but there are a handful of questions that have thus far prevented me from grokking them.
Why does using precompiled headers require the developer to configure anything?
Why can't the compiler (or linker/IDE?) just have individual precompiled header object files, in the same way it does for the source files (i.e. .obj files)? Dependencies are indicated by which source/header files include which other files, and it can already detect when source files change, so a regular build is normally not a full rebuild. Instead of requiring me to specify which headers get precompiled, etc., why isn't this all just always automatically on, and transparent to the developer?
As I understand the precompiled headers methodology in Visual Studio, the idea is that you get this one big header file (stdafx.h) that includes all the other header files that you want to be precompiled, and then you include that in all your source files that use any of those headers.
a. Am I understanding correctly?
b. Doesn't this break encapsulation? Each source file often effectively includes various (likely) unrelated items that it doesn't need, which makes it harder to tell which libraries you're actually using, and what comes from where.
It seems to me that this implementation forces bad practices. What am I missing?
How do I utilize precompiled headers in Visual Studio (2013)?
Is there a cross-platform way to use or facilitate precompiled headers?
Thanks.
Why can't the compiler (or linker/IDE?) just have individual precompiled header object files, in the same way it does for the source files (i.e. .obj files)?
The answer to 1 and 2 lies in how precompiled headers work. Assume you have my_source_file.c:
#include "header1.h"
#include "header2.h"
int func(int x) { return x+1; }
my_other_source_file.c:
#include "header1.h"
int func2(int x) { return x-1; }
When you call compiler.exe my_source_file.c the compiler starts parsing your file. All the internal variables of the compiler (things like which types have been defined, what variables declared, etc) are called the compiler state.
After it has parsed header1.h it can save the state to the disk. Then, when compiling my_other_source_file.c, instead of parsing header1.h again, it can just load the state and continue.
That state is a precompiled header. It is literally just a dump of all the compiler variables in the moment after it has parsed the entire header.
Now, the question is: why can't you have two state dumps, one for header1.h and one for header2.h, and just load them both? Well, the states are not independent. The second one would be the state of header1.h + header2.h. So, what is usually done is that you have one state dump taken after all the common header files have been compiled, and you use that.
In theory, you could have one for every combination and use the appropriate one, but that is much more hassle than it's worth.
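As a rough illustration of why the second dump cannot stand on its own (hypothetical headers):

// header1.h
#define USE_FAST_PATH 1
typedef int record_id;

// header2.h -- what this declares depends on the state header1.h left behind
#if USE_FAST_PATH
record_id lookup(record_id key);   // only well-formed if header1.h was parsed first
#else
long lookup(long key);
#endif

A saved state for header2.h would necessarily be "header1.h followed by header2.h", which is exactly what a single combined precompiled header is.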
Some things that are side effects of how this is done:
Different compilers (including even minor versions) have different variables, so you can't reuse the precomps.
Since the dumped state started from the top of the file, your precomp must be the first include. There must be nothing that could influence the state (i.e. no #defines, typedefs, or declarations) before including the precomp.
Any defines passed by the command line (-DMY_DEFINE=0) will not be picked up in the precompiled header.
Any defines passed by the command line while precompiling will be in effect for all source files that use the precomp.
For 3), refer to MSFT documentation.
For 4), most compilers support precompiled headers, and they generally work in the same way. You could configure your makefiles/build scripts to always precompile a certain header (e.g. stdafx.h) which would include all the other headers. As far as your source code goes, you'd always just #include "stdafx.h", regardless of the platform.
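A minimal sketch of such a build script, assuming GNU make and g++, with stdafx.h as the hypothetical shared header (GCC picks up the .gch automatically because every .cpp starts with #include "stdafx.h" and the flags match):

# hypothetical Makefile fragment (recipe lines must be tab-indented)
stdafx.h.gch: stdafx.h
	g++ $(CXXFLAGS) -x c++-header stdafx.h -o stdafx.h.gch

%.o: %.cpp stdafx.h.gch
	g++ $(CXXFLAGS) -c $< -o $@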
Why can't the compiler (or linker/IDE?) just have individual precompiled header object files?
C and C++ have no concept of modules. The traditional compiler has a preprocessor phase (which may be invoked as a separate program) that will include the files and the whole thing will get compiled to intermediate code. The compiler per se does not see includes (or comments, or trigraphs, etc.).
Add to this that the behaviour of a header file can change depending on the context in which it is included (think macros, for example) and you end up with either many precompiled versions of the same header, or an intermediate form that is basically the language itself.
Am I understanding correctly?
Mostly. The actual name is irrelevant, as it can be specified in the project options. stdafx.h is a relic of the early development of MFC, which was originally named AFX (Application Framework eXtensions). The preprocessor also treats includes of the precompiled header differently, as they are not looked up in the include paths. If the name matches what is in the project settings, the .pch is used automatically.
Doesn't this break encapsulation
Not really. Encapsulation is an object-oriented feature and has nothing to do with include files. It might increase coupling and dependencies by making some names available across all files, but in general, this is not a problem. Most includes in a precompiled header are standard headers or third-party libraries, that is, headers that may be large and fairly static.
As an example, a project I'm currently working on includes GTK, standard headers, boost and various internal libraries. It can be assumed that these headers never change. Even if they changed once a day, I probably compile every minute or so on average, so it is more than worth it.
The fact that all these names are available project-wide makes no difference. What would I gain by including boost/tokenizer.hpp in only one .cpp file? Perhaps some intellectual satisfaction of knowing that I can only use boost::char_separator in that particular file. But it certainly creates no problem. All these headers are part of a collection of utilities that my program can use. I am completely dependent on them, because I made a design decision early on to integrate them. I am tightly coupled with them by choice.
However, this program needs to access system-specific graphical facilities, and it needs to be portable to (at least) Debian and Windows. Therefore, I centralized all these operations in two files: windows.cpp and x11.cpp. They include windows.h and X11/Xlib.h respectively. This makes sure I don't use non-portable stuff elsewhere (which would in any case quickly be caught, as I keep switching back and forth) and it satisfies my obsession with design. In reality, they could have been in the precompiled header. It doesn't make much of a difference.
Finally, none of the headers that are part of this specific program are in the precompiled header. This is where coupling and dependencies come into play. Reducing the number of available names forces you to think about design and architecture. If you try to use something and get an error saying that that name isn't declared, you don't blindly include the file. You stop and think: does it make sense for this name to be available here, or am I mixing up my user interface and data acquisition? It helps you separate the various parts of your program.
It also serves as a "compilation firewall", where modifying a header won't require you to rebuild the whole thing. This is more of a language issue than anything else, but in practice, it's still damn useful.
Trying to localize the GTK includes, for example, would not be helpful: all of my user interface uses it. I have no intention of supporting a different kind of toolkit. Indeed, I chose GTK because it was portable and I wouldn't have to port the interface myself.
What would be the point of only including the GTK headers in the user interface files? Obviously, it will prevent me from using GTK in files where I don't need to. But this is not solving any problem. I'm not inadvertently using GTK in places I shouldn't. It only slows down my build time.
How do I utilize precompiled headers in Visual Studio
This has been answered elsewhere. If you need more help, I suggest you ask a new question, as this one is already pretty big.
Is there a cross-platform way to use or facilitate precompiled headers?
A precompiled header is a feature provided by your compiler or build system. It is not inherently tied to a platform. If you are asking whether there is a portable way of using precompiled headers across compilers, then no. They are highly compiler-dependent.

Compilation speed improvements include guards vs. precompiled headers

I want to reduce compile time on a large project. Our primary compiler is Visual Studio 2010, but some of the code gets compiled in gcc. We are currently planning to ensure that all our .h files have both include guards and #pragma once, which allows both Visual Studio and gcc to improve compile speed. Previously we had put more headers in the stdafx, but we saw the disadvantage that if one of those headers was changed and you compiled a .cpp without recompiling the precompiled header, the changes didn't take effect. This often caused us confusion. The current plan is to use the precompiled header for all stable headers or headers out of our control (they won't change), and for everything else to rely on include guards and #pragma once to help compilation speed. Is there a reason why this plan is poorly conceived? Is there a benefit for compilation speed of include guards/#pragma once vs. precompiled headers, or vice versa, that I am missing?
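For illustration, a header carrying both protections might look like this (hypothetical name):

// Logger.h
#pragma once              // recognised by both Visual Studio and gcc; lets the compiler skip reopening the file
#ifndef LOGGER_H          // classic include guard as the portable fallback
#define LOGGER_H

class Logger {
public:
    void log(const char* message);
};

#endif // LOGGER_H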
The approach that you are taking is sound, but if changes in one of the headers did not trigger recompilation of the precompiled header, you should check the dependencies in the project.
There are other things that can help in reducing compilation times, like avoiding includes altogether. That is, use forward declarations in the headers and only #include in the .cpp files. That will reduce the compile-time dependencies and speed up compilation.
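A minimal sketch of that technique (hypothetical names):

// widget_owner.h -- no #include "widget.h" needed here
class Widget;                         // forward declaration suffices for pointers and references
class WidgetOwner {
public:
    void adopt(Widget* w);
private:
    Widget* widget_ = nullptr;
};

// widget_owner.cpp -- include the full definition here, where it is needed, not in the header
#include "widget_owner.h"
#include "widget.h"
void WidgetOwner::adopt(Widget* w) { widget_ = w; }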
I am not a fan of precompiled headers, so I usually just ensure that I include everything that needs including and don't include anything that doesn't.

Precompiled headers: do's and don'ts?

I know precompiled headers are used for speeding up compilation, but are there any do's and don'ts regarding what files I should include in them? For example, I have a project that uses a lot of boost libs, so I just include the boost header files in stdafx.h (I'm using VS2008). Should I include every standard header file in them, too? Will this increase the size of my executable even if I, for example, include <vector> but never use std::vector? Is it a bad idea to include my own project's header files in stdafx.h?
Generally speaking, every header file that you use across the application and that doesn't change often should go into the precompiled header file. This will speed up compilation because the precompiled header file gets compiled only once.
If you add a header file which changes often, you'll miss the point of the precompiled header file, because this often-changing header file will cause your whole project to recompile, possibly unnecessarily.
Specifically, <vector> defines a template class, so if you don't use std::vector, the overhead will not be big. However, I would advise against adding header files - however standard and generic - if you don't really need them. There IS some overhead in compilation time and binary size, and it could cause conflicts later in the project, so why add something if you don't really need it?
Pre-compiled headers don't affect the size of your executable, only the compilation speed. Since they are pre-compiled, they don't have to be re-compiled all the time. Windows.h is the primary beneficiary of this feature.
It's a good idea to include the C++ standard header files, the boost library headers, and any other headers from third-party libraries that you frequently use. This will not affect the size of your executable.
However, you should not include headers from your own project, since the whole project needs to be rebuilt whenever you make changes to these headers.

What to put in precompiled header? (MSVC)

What are the best candidates for a precompiled header file? Can I put STL and Boost headers there, even though they have templates? And will that reduce compile times?
Also, what are the best IDE settings to reduce compile times?
The quick answer: the STL and Boost headers do indeed belong in the precompiled header file, even though these header files define template classes.
When generating a precompiled header file, a compiler parses the header text (a significant task!), and converts it into a binary format that is optimised for the compiler's benefit.
Even though the template classes will be instantiated when other .cpp files are compiled, they will be instantiated from information in the precompiled header, which is significantly faster for the compiler to read.
(later addition)
Files that are part of your project and change frequently should not go into a precompiled header, even if every single .cpp file includes them.
The reason is this: the generation of the precompiled header can take a long time, because the Boost, STL and Windows headers are very large.
You might have a simple file (eg "StringDefs.h") that everything uses. If StringDefs.h is included in stdafx.h, and one developer touches StringDefs.h, then every developer has to wait until the entire precompiled header recompiles. It would be much faster if StringDefs.h was left out of the precompiled header, and parsed along with each .CPP file.
One addition to Andrew Shepherd's answer. Use the precompiled header for header files that are external to your project, for files that change infrequently. If you're changing the header files in the current project all the time, it's probably not worth precompiling them.
I've written an article on techniques that reduce compilation time. Among these techniques, a post on precompiled headers and their application can be found here. It also has a section on best practices that you may find interesting. CMake scripts that handle it transparently are included.
Put anything in the precompiled header that most of the .cpp files in that project would include anyway. This goes for any header file, really. This allows the compiler to parse these files once, and then reuse that information in all .cpp files in the same project.