Every C++ header in a project as a precompiled header - c++

The usual approach is to have one precompiled header in a project that contains the most common includes.
The problem is that it is either too small or too big. When it is too small, it doesn't cover all the headers that are used, so those still have to be processed over and over in every module. When it is too large, it slows down compilation for two reasons:
The project needs to be recompiled too often, because changing anything in a header contained in the precompiled header invalidates it.
The precompiled header itself is large, so including it in every file actually slows down compilation.
What if I made every header file in the project a precompiled header? This would add some additional compiler work up front, but then it should work very nicely: no header would have to be processed twice (even preparing a precompiled header would use other precompiled headers recursively), no extra content would have to be pulled into modules, and only the modules that actually need to be recompiled would be recompiled. In other words, for O(N) extra work I would (theoretically) optimise away the O(N^2) complexity of C++ includes. Preprocessing would drop to O(N); processing the precompiled data would still be O(N^2), but at least minimised.
Has anyone tried this? Can it boost compile times in real-life scenarios?

With GCC, the reliable way to use precompiled headers is to have one single (big) header (which #include-s many standard headers ...), and perhaps include some small header after the precompiled one.
See this answer for a more detailed explanation (for GCC specifically).
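For illustration, here is a minimal sketch of that single-big-header setup with GCC. The file name and its contents are made up; the g++ commands are the usual way to build and consume a .gch, but adjust them to your project:

    // pch.h -- hypothetical "one big header" with the most common includes
    #pragma once
    #include <string>
    #include <vector>
    #include <map>
    #include <memory>
    //
    // Precompile it once (sketch):
    //   g++ -x c++-header pch.h -o pch.h.gch
    //
    // GCC looks for pch.h.gch next to pch.h and uses it automatically
    // whenever a translation unit starts with #include "pch.h".

Each .cpp then starts with #include "pch.h" and may include a few small, frequently changing headers after it, as suggested above.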

My own experience with GCC and Clang with precompiled headers is that you can only use a single precompiled header per compilation. See also the GCC documentation, I quote:
A precompiled header file can be used only when these conditions apply:
Only one precompiled header can be used in a particular compilation.
...
In practice, it's possible to compile every header to a precompiled header. (Recommended if you want to verify that every header is self-contained, not recommended if you want to speed up compilation.)
Depending on the code that needs to be compiled, you can decide to use a different precompiled header. In general, however, it's a balancing act between the compile time of the headers, the compile time of the CPP files, and maintenance.
Adding a simple precompiled header that already contains several standard headers like string, vector, map, utility ... can already speed up your compilation by a noticeable percentage. (A long time ago, I noticed a 15-20% improvement on a small project.)
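As a rough sketch of such a simple precompiled header with Clang (the file names are made up; -x c++-header and -include-pch are standard Clang options, but check the documentation of your version):

    // common_pch.h -- hypothetical header with a handful of standard includes
    #pragma once
    #include <string>
    #include <vector>
    #include <map>
    #include <utility>
    //
    // Precompile it, then load the result in every compilation (sketch):
    //   clang++ -x c++-header common_pch.h -o common_pch.h.pch
    //   clang++ -include-pch common_pch.h.pch -c some_file.cpp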
The main gains you get from precompiled headers are that the compiler:
only has to read 1 file instead of many, which improves disk access
reads a binary format that's optimized for reading, instead of plain text
doesn't need to redo all of the error checking, as this was already done when the precompiled header was created
Even if you add a few headers that you don't use everywhere, it can still be much faster.
Lately, I also found the Clang Build Analyzer. It isn't ideal for big projects (see the issue on GitHub), but it can give you some insight into where the time is being spent and what can be improved. (Or what you can improve in the codebase.)
In all fairness, I don't use precompiled headers at this point in time. However, I do want to see it enabled on the project I'm working on.
Some other interesting reads:
https://medium.com/@unicorn_dev/speeding-up-the-build-of-c-and-c-projects-453ce85dd0e1
https://llunak.blogspot.com/2019/05/why-precompiled-headers-do-not-improve.html
https://www.bitsnbites.eu/faster-c-builds/

Related

How to precompile every header file?

I was reading the Clang modules documentation.
I understood a few things, but I don't know if I'm right or wrong.
If I just pass -fmodules and -fbuiltin-module-map to every compiled source file as Clang arguments, I will get every benefit of modules, such as precompiled headers for standard includes, with zero modifications, because includes are treated as imports.
There is no need to use precompiled headers if I stick with modules.
My question is this: how can I automatically precompile every header file? Should I generate a modulemap for the headers with a script, so they will be precompiled? One giant modulemap, or one modulemap for every header?
I don't really care about the C++ standard committee's plan for modules or the logical aspect of modules. All I need is the compilation speedup achieved by precompiling headers, without needing to create a precompiled.hpp file (with every possible header) or make any huge modifications to the code.
EDIT: Modules in Clang implement a cache, so in my view they are pretty similar to precompiled headers in terms of compilation-time speedup.
I don't care about the committee for the time being, because my question is about Clang modules (not C++ standard modules), which I know are experimental and subject to change. I know the risks.
I want faster compilation, and I see a possible route to take here, which IMHO seems better than the other methods.
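For illustration, a rough sketch of the setup described in the question. The file name and cache directory below are made up; -fmodules, -fbuiltin-module-map and -fmodules-cache-path are existing Clang options, though their exact behaviour depends on the Clang version:

    // example.cpp -- unmodified code; with -fmodules enabled, the standard
    // includes below can be satisfied from Clang's module cache
    #include <string>
    #include <vector>

    int count_chars(const std::vector<std::string>& v) {
        int n = 0;
        for (const auto& s : v) n += static_cast<int>(s.size());
        return n;
    }

    // Hypothetical invocation (sketch):
    //   clang++ -fmodules -fbuiltin-module-map \
    //           -fmodules-cache-path=build/module-cache \
    //           -c example.cpp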

Include guards in system headers and effect on compile speed

I'm currently speeding up compilation of a large C++ project (there is some C code too). Initially I'm
removing unnecessary system includes; and
introducing precompiled headers for common system includes such as stddef.h or vector, but not for includes like stdio.h or iostream, which shouldn't be commonly used.
I'd thought that system headers would have include guards and thus would be covered by gcc's multiple include optimization. However, it seems that not all headers follow this guideline. For example, stdlib.h has #ifndef STDLIB_H at the start, and stddef.h follows the same pattern. But assert.h doesn't have a standard include guard, nor do cstddef or cstdlib.
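For reference, the classic include-guard pattern that GCC's multiple-include optimization recognizes looks roughly like this (the names are illustrative):

    // my_header.h -- the whole file is wrapped in one #ifndef/#define/#endif,
    // so GCC can remember the guard macro and skip re-reading the file later
    #ifndef MY_HEADER_H
    #define MY_HEADER_H

    int my_function(int x);   // hypothetical declaration

    #endif // MY_HEADER_H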
I've been using the -H option in gcc to track and subsequently analyse include dependencies in this project. What I've observed is that for a small sub-project with only 6 files, stdlib.h and stddef.h, which follow the include-guard pattern, show up 6 times in the output, whereas the files that don't can show up tens or hundreds of times. I'm a little concerned that, since some of these files may be on a remote network drive, compilation may be slower.
When I initially got precompiled headers working in the project, only including C++ headers such as vector and headers from some internal libraries, I was a little surprised to get only a 20% performance increase. When I've done something similar in Visual C++ I've seen greater increases. (Possibly related: GCC build time doesn't benefit much from precompiled headers.)
I'm relatively new to gcc, so I may have missed something. My questions are:
Why do some headers have standard include guards, and some not?
Should I be concerned about the effect on compilation speed?
If so, how to address? I wouldn't mind adding assert to the precompiled headers, but I wouldn't want to add cstdlib, for example.

Pre-Compiled Header Design Question

I have code that uses a pre-compiled header. (previously done by someone else)
In it, they are including several .h files.
If I have classes that use common .h files that are not currently in the existing pre-compiled header, would tossing them in there be of any real benefit? Maybe compilation speed, but I was thinking it would clean up the classes/headers a bit too?
What are do's and don't with pre-compiled headers?
DO NOT rely on headers being included by your precompiled header for "code cleanup" by removing those headers from your other source files. This creates a nightmare if you ever want to stop using PCH. You always want your dependencies to be explicit in every source file. Just include them in both places -- there is no harm in it (assuming you have appropriate include guards in place).
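A small sketch of that advice (stdafx.h is the conventional MSVC precompiled-header name; widget.cpp and its contents are hypothetical): the source file keeps stating its own dependencies even though the PCH already pulls them in.

    // widget.cpp -- hypothetical source file in a project that uses a PCH
    #include "stdafx.h"   // the precompiled header; must come first with MSVC /Yu
    #include <vector>     // stated explicitly even though stdafx.h already has it
    #include <string>     // harmless duplicates: include guards make them no-ops

    std::vector<std::string> make_widget_names() {
        return {"a", "b", "c"};
    }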
A header file that is included by multiple source files is a good candidate for inclusion in the PCH (particularly if it is lengthy). I find that I don't take too seriously the advice to only put headers that rarely change into the PCH. But this depends on your overall project structure. If you frequently do full builds, definitely avoid this advice. If you want to minimize the work in incremental rebuilds, then it's a consideration. In my experience, rebuilding the PCH is relatively fast, and its cost is far outweighed by the overall speedup of compilation in general (in most cases). I'm not sure if all PCH systems are smart enough to figure out that not every source file needs to be rebuilt when a header included in the PCH changes (VC++ is), but explicitly #includeing everything you need in every translation unit will surely facilitate this (another reason you should not rely on what is included by your PCH).
If your compiler supports an option to show the #include tree for each file during compilation, this can be a great help to identify headers that should be included in the PCH (the ones that show up the most). I recently went through this on a project I'm working on (which was already using PCH, but not optimally) and sped up the build of 750K lines of C++ from roughly 1.5 hours to 15 minutes.
Put non-changing system includes into the precompiled header. That will speed up compilation. Don't put any of your own header files that you might change into the precompiled header, because each time you change them you will have to rebuild the entire precompiled header.
It is a trade-off: system/library headers definitely go in the PCH; for ones in your own project, it depends.
Our project has a large amount of generated code that changes much less frequently than other parts of the project. These headers go in the PCH because they take a lot of time to process in each individual file. Changing them is expensive, but you have to weigh that cost against the more frequent, smaller savings of having them in the PCH.

What are the pros & cons of pre-compiled headers specifically in a GNU/Linux environment/tool-chain?

Pre-compiled headers seem like they can save a lot of time in large projects, but also seem to be a pain-in-the-ass that have some gotchas.
What are the pros & cons of using pre-compiled headers, and specifically as it pertains to using them in a Gnu/gcc/Linux environment?
The only potential benefit to precompiled headers is that if your builds are too slow, precompiled headers might speed them up. Potential cons:
More Makefile dependencies to get right; if they are wrong, you build the wrong thing fast. Not good.
In principle, not every header can be precompiled. (Think about putting some #define's before a #include; see the small sketch below.) So which cases does gcc actually get right? How much do you want to trust this bleeding-edge feature?
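A tiny, self-contained illustration of that point: the behaviour of a standard header can depend on a macro defined before the #include, so no single precompiled form of the header covers every use.

    // Whether assert() does anything in this translation unit depends on a
    // macro defined *before* the #include, so <assert.h> cannot have one
    // one-size-fits-all precompiled form.
    #define NDEBUG              // disables assert() in this translation unit
    #include <assert.h>

    int main() {
        assert(1 + 1 == 3);     // compiled out, because NDEBUG was defined first
        return 0;
    }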
If your builds are fast enough, there is no reason to use precompiled headers. If your builds are too slow, I'd consider
Buying faster hardware, which is cheap compared to salaries
Using a tool like AT&T nmake or like ccache (Dirk is right on), both of which use trustworthy techniques to avoid recompilations.
I can't speak to GNU/gcc/Linux, but I've dealt with pre-compiled headers in VS2005:
Pros:
Saves compile time when you have large headers that lots of modules include.
Works well on headers (say from a third party) that change very infrequently.
Cons:
If you use them for headers that change a lot, it can increase compile time.
Can be fiddly to set up and maintain.
There are cases where changes to headers are apparently ignored if you don't force the pre-compiled header to compile.
The ccache caching frontend to gcc, g++, gfortran, ... works great for me. As its website says:
ccache is a compiler cache. It acts as a caching pre-processor to C/C++ compilers, using the -E compiler switch and a hash to detect when a compilation can be satisfied from cache. This often results in a 5 to 10 times speedup in common compilations.
On Debian / Ubuntu, just do 'apt-get install ccache' and create soft-links in, say, /usr/local/bin with names gcc, g++, gfortran, c++, ... that point to /usr/bin/ccache.
[EDIT] To make this more explicit in response to some early comments: This provides essentially pre-compiled headers and sources by caching a larger chunk of the compilation step. So it uses an idea that is similar to pre-compiled headers, and carries it further. The speedups can be dramatic -- a factor of 5 to 10 as the website says.
For plain C, I would avoid precompiled headers. As you say, they can potentially cause problems, and preprocessing time is really small compared to the regular compilation.
For C++, precompiled headers can potentially save a lot of time, as C++ headers often contain large template code whose compilation is expensive. I have no practical experience with them, so I recommend you measure how much compilation time you save in your project. To do so, compile the entire project with precompiled headers once, then delete a single object file, and measure how long it takes to recompile that file.
The GNU gcc documentation discusses possible pitfalls with pre-compiled headers.
I am using PCH in a Qt project, which uses CMake as its build system, and it saves a lot of time. I grabbed some PCH CMake scripts, which needed some tweaking since they were quite old, but it was generally easier to set up than I expected. I have to add, I am not much of a CMake expert.
I am including now a big part of Qt (QtCore, QtGui, QtOpenGL) and a few stable headers at once.
Pros:
For Qt classes, no forward declarations are needed, and of course no includes.
Fast.
Easy to set up.
Cons:
You can't include the PCH include in headers. This isn't much of a problem, except if you use Qt and let the build system translate the moc files separately, which happens to be exactly my configuration. In this case, you need to #include the Qt headers in your headers, because the mocs are generated from the headers. The solution was to put additional include guards around the #include in the header.
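One way that workaround might look; this is a guess at the shape of the fix rather than the poster's exact code, and the macro name MY_PCH_INCLUDED and the header below are made up:

    // mywidget.h -- hypothetical project header that is also consumed by moc
    #pragma once

    // If the translation unit already pulled in the PCH (which defines
    // MY_PCH_INCLUDED and includes the Qt headers), skip the direct include;
    // when the moc output is compiled separately, include Qt directly.
    #ifndef MY_PCH_INCLUDED
    #include <QWidget>
    #endif

    class MyWidget : public QWidget {
        Q_OBJECT
    public:
        explicit MyWidget(QWidget* parent = nullptr);
    };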

What to put in precompiled header? (MSVC)

What are the best candidates for a precompiled header file? Can I put STL and Boost headers there, even though they have templates? And will that reduce compile times?
Also, what are the best IDE settings to reduce compile times?
The quick answer: the STL and Boost headers do indeed belong in the precompiled header file, even though these header files define template classes.
When generating a precompiled header file, a compiler parses the header text (a significant task!), and converts it into a binary format that is optimised for the compiler's benefit.
Even though the template classes will be instantiated when other .cpp files are compiled, they will be instantiated from information in the precompiled header, which is significantly faster for the compiler to read.
(later addition)
One thing that you should not include in a precompiled header is files that are part of your project and change frequently, even if every single .CPP file includes them.
The reason is this: the generation of the precompiled header can take a long time, because the Boost, STL and Windows headers are very large.
You might have a simple file (e.g. "StringDefs.h") that everything uses. If StringDefs.h is included in stdafx.h and one developer touches StringDefs.h, then every developer has to wait until the entire precompiled header recompiles. It would be much faster if StringDefs.h were left out of the precompiled header and parsed along with each .CPP file.
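As a sketch of how this typically looks with MSVC (stdafx.h / stdafx.cpp are the conventional names; /Yc and /Yu are the relevant compiler switches, set per file in the project settings; the particular includes are only examples):

    // stdafx.h -- stable, widely used external headers only;
    // frequently changing project headers (like StringDefs.h above) stay out
    #pragma once
    #include <vector>
    #include <string>
    #include <map>
    #include <boost/shared_ptr.hpp>   // Boost headers can go here too

    // stdafx.cpp -- compiled with /Yc"stdafx.h" to create the .pch;
    // every other .cpp is compiled with /Yu"stdafx.h" and starts with
    // #include "stdafx.h"
    #include "stdafx.h"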
One addition to Andrew Shepherd's answer. Use the precompiled header for header files that are external to your project, for files that change infrequently. If you're changing the header files in the current project all the time, it's probably not worth precompiling them.
I've written an article on techniques that reduce the compilation time. Among these techniques a post on precompiled header and its application can be found here. It also has a section on best practices that you may find interesting. CMake scripts that handle it transparently are included.
Put anything in the precompiled header that most of the .cpp files in that project would include anyway. This goes for any header file, really. This allows the compiler to parse these files once, and then reuse that information in all .cpp files in the same project.