very long compilation times - c++

I'm working on a large solution that has thousands of source files, some of them can have over a thousand includes due to the use of Boost and dependency problems. While compiling in parallel on a 12 core Xeon E5-2690 v2 machine (windows 7) it takes up to 4 hours to rebuild the solution using Waf 1.7.13. What can I do to speed things up?

A few things that come to mind:
Use forward declarations and PIMPL.
Review your code base to see if you have created unnecessary templates when normal classes or functions would have been sufficient.
Try to implement your own templates in terms of non-generic implementations operating on void* where applicable, the templates serving merely as type-safe wrappers (see "Item 42: Use private inheritance judiciously" in "More Effective C++" for a nice example).
Check your compiler's documentation for precompiled headers.
Refactor your application's entire architecture so that there are more internal libraries and a smaller application layer, the goal being that in the long run the libraries become so stable that you don't have to rebuild them all the time.
Experiment with optimisation flags. See if you can turn down optimisation in several specific compilation units where optimisation doesn't make a measurable difference in execution speed or binary size yet signficantly increases compilation times.

If refactoring to remove dependencies is not an option, and the build architecture is so unreliable that you have to clean and rebuild all from time to time (in a perfect world this should never happen), you can:
Improve your machine: use an SSD hard disk and/or lot of RAM. If the project has thousand of source files, you will need to hit the disk a lot of times, with random but cacheable accesses. For 12 core machine with hyper threading I would suggest no less than 24gb of ram, probably better going toward 32. You should measure what is your source code size and how much ram is used by the concurrent 24 compilers you can run in parallel.
Use a compiler cache. I don't know how to setup such environment from windows, and with your build system, so I can't suggest much here.

The culprit is likely the structure of your project itself: the expected time for rebuilding is roughly proportional to the number of source files times the number of headers they include. With strong template based programming, that effort grows with the square of source files.
So, you have exactly two contrary options to get your total compilation time down:
You reduce the number of headers included by each source file. That means you have to avoid any templates that use templates. Try to reduce inter-template dependencies as much as you can.
Program only in headers and only use a single .cpp file which instanciates all the different templates in your application. This avoids any recompilation of a header.
A possible variant of this is to create a precompiled header from all header files in your project, so that precompiling the header takes the bulk of the building time.
Of course, option 2 means that there is no such thing as an incremental build, you will need to recompile the entire project everytime you compile. This can make your project entirely unmaintainable. So, I would strongly suggest to go for option 1. But that requires a different programming style (one that uses templates very sparingly) than your project seems to be written in.
In either case, there is no magic bullet: It is not likely possible to make a significant change to the compilation time without restructuring the entire project.

Some suggestions you could try relatively easily:
Use distributed compilation using IncrediBuild or such
Use "unity" builds, i.e. including several cpp files in a single cpp file and compiling only the unity cpp's (you can have a simple tool building the unity cpp's with given number of cpp files in each). Even though I'm not big fan of this technique due to its shortcomings I have seen it employed quite often in projects with bad physical structure to optimize compile & link times.
Use "#pragma once" in headers
Use precompiled headers
Optimize redundant #include's in header files. I don't know if there's a tool for it, but you could easily build a tool which comments out #includes in headers and tries to compile them to see which ones are not needed.
Profile #include file compilation speeds and focus on addressing the worst performers and the ones that are deepest in the #include chain. I.e. use forward declarations instead of #include's, etc.
Buy more/better hardware

Related

Faster build times in C++ [duplicate]

I once worked on a C++ project that took about an hour and a half for a full rebuild. Small edit, build, test cycles took about 5 to 10 minutes. It was an unproductive nightmare.
What is the worst build times you ever had to handle?
What strategies have you used to improve build times on large projects?
Update:
How much do you think the language used is to blame for the problem? I think C++ is prone to massive dependencies on large projects, which often means even simple changes to the source code can result in a massive rebuild. Which language do you think copes with large project dependency issues best?
Forward declaration
pimpl idiom
Precompiled headers
Parallel compilation (e.g. MPCL add-in for Visual Studio).
Distributed compilation (e.g. Incredibuild for Visual Studio).
Incremental build
Split build in several "projects" so not compile all the code if not needed.
[Later Edit]
8. Buy faster machines.
My strategy is pretty simple - I don't do large projects. The whole thrust of modern computing is away from the giant and monolithic and towards the small and componentised. So when I work on projects, I break things up into libraries and other components that can be built and tested independantly, and which have minimal dependancies on each other. A "full build" in this kind of environment never actually takes place, so there is no problem.
One trick that sometimes helps is to include everything into one .cpp file. Since includes are processed once per file, this can save you a lot of time. (The downside to this is that it makes it impossible for the compiler to parallelize compilation)
You should be able to specify that multiple .cpp files should be compiled in parallel (-j with make on linux, /MP on MSVC - MSVC also has an option to compile multiple projects in parallel. These are separate options, and there's no reason why you shouldn't use both)
In the same vein, distributed builds (Incredibuild, for example), may help take the load off a single system.
SSD disks are supposed to be a big win, although I haven't tested this myself (but a C++ build touches a huge number of files, which can quickly become a bottleneck).
Precompiled headers can help too, when used with care. (They can also hurt you, if they have to be recompiled too often).
And finally, trying to minimize dependencies in the code itself is important. Use the pImpl idiom, use forward declarations, keep the code as modular as possible. In some cases, use of templates may help you decouple classes and minimize dependencies. (In other cases, templates can slow down compilation significantly, of course)
But yes, you're right, this is very much a language thing. I don't know of another language which suffers from the problem to this extent. Most languages have a module system that allows them to eliminate header files, which area huge factor. C has header files, but is such a simple language that compile times are still manageable. C++ gets the worst of both worlds. A big complex language, and a terrible primitive build mechanism that requires a huge amount of code to be parsed again and again.
Multi core compilation. Very fast with 8 cores compiling on the I7.
Incremental linking
External constants
Removed inline methods on C++ classes.
The last two gave us a reduced linking time from around 12 minutes to 1-2 minutes. Note that this is only needed if things have a huge visibility, i.e. seen "everywhere" and if there are many different constants and classes.
Cheers
IncrediBuild
Unity Builds
Incredibuild
Pointer to implementation
forward declarations
compiling "finished" sections of the proejct into dll's
ccache & distcc (for C/C++ projects) -
ccache caches compiled output, using the pre-processed file as the 'key' for finding the output. This is great because pre-processing is pretty quick, and quite often changes that force recompile don't actually change the source for many files. Also, it really speeds up a full re-compile. Also nice is the instance where you can have a shared cache among team members. This means that only the first guy to grab the latest code actually compiles anything.
distcc does distributed compilation across a network of machines. This is only good if you HAVE a network of machines to use for compilation. It goes well with ccache, and only moves the pre-processed source around, so the only thing you have to worry about on the compiler engine systems is that they have the right compiler (no need for headers or your entire source tree to be visible).
The best suggestion is to build makefiles that actually understand dependencies and do not automatically rebuild the world for a small change. But, if a full rebuild takes 90 minutes, and a small rebuild takes 5-10 minutes, odds are good that your build system already does that.
Can the build be done in parallel? Either with multiple cores, or with multiple servers?
Checkin pre-compiled bits for pieces that really are static and do not need to be rebuilt every time. 3rd party tools/libraries that are used, but not altered are a good candidate for this treatment.
Limit the build to a single 'stream' if applicable. The 'full product' might include things like a debug version, or both 32 and 64 bit versions, or may include help files or man pages that are derived/built every time. Removing components that are not necessary for development can dramatically reduce the build time.
Does the build also package the product? Is that really required for development and testing? Does the build incorporate some basic sanity tests that can be skipped?
Finally, you can re-factor the code base to be more modular and to have fewer dependencies. Large Scale C++ Software Design is an excellent reference for learning to decouple large software products into something that is easier to maintain and faster to build.
EDIT: Building on a local filesystem as opposed to a NFS mounted filesystem can also dramatically speed up build times.
Fiddle with the compiler optimisation flags,
use option -j4 for gmake for parallel compilation (multicore or single core)
if you are using clearmake , use winking
we can take out the debug flags..in extreme cases.
Use some powerful servers.
This book Large-Scale C++ Software Design has very good advice I've used in past projects.
Minimize your public API
Minimize inline functions in your API. (Unfortunately this also increases linker requirements).
Maximize forward declarations.
Reduce coupling between code. For instance pass in two integers to a function, for coordinates, instead of your custom Point class that has it's own header file.
Use Incredibuild. But it has some issues sometimes.
Do NOT put code that get exported from two different modules in the SAME header file.
Use the PImple idiom. Mentioned before, but bears repeating.
Use Pre-compiled headers.
Avoid C++/CLI (i.e. managed c++). Linker times are impacted too.
Avoid using a global header file that includes 'everything else' in your API.
Don't put a dependency on a lib file if your code doesn't really need it.
Know the difference between including files with quotes and angle brackets.
Powerful compilation machines and parallel compilers. We also make sure the full build is needed as little as possible. We don't alter the code to make it compile faster.
Efficiency and correctness is more important than compilation speed.
In Visual Studio, you can set number of project to compile at a time. Its default value is 2, increasing that would reduce some time.
This will help if you don't want to mess with the code.
This is the list of things we did for a development under Linux :
As Warrior noted, use parallel builds (make -jN)
We use distributed builds (currently icecream which is very easy to setup), with this we can have tens or processors at a given time. This also has the advantage of giving the builds to the most powerful and less loaded machines.
We use ccache so that when you do a make clean, you don't have to really recompile your sources that didn't change, it's copied from a cache.
Note also that debug builds are usually faster to compile since the compiler doesn't have to make optimisations.
We tried creating proxy classes once.
These are really a simplified version of a class that only includes the public interface, reducing the number of internal dependencies that need to be exposed in the header file. However, they came with a heavy price of spreading each class over several files that all needed to be updated when changes to the class interface were made.
In general large C++ projects that I've worked on that had slow build times were pretty messy, with lots of interdependencies scattered through the code (the same include files used in most cpps, fat interfaces instead of slim ones). In those cases, the slow build time was just a symptom of the larger problem, and a minor symptom at that. Refactoring to make clearer interfaces and break code out into libraries improved the architecture, as well as the build time. When you make a library, it forces you to think about what is an interface and what isn't, which will actually (in my experience) end up improving the code base. If there's no technical reason to have to divide the code, some programmers through the course of maintenance will just throw anything into any header file.
Cătălin Pitiș covered a lot of good things. Other ones we do:
Have a tool that generates reduced Visual Studio .sln files for people working in a specific sub-area of a very large overall project
Cache DLLs and pdbs from when they are built on CI for distribution on developer machines
For CI, make sure that the link machine in particular has lots of memory and high-end drives
Store some expensive-to-regenerate files in source control, even though they could be created as part of the build
Replace Visual Studio's checking of what needs to be relinked by our own script tailored to our circumstances
It's a pet peeve of mine, so even though you already accepted an excellent answer, I'll chime in:
In C++, it's less the language as such, but the language-mandated build model that was great back in the seventies, and the header-heavy libraries.
The only thing that is wrong about Cătălin Pitiș' reply: "buy faster machines" should go first. It is the easyest way with the least impact.
My worst was about 80 minutes on an aging build machine running VC6 on W2K Professional. The same project (with tons of new code) now takes under 6 minutes on a machine with 4 hyperthreaded cores, 8G RAM Win 7 x64 and decent disks. (A similar machine, about 10..20% less processor power, with 4G RAM and Vista x86 takes twice as long)
Strangely, incremental builds are most of the time slower than full rebuuilds now.
Full build is about 2 hours. I try to avoid making modification to the base classes and since my work is mainly on the implementation of these base classes I only need to build small components (couple of minutes).
Create some unit test projects to test individual libraries, so that if you need to edit low level classes that would cause a huge rebuild, you can use TDD to know your new code works before you rebuild the entire app. The John Lakos book as mentioned by Themis has some very practical advice for restructuring your libraries to make this possible.

GNU tool to analyze and reduce compile time for my application

I am using SUSE10 (64 bit)/AIX (5.1) and HP I64 (11.3) to compile my application. Just to give some background, my application has around 200KLOC (2Lacs) lines of code (without templates). It is purely C++ code. From measurements, I see that compile time ranges from 45 minutes(SUSE) to around 75 minutes(AIX).
Question 1 : Is this time normal (acceptable)?
Question 2 : I want to re-engineer the code arrangement and reduce the compile time. Is there any GNU tool which can help me to do this?
PS :
a. Most of the question in stackoverflow was related to Visual Studio, so I had to post a separate question.
b. I use gcc 4.1.2 version.
c. Another info (which might be useful) is code is spread across around 130 .cpp files but code distribution varies from 1KLOC to 8 KLOCK in a file.
Thanks in advance for you help!!!
Edit 1 (after comments)
#PaulR "Are you using makefiles for this ? Do you always do a full (clean) build or just build incrementally ?"
Yes we are using make files for project building.
Sometimes we are forced to do the full build (like over-night build/run or automated run or refresh complete code since many members have changed many files). So I have posted in general sense.
Excessive (or seemingly excessive) compilation times are often caused by an overly complicated include file hierarchy.
While not exactly a tool for this purpose, doxygen could be quite helpful: among other charts it can display the include file hierarchy for every source file in the project. I have found many interesting and convoluted include dependencies in my projects.
Read John Lakos's Large-Scale C++ Design for some very good methods of analysing and re-organising the structure of the project in order to minimise dependencies. Ultimately the time taken to build a large project increases as the amount of code increases, but also as the dependencies increase (or at least the impact of changes to header files increases as the dependencies increase). So minimising those dependencies is one thing to aim for. Lakos's concept of Levelization is very helpful in working out how to split several large monolothic inter-dependent libraries into something with a much better structure.
I can't address your specific questions but I use ccache to help with compile times, which caches object files and will use the same ones if source files do not change. If you are using SuSE, it should come with your distribution.
In addition to the already mentioned ccache, have a look at distcc. Throwing more hardware at such a scalable problem is cheap and simple.
Long compile times in large C++ projects are almost always caused by inappropriate use of header files. Section 9.3.2 of The C++ Programming Language provide some useful points this. Precompiling header files can considerably reduce compile time of large projects. See the GNU documentation on Precompiled Headers for more information.
Make sure that your main make targets can be executed in parallel (make -j <CPU_COUNT+1>) and of course try to use ccache. In addition we experimented with ccache and RAM disks, if you export CCACHE_DIR and point it to a RAM disk this will speed up your compilation process as well.

#include all .cpp files into a single compilation unit?

Want to improve this post? Provide detailed answers to this question, including citations and an explanation of why your answer is correct. Answers without enough detail may be edited or deleted.
I recently had cause to work with some Visual Studio C++ projects with the usual Debug and Release configurations, but also 'Release All' and 'Debug All', which I had never seen before.
It turns out the author of the projects has a single ALL.cpp which #includes all other .cpp files. The *All configurations just build this one ALL.cpp file. It is of course excluded from the regular configurations, and regular configurations don't build ALL.cpp
I just wondered if this was a common practice? What benefits does it bring? (My first reaction was that it smelled bad.)
What kinds of pitfalls are you likely to encounter with this? One I can think of is if you have anonymous namespaces in your .cpps, they're no longer 'private' to that cpp but now visible in other cpps as well?
All the projects build DLLs, so having data in anonymous namespaces wouldn't be a good idea, right? But functions would be OK?
It's referred to by some (and google-able) as a "Unity Build". It links insanely fast and compiles reasonably quickly as well. It's great for builds you don't need to iterate on, like a release build from a central server, but it isn't necessarily for incremental building.
And it's a PITA to maintain.
EDIT: here's the first google link for more info: http://buffered.io/posts/the-magic-of-unity-builds/
The thing that makes it fast is that the compiler only needs to read in everything once, compile out, then link, rather than doing that for every .cpp file.
Bruce Dawson has a much better write up about this on his blog: http://randomascii.wordpress.com/2014/03/22/make-vc-compiles-fast-through-parallel-compilation/
Unity builds improved build speeds for three main reasons. The first reason is that all of the shared header files only need to be parsed once. Many C++ projects have a lot of header files that are included by most or all CPP files and the redundant parsing of these is the main cost of compilation, especially if you have many short source files. Precompiled header files can help with this cost, but usually there are a lot of header files which are not precompiled.
The next main reason that unity builds improve build speeds is because the compiler is invoked fewer times. There is some startup cost with invoking the compiler.
Finally, the reduction in redundant header parsing means a reduction in redundant code-gen for inlined functions, so the total size of object files is smaller, which makes linking faster.
Unity builds can also give better code-gen.
Unity builds are NOT faster because of reduced disk I/O. I have profiled many builds with xperf and I know what I'm talking about. If you have sufficient memory then the OS disk cache will avoid the redundant I/O - subsequent reads of a header will come from the OS disk cache. If you don't have enough memory then unity builds could even make build times worse by causing the compiler's memory footprint to exceed available memory and get paged out.
Disk I/O is expensive, which is why all operating systems aggressively cache data in order to avoid redundant disk I/O.
I wonder if that ALL.cpp is attempting to put the entire project within a single compilation unit, to improve the ability for the compiler to optimize the program for size?
Normally some optimizations are only performed within distinct compilation units, such as removal of duplicate code and inlining.
That said, I seem to remember that recent compilers (Microsoft's, Intel's, but I don't think this includes GCC) can do this optimization across multiple compilation units, so I suspect that this 'trick' is unneccessary.
That said, it would be curious to see if there is indeed any difference.
I agree with Bruce; from my experience I had tried implementing the Unity Build for one of my .dll projects which had a ton of header includes and lots of .cpps; to bring down the overall Compilation time on the VS2010(had already exhausted the Incremental Build options) but rather than cutting down the Compilation time, I ran out of memory and the Build not even able to finish the Compilation.
However to add; I did find enabling the Multi-Processor Compilation option in Visual Studio quite helps in cutting down the Compilation time; I am not sure if such an option is available across other platform compilers.
In addition to Bruce Dawson's excellent answer the following link brings more insights onto the pros and cons of unity builds - https://github.com/onqtam/ucm#unity-builds
We have a multi platform project for MacOS and Windows. We have lots of large templates and many headers to include. We halfed build time with Visual Studio 2017 msvc using precompiled headers. For MacOS we tried several days to reduce build time but precompiled headers only reduced build time to 85%. Using a single .cpp-File was the breakthrough. We simply create this ALL.cpp with a script. Our ALL.cpp contains a list of includes of all .cpp-Files of our project. We reduce build time to 30% with XCode 9.0. The only drawback was we had to give names to all private unamed compilation unit namespaces in .cpp-Files. Otherwise the local names will clash.

What techniques can be used to speed up C++ compilation times?

What techniques can be used to speed up C++ compilation times?
This question came up in some comments to Stack Overflow question C++ programming style, and I'm interested to hear what ideas there are.
I've seen a related question, Why does C++ compilation take so long?, but that doesn't provide many solutions.
Language techniques
Pimpl Idiom
Take a look at the Pimpl idiom here, and here, also known as an opaque pointer or handle classes. Not only does it speed up compilation, it also increases exception safety when combined with a non-throwing swap function. The Pimpl idiom lets you reduce the dependencies between headers and reduces the amount of recompilation that needs to be done.
Forward Declarations
Wherever possible, use forward declarations. If the compiler only needs to know that SomeIdentifier is a struct or a pointer or whatever, don't include the entire definition, forcing the compiler to do more work than it needs to. This can have a cascading effect, making this way slower than they need to be.
The I/O streams are particularly known for slowing down builds. If you need them in a header file, try #including <iosfwd> instead of <iostream> and #include the <iostream> header in the implementation file only. The <iosfwd> header holds forward declarations only. Unfortunately the other standard headers don't have a respective declarations header.
Prefer pass-by-reference to pass-by-value in function signatures. This will eliminate the need to #include the respective type definitions in the header file and you will only need to forward-declare the type. Of course, prefer const references to non-const references to avoid obscure bugs, but this is an issue for another question.
Guard Conditions
Use guard conditions to keep header files from being included more than once in a single translation unit.
#pragma once
#ifndef filename_h
#define filename_h
// Header declarations / definitions
#endif
By using both the pragma and the ifndef, you get the portability of the plain macro solution, as well as the compilation speed optimization that some compilers can do in the presence of the pragma once directive.
Reduce interdependency
The more modular and less interdependent your code design is in general, the less often you will have to recompile everything. You can also end up reducing the amount of work the compiler has to do on any individual block at the same time, by virtue of the fact that it has less to keep track of.
Compiler options
Precompiled Headers
These are used to compile a common section of included headers once for many translation units. The compiler compiles it once, and saves its internal state. That state can then be loaded quickly to get a head start in compiling another file with that same set of headers.
Be careful that you only include rarely changed stuff in the precompiled headers, or you could end up doing full rebuilds more often than necessary. This is a good place for STL headers and other library include files.
ccache is another utility that takes advantage of caching techniques to speed things up.
Use Parallelism
Many compilers / IDEs support using multiple cores/CPUs to do compilation simultaneously. In GNU Make (usually used with GCC), use the -j [N] option. In Visual Studio, there's an option under preferences to allow it to build multiple projects in parallel. You can also use the /MP option for file-level paralellism, instead of just project-level paralellism.
Other parallel utilities:
Incredibuild
Unity Build
distcc
Use a Lower Optimization Level
The more the compiler tries to optimize, the harder it has to work.
Shared Libraries
Moving your less frequently modified code into libraries can reduce compile time. By using shared libraries (.so or .dll), you can reduce linking time as well.
Get a Faster Computer
More RAM, faster hard drives (including SSDs), and more CPUs/cores will all make a difference in compilation speed.
I work on the STAPL project which is a heavily-templated C++ library. Once in a while, we have to revisit all the techniques to reduce compilation time. In here, I have summarized the techniques we use. Some of these techniques are already listed above:
Finding the most time-consuming sections
Although there is no proven correlation between the symbol lengths and compilation time, we have observed that smaller average symbol sizes can improve compilation time on all compilers. So your first goals it to find the largest symbols in your code.
Method 1 - Sort symbols based on size
You can use the nm command to list the symbols based on their sizes:
nm --print-size --size-sort --radix=d YOUR_BINARY
In this command the --radix=d lets you see the sizes in decimal numbers (default is hex). Now by looking at the largest symbol, identify if you can break the corresponding class and try to redesign it by factoring the non-templated parts in a base class, or by splitting the class into multiple classes.
Method 2 - Sort symbols based on length
You can run the regular nm command and pipe it to your favorite script (AWK, Python, etc.) to sort the symbols based on their length. Based on our experience, this method identifies the largest trouble making candidates better than method 1.
Method 3 - Use Templight
"Templight is a Clang-based tool to profile the time and memory consumption of template instantiations and to perform interactive debugging sessions to gain introspection into the template instantiation process".
You can install Templight by checking out LLVM and Clang (instructions) and applying the Templight patch on it. The default setting for LLVM and Clang is on debug and assertions, and these can impact your compilation time significantly. It does seem like Templight needs both, so you have to use the default settings. The process of installing LLVM and Clang should take about an hour or so.
After applying the patch you can use templight++ located in the build folder you specified upon installation to compile your code.
Make sure that templight++ is in your PATH. Now to compile add the following switches to your CXXFLAGS in your Makefile or to your command line options:
CXXFLAGS+=-Xtemplight -profiler -Xtemplight -memory -Xtemplight -ignore-system
Or
templight++ -Xtemplight -profiler -Xtemplight -memory -Xtemplight -ignore-system
After compilation is done, you will have a .trace.memory.pbf and .trace.pbf generated in the same folder. To visualize these traces, you can use the Templight Tools that can convert these to other formats. Follow these instructions to install templight-convert. We usually use the callgrind output. You can also use the GraphViz output if your project is small:
$ templight-convert --format callgrind YOUR_BINARY --output YOUR_BINARY.trace
$ templight-convert --format graphviz YOUR_BINARY --output YOUR_BINARY.dot
The callgrind file generated can be opened using kcachegrind in which you can trace the most time/memory consuming instantiation.
Reducing the number of template instantiations
Although there are no exact solution for reducing the number of template instantiations, there are a few guidelines that can help:
Refactor classes with more than one template arguments
For example, if you have a class,
template <typename T, typename U>
struct foo { };
and both of T and U can have 10 different options, you have increased the possible template instantiations of this class to 100. One way to resolve this is to abstract the common part of the code to a different class. The other method is to use inheritance inversion (reversing the class hierarchy), but make sure that your design goals are not compromised before using this technique.
Refactor non-templated code to individual translation units
Using this technique, you can compile the common section once and link it with your other TUs (translation units) later on.
Use extern template instantiations (since C++11)
If you know all the possible instantiations of a class you can use this technique to compile all cases in a different translation unit.
For example, in:
enum class PossibleChoices = {Option1, Option2, Option3}
template <PossibleChoices pc>
struct foo { };
We know that this class can have three possible instantiations:
template class foo<PossibleChoices::Option1>;
template class foo<PossibleChoices::Option2>;
template class foo<PossibleChoices::Option3>;
Put the above in a translation unit and use the extern keyword in your header file, below the class definition:
extern template class foo<PossibleChoices::Option1>;
extern template class foo<PossibleChoices::Option2>;
extern template class foo<PossibleChoices::Option3>;
This technique can save you time if you are compiling different tests with a common set of instantiations.
NOTE : MPICH2 ignores the explicit instantiation at this point and always compiles the instantiated classes in all compilation units.
Use unity builds
The whole idea behind unity builds is to include all the .cc files that you use in one file and compile that file only once. Using this method, you can avoid reinstantiating common sections of different files and if your project includes a lot of common files, you probably would save on disk accesses as well.
As an example, let's assume you have three files foo1.cc, foo2.cc, foo3.cc and they all include tuple from STL. You can create a foo-all.cc that looks like:
#include "foo1.cc"
#include "foo2.cc"
#include "foo3.cc"
You compile this file only once and potentially reduce the common instantiations among the three files. It is hard to generally predict if the improvement can be significant or not. But one evident fact is that you would lose parallelism in your builds (you can no longer compile the three files at the same time).
Further, if any of these files happen to take a lot of memory, you might actually run out of memory before the compilation is over. On some compilers, such as GCC, this might ICE (Internal Compiler Error) your compiler for lack of memory. So don't use this technique unless you know all the pros and cons.
Precompiled headers
Precompiled headers (PCHs) can save you a lot of time in compilation by compiling your header files to an intermediate representation recognizable by a compiler. To generate precompiled header files, you only need to compile your header file with your regular compilation command. For example, on GCC:
$ g++ YOUR_HEADER.hpp
This will generate a YOUR_HEADER.hpp.gch file (.gch is the extension for PCH files in GCC) in the same folder. This means that if you include YOUR_HEADER.hpp in some other file, the compiler will use your YOUR_HEADER.hpp.gch instead of YOUR_HEADER.hpp in the same folder before.
There are two issues with this technique:
You have to make sure that the header files being precompiled is stable and is not going to change (you can always change your makefile)
You can only include one PCH per compilation unit (on most of compilers). This means that if you have more than one header file to be precompiled, you have to include them in one file (e.g., all-my-headers.hpp). But that means that you have to include the new file in all places. Fortunately, GCC has a solution for this problem. Use -include and give it the new header file. You can comma separate different files using this technique.
For example:
g++ foo.cc -include all-my-headers.hpp
Use unnamed or anonymous namespaces
Unnamed namespaces (a.k.a. anonymous namespaces) can reduce the generated binary sizes significantly. Unnamed namespaces use internal linkage, meaning that the symbols generated in those namespaces will not be visible to other TU (translation or compilation units). Compilers usually generate unique names for unnamed namespaces. This means that if you have a file foo.hpp:
namespace {
template <typename T>
struct foo { };
} // Anonymous namespace
using A = foo<int>;
And you happen to include this file in two TUs (two .cc files and compile them separately). The two foo template instances will not be the same. This violates the One Definition Rule (ODR). For the same reason, using unnamed namespaces is discouraged in the header files. Feel free to use them in your .cc files to avoid symbols showing up in your binary files. In some cases, changing all the internal details for a .cc file showed a 10% reduction in the generated binary sizes.
Changing visibility options
In newer compilers you can select your symbols to be either visible or invisible in the Dynamic Shared Objects (DSOs). Ideally, changing the visibility can improve compiler performance, link time optimizations (LTOs), and generated binary sizes. If you look at the STL header files in GCC you can see that it is widely used. To enable visibility choices, you need to change your code per function, per class, per variable and more importantly per compiler.
With the help of visibility you can hide the symbols that you consider them private from the generated shared objects. On GCC you can control the visibility of symbols by passing default or hidden to the -visibility option of your compiler. This is in some sense similar to the unnamed namespace but in a more elaborate and intrusive way.
If you would like to specify the visibilities per case, you have to add the following attributes to your functions, variables, and classes:
__attribute__((visibility("default"))) void foo1() { }
__attribute__((visibility("hidden"))) void foo2() { }
__attribute__((visibility("hidden"))) class foo3 { };
void foo4() { }
The default visibility in GCC is default (public), meaning that if you compile the above as a shared library (-shared) method, foo2 and class foo3 will not be visible in other TUs (foo1 and foo4 will be visible). If you compile with -visibility=hidden then only foo1 will be visible. Even foo4 would be hidden.
You can read more about visibility on GCC wiki.
I'd recommend these articles from "Games from Within, Indie Game Design And Programming":
Physical Structure and C++ – Part 1: A First Look
Physical Structure and C++ – Part 2: Build Times
Even More Experiments with Includes
How Incredible Is Incredibuild?
The Care and Feeding of Pre-Compiled Headers
The Quest for the Perfect Build System
The Quest for the Perfect Build System (Part 2)
Granted, they are pretty old - you'll have to re-test everything with the latest versions (or versions available to you), to get realistic results. Either way, it is a good source for ideas.
One technique which worked quite well for me in the past: don't compile multiple C++ source files independently, but rather generate one C++ file which includes all the other files, like this:
// myproject_all.cpp
// Automatically generated file - don't edit this by hand!
#include "main.cpp"
#include "mainwindow.cpp"
#include "filterdialog.cpp"
#include "database.cpp"
Of course this means you have to recompile all of the included source code in case any of the sources changes, so the dependency tree gets worse. However, compiling multiple source files as one translation unit is faster (at least in my experiments with MSVC and GCC) and generates smaller binaries. I also suspect that the compiler is given more potential for optimizations (since it can see more code at once).
This technique breaks in various cases; for instance, the compiler will bail out in case two or more source files declare a global function with the same name. I couldn't find this technique described in any of the other answers though, that's why I'm mentioning it here.
For what it's worth, the KDE Project used this exact same technique since 1999 to build optimized binaries (possibly for a release). The switch to the build configure script was called --enable-final. Out of archaeological interest I dug up the posting which announced this feature: http://lists.kde.org/?l=kde-devel&m=92722836009368&w=2
I will just link to my other answer: How do YOU reduce compile time, and linking time for Visual C++ projects (native C++)?. Another point I want to add, but which causes often problems is to use precompiled headers. But please, only use them for parts which hardly ever change (like GUI toolkit headers). Otherwise, they will cost you more time than they save you in the end.
Another option is, when you work with GNU make, to turn on -j<N> option:
-j [N], --jobs[=N] Allow N jobs at once; infinite jobs with no arg.
I usually have it at 3 since I've got a dual core here. It will then run compilers in parallel for different translation units, provided there are no dependencies between them. Linking cannot be done in parallel, since there is only one linker process linking together all object files.
But the linker itself can be threaded, and this is what the GNU gold ELF linker does. It's optimized threaded C++ code which is said to link ELF object files a magnitude faster than the old ld (and was actually included into binutils).
There's an entire book on this topic, which is titled Large-Scale C++ Software Design (written by John Lakos).
The book pre-dates templates, so to the contents of that book add "using templates, too, can make the compiler slower".
Once you have applied all the code tricks above (forward declarations, reducing header inclusion to the minimum in public headers, pushing most details inside the implementation file with Pimpl...) and nothing else can be gained language-wise, consider your build system. If you use Linux, consider using distcc (distributed compiler) and ccache (cache compiler).
The first one, distcc, executes the preprocessor step locally and then sends the output to the first available compiler in the network. It requires the same compiler and library versions in all the configured nodes in the network.
The latter, ccache, is a compiler cache. It again executes the preprocessor and then check with an internal database (held in a local directory) if that preprocessor file has already been compiled with the same compiler parameters. If it does, it just pops up the binary and output from the first run of the compiler.
Both can be used at the same time, so that if ccache does not have a local copy it can send it trough the net to another node with distcc, or else it can just inject the solution without further processing.
Here are some:
Use all processor cores by starting a multiple-compile job (make -j2 is a good example).
Turn off or lower optimizations (for example, GCC is much faster with -O1 than -O2 or -O3).
Use precompiled headers.
When I came out of college, the first real production-worthy C++ code I saw had these arcane #ifndef ... #endif directives in between them where the headers were defined. I asked the guy who was writing the code about these overarching things in a very naive fashion and was introduced to world of large-scale programming.
Coming back to the point, using directives to prevent duplicate header definitions was the first thing I learned when it came to reducing compiling times.
More RAM.
Someone talked about RAM drives in another answer. I did this with a 80286 and Turbo C++ (shows age) and the results were phenomenal. As was the loss of data when the machine crashed.
You could use Unity Builds.
​​
Use
#pragma once
at the top of header files, so if they're included more than once in a translation unit, the text of the header will only get included and parsed once.
Use forward declarations where you can. If a class declaration only uses a pointer or reference to a type, you can just forward declare it and include the header for the type in the implementation file.
For example:
// T.h
class Class2; // Forward declaration
class T {
public:
void doSomething(Class2 &c2);
private:
Class2 *m_Class2Ptr;
};
// T.cpp
#include "Class2.h"
void Class2::doSomething(Class2 &c2) {
// Whatever you want here
}
Fewer includes means far less work for the preprocessor if you do it enough.
Just for completeness: a build might be slow because the build system is being stupid as well as because the compiler is taking a long time to do its work.
Read Recursive Make Considered Harmful (PDF) for a discussion of this topic in Unix environments.
Not about the compilation time, but about the build time:
Use ccache if you have to rebuild the same files when you are working
on your buildfiles
Use ninja-build instead of make. I am currently compiling a project
with ~100 source files and everything is cached by ccache. make needs
5 minutes, ninja less than 1.
You can generate your ninja files from cmake with -GNinja.
Upgrade your computer
Get a quad core (or a dual-quad system)
Get LOTS of RAM.
Use a RAM drive to drastically reduce file I/O delays. (There are companies that make IDE and SATA RAM drives that act like hard drives).
Then you have all your other typical suggestions
Use precompiled headers if available.
Reduce the amount of coupling between parts of your project. Changing one header file usually shouldn't require recompiling your entire project.
I had an idea about using a RAM drive. It turned out that for my projects it doesn't make that much of a difference after all. But then they are pretty small still. Try it! I'd be interested in hearing how much it helped.
Dynamic linking (.so) can be much much faster than static linking (.a). Especially when you have a slow network drive. This is since you have all of the code in the .a file which needs to be processed and written out. In addition, a much larger executable file needs to be written out to the disk.
Where are you spending your time? Are you CPU bound? Memory bound? Disk bound? Can you use more cores? More RAM? Do you need RAID? Do you simply want to improve the efficiency of your current system?
Under gcc/g++, have you looked at ccache? It can be helpful if you are doing make clean; make a lot.
Starting with Visual Studio 2017 you have the capability to have some compiler metrics about what takes time.
Add those parameters to C/C++ -> Command line (Additional Options) in the project properties window:
/Bt+ /d2cgsummary /d1reportTime
You can have more informations in this post.
Faster hard disks.
Compilers write many (and possibly huge) files to disk. Work with SSD instead of typical hard disk and compilation times are much lower.
On Linux (and maybe some other *NIXes), you can really speed the compilation by NOT STARING at the output and changing to another TTY.
Here is the experiment: printf slows down my program
Networks shares will drastically slow down your build, as the seek latency is high. For something like Boost, it made a huge difference for me, even though our network share drive is pretty fast. Time to compile a toy Boost program went from about 1 minute to 1 second when I switched from a network share to a local SSD.
If you have a multicore processor, both Visual Studio (2005 and later) as well as GCC support multi-processor compiles. It is something to enable if you have the hardware, for sure.
First of all, we have to understand what so different about C++ that sets it apart from other languages.
Some people say it's that C++ has many too features. But hey, there are languages that have a lot more features and they are nowhere near that slow.
Some people say it's the size of a file that matters. Nope, source lines of code don't correlate with compile times.
But wait, how can it be? More lines of code should mean longer compile times, what's the sorcery?
The trick is that a lot of lines of code is hidden in preprocessor directives. Yes. Just one #include can ruin your module's compilation performance.
You see, C++ doesn't have a module system. All *.cpp files are compiled from scratch. So having 1000 *.cpp files means compiling your project a thousand times. You have more than that? Too bad.
That's why C++ developers hesitate to split classes into multiple files. All those headers are tedious to maintain.
So what can we do other than using precompiled headers, merging all the cpp files into one, and keeping the number of headers minimal?
C++20 brings us preliminary support of modules! Eventually, you'll be able to forget about #include and the horrible compile performance that header files bring with them. Touched one file? Recompile only that file! Need to compile a fresh checkout? Compile in seconds rather than minutes and hours.
The C++ community should move to C++20 as soon as possible. C++ compiler developers should put more focus on this, C++ developers should start testing preliminary support in various compilers and use those compilers that support modules. This is the most important moment in C++ history!
Although not a "technique", I couldn't figure out how Win32 projects with many source files compiled faster than my "Hello World" empty project. Thus, I hope this helps someone like it did me.
In Visual Studio, one option to increase compile times is Incremental Linking (/INCREMENTAL). It's incompatible with Link-time Code Generation (/LTCG) so remember to disable incremental linking when doing release builds.
Using dynamic linking instead of static one make you compiler faster that can feel.
If you use t Cmake, active the property:
set(BUILD_SHARED_LIBS ON)
Build Release, using static linking can get more optimize.
From Microsoft: https://devblogs.microsoft.com/cppblog/recommendations-to-speed-c-builds-in-visual-studio/
Specific recommendations include:
DO USE PCH for projects
DO include commonly used system, runtime and third party headers in
PCH
DO include rarely changing project specific headers in PCH
DO NOT include headers that change frequently
DO audit PCH regularly to keep it up to date with product churn
DO USE /MP
DO Remove /Gm in favor of /MP
DO resolve conflict with #import and use /MP
DO USE linker switch /incremental
DO USE linker switch /debug:fastlink
DO consider using a third party build accelerator

How do YOU reduce compile time, and linking time for Visual C++ projects (native C++)?

How do YOU reduce compile time, and linking time for VC++ projects (native C++)?
Please specify if each suggestion applies to debug, release, or both.
It may sound obvious to you, but we try to use forward declarations as much as possible, even if it requires to write out long namespace names the type(s) is/are in:
// Forward declaration stuff
namespace plotter { namespace logic { class Plotter; } }
// Real stuff
namespace plotter {
namespace samples {
class Window {
logic::Plotter * mPlotter;
// ...
};
}
}
It greatly reduces the time for compiling also on others compilers. Indeed it applies to all configurations :)
Use the Handle/Body pattern (also sometimes known as "pimpl", "adapter", "decorator", "bridge" or "wrapper"). By isolating the implementation of your classes into your .cpp files, they need only be compiled once. Most changes do not require changes to the header file so it means you can make fairly extensive changes while only requiring one file to be recompiled. This also encourages refactoring and writing of comments and unit tests since compile time is decreased. Additionally, you automatically separate the concerns of interface and implementation so the interface of your code is simplified.
If you have large complex headers that must be included by most of the .cpp files in your build process, and which are not changed very often, you can precompile them. In a Visual C++ project with a typical configuration, this is simply a matter of including them in stdafx.h. This feature has its detractors, but libraries that make full use of templates tend to have a lot of stuff in headers, and precompiled headers are the simplest way to speed up builds in that case.
These solutions apply to both debug and release, and are focused on a codebase that is already large and cumbersome.
Forward declarations are a common solution.
Distributed building, such as with Incredibuild is a win.
Pushing code from headers down into source files can work. Small classes, constants, enums and so on might start off in a header file simply because it could have been used in multiple compilation units, but in reality they are only used in one, and could be moved to the cpp file.
A solution I haven't read about but have used is to split large headers. If you have a handful of very large headers, take a look at them. They may contain related information, and may also depend on a lot of other headers. Take the elements that have no dependencies on other files...simple structs, constants, enums and forward declarations and move them from the_world.h to the_world_defs.h. You may now find that a lot of your source files can now include only the_world_defs.h and avoid including all that overhead.
Visual Studio also has a "Show Includes" option that can give you a sense of which source files include many headers and which header files are most frequently included.
For very common includes, consider putting them in a pre-compiled header.
I use Unity Builds (Screencast located here).
The compile speed question is interesting enough that Stroustrup has it in his FAQ.
We use Xoreax's Incredibuild to run compilation in parallel across multiple machines.
Also an interesting article from Ned Batchelder: http://nedbatchelder.com/blog/200401/speeding_c_links.html (about C++ on Windows).
Our development machines are all quad-core and we use Visual Studio 2008 supports parallel compiling. I am uncertain as to whether all editions of VS can do this.
We have a solution file with approximately 168 individual projects, and compile this way takes about 25 minutes on our quad-core machines, compared to about 90 minutes on the single core laptops we give to summer students. Not exactly comparable machines but you get the idea :)
With Visual C++, there is a method, some refer to as Unity, that improves link time significantly by reducing the number of object modules.
This involves concatenating the C++ code, usually in groups by library. This of course makes editing the code much more difficult, and you will run into namespace collisions unless you use them well. It keeps you from using "using namespace foo";
Several teams at our company have elaborate systems to take the normal C++ files and concatenate them at compile time as a build step. The reduction in link times can be enormous.
Another useful technique is blobbing. I think it is something similar to what was described by Matt Shaw.
Simply put, you just create one cpp file in which you include other cpp files. You may have two different project configurations, one ordinary and one blob. Of course, blobbing puts some constrains on your code, e.g. class names in unnamed namespaces may clash.
One technique to avoid recompiling the whole code in a blob (as David Rodríguez mentioned) when you change one cpp file - is to have your "working" blob which is created from files modified recently and other ordinary blobs.
We use blobbing at work most of the time, and it reduces project build time, especially link time.
Compile Time:
If you have IncrediBuild, compile time won't be a problem. If you don't have a IncrediBuild, try the "unity build" method. It combine multiple cpp files to a single cpp file so the whole compile time is reduced.
Link Time:
The "unity build" method also contribute to reduce the link time but not much. How ever, you can check if the "Whole global optimization" and "LTCG" are enabled, while these flags make the program fast, they DO make the link SLOW.
Try turning off the "Whole Global Optimization" and set LTCG to "Default" the link time might be reduced by 5/6. (LTCG stands for Link Time Code Generation)