Suppose a have a fairly complex class I'm working on. Half the methods are done and tested, but I'm still devolping the other half. If I put the finished code in one cpp and the rest in another, will Visual Studio (or any other IDE for that matter) compile faster when I only change code that's in the "work-in-progress" cpp?
Thanks!
Yes, I believe Visual Studio compiles incrementally, so as long as you hit Build and not Rebuild All you should get faster compile times by splitting out.
However, you should really be splitting out because of code-factoring reasons i.e. each class should have a single purpose etc. etc... I'm sure you know.
It really depends. For a very large project, link time can often be considerably more expensive than the time to compile a single file. In our codebase at work (a game based on the Unreal Engine) we actually found that making "bulk.cpp" files that include many other files (effectively fewer translation units) decreases the turn around time significantly.
Even though individual compile time for a small change was increased, overall compile time (full rebuild) and link time (which happens even for a small change) both decreased dramatically.
As long as the header file doesn't change (assuming both .cpp's include the same header) then only the changed .cpp files will be compiled.
This is true for most IDEs at the very least. I haven't had experience in directly invoking compilers like gcc so I can't comment on that.
The answer is probably yes because you'll probably be performing incremental builds (compiling only the .cpp that changes) and pre-compiled headers. If you are not using either of these features, then you'll get slower builds. I'm pretty sure that a default Visual Studio C++ projects uses both incremental builds and pre-compiled headers.
Yes it will be faster.
But more importantly: Don't worry about this, if your class is so large that it's taking a lot of time on a modern processor it's god's way of saying your class needs to get refactored into smaller pieces.
Related
I once worked on a C++ project that took about an hour and a half for a full rebuild. Small edit, build, test cycles took about 5 to 10 minutes. It was an unproductive nightmare.
What is the worst build times you ever had to handle?
What strategies have you used to improve build times on large projects?
Update:
How much do you think the language used is to blame for the problem? I think C++ is prone to massive dependencies on large projects, which often means even simple changes to the source code can result in a massive rebuild. Which language do you think copes with large project dependency issues best?
Forward declaration
pimpl idiom
Precompiled headers
Parallel compilation (e.g. MPCL add-in for Visual Studio).
Distributed compilation (e.g. Incredibuild for Visual Studio).
Incremental build
Split build in several "projects" so not compile all the code if not needed.
[Later Edit]
8. Buy faster machines.
My strategy is pretty simple - I don't do large projects. The whole thrust of modern computing is away from the giant and monolithic and towards the small and componentised. So when I work on projects, I break things up into libraries and other components that can be built and tested independantly, and which have minimal dependancies on each other. A "full build" in this kind of environment never actually takes place, so there is no problem.
One trick that sometimes helps is to include everything into one .cpp file. Since includes are processed once per file, this can save you a lot of time. (The downside to this is that it makes it impossible for the compiler to parallelize compilation)
You should be able to specify that multiple .cpp files should be compiled in parallel (-j with make on linux, /MP on MSVC - MSVC also has an option to compile multiple projects in parallel. These are separate options, and there's no reason why you shouldn't use both)
In the same vein, distributed builds (Incredibuild, for example), may help take the load off a single system.
SSD disks are supposed to be a big win, although I haven't tested this myself (but a C++ build touches a huge number of files, which can quickly become a bottleneck).
Precompiled headers can help too, when used with care. (They can also hurt you, if they have to be recompiled too often).
And finally, trying to minimize dependencies in the code itself is important. Use the pImpl idiom, use forward declarations, keep the code as modular as possible. In some cases, use of templates may help you decouple classes and minimize dependencies. (In other cases, templates can slow down compilation significantly, of course)
But yes, you're right, this is very much a language thing. I don't know of another language which suffers from the problem to this extent. Most languages have a module system that allows them to eliminate header files, which area huge factor. C has header files, but is such a simple language that compile times are still manageable. C++ gets the worst of both worlds. A big complex language, and a terrible primitive build mechanism that requires a huge amount of code to be parsed again and again.
Multi core compilation. Very fast with 8 cores compiling on the I7.
Incremental linking
External constants
Removed inline methods on C++ classes.
The last two gave us a reduced linking time from around 12 minutes to 1-2 minutes. Note that this is only needed if things have a huge visibility, i.e. seen "everywhere" and if there are many different constants and classes.
Cheers
IncrediBuild
Unity Builds
Incredibuild
Pointer to implementation
forward declarations
compiling "finished" sections of the proejct into dll's
ccache & distcc (for C/C++ projects) -
ccache caches compiled output, using the pre-processed file as the 'key' for finding the output. This is great because pre-processing is pretty quick, and quite often changes that force recompile don't actually change the source for many files. Also, it really speeds up a full re-compile. Also nice is the instance where you can have a shared cache among team members. This means that only the first guy to grab the latest code actually compiles anything.
distcc does distributed compilation across a network of machines. This is only good if you HAVE a network of machines to use for compilation. It goes well with ccache, and only moves the pre-processed source around, so the only thing you have to worry about on the compiler engine systems is that they have the right compiler (no need for headers or your entire source tree to be visible).
The best suggestion is to build makefiles that actually understand dependencies and do not automatically rebuild the world for a small change. But, if a full rebuild takes 90 minutes, and a small rebuild takes 5-10 minutes, odds are good that your build system already does that.
Can the build be done in parallel? Either with multiple cores, or with multiple servers?
Checkin pre-compiled bits for pieces that really are static and do not need to be rebuilt every time. 3rd party tools/libraries that are used, but not altered are a good candidate for this treatment.
Limit the build to a single 'stream' if applicable. The 'full product' might include things like a debug version, or both 32 and 64 bit versions, or may include help files or man pages that are derived/built every time. Removing components that are not necessary for development can dramatically reduce the build time.
Does the build also package the product? Is that really required for development and testing? Does the build incorporate some basic sanity tests that can be skipped?
Finally, you can re-factor the code base to be more modular and to have fewer dependencies. Large Scale C++ Software Design is an excellent reference for learning to decouple large software products into something that is easier to maintain and faster to build.
EDIT: Building on a local filesystem as opposed to a NFS mounted filesystem can also dramatically speed up build times.
Fiddle with the compiler optimisation flags,
use option -j4 for gmake for parallel compilation (multicore or single core)
if you are using clearmake , use winking
we can take out the debug flags..in extreme cases.
Use some powerful servers.
This book Large-Scale C++ Software Design has very good advice I've used in past projects.
Minimize your public API
Minimize inline functions in your API. (Unfortunately this also increases linker requirements).
Maximize forward declarations.
Reduce coupling between code. For instance pass in two integers to a function, for coordinates, instead of your custom Point class that has it's own header file.
Use Incredibuild. But it has some issues sometimes.
Do NOT put code that get exported from two different modules in the SAME header file.
Use the PImple idiom. Mentioned before, but bears repeating.
Use Pre-compiled headers.
Avoid C++/CLI (i.e. managed c++). Linker times are impacted too.
Avoid using a global header file that includes 'everything else' in your API.
Don't put a dependency on a lib file if your code doesn't really need it.
Know the difference between including files with quotes and angle brackets.
Powerful compilation machines and parallel compilers. We also make sure the full build is needed as little as possible. We don't alter the code to make it compile faster.
Efficiency and correctness is more important than compilation speed.
In Visual Studio, you can set number of project to compile at a time. Its default value is 2, increasing that would reduce some time.
This will help if you don't want to mess with the code.
This is the list of things we did for a development under Linux :
As Warrior noted, use parallel builds (make -jN)
We use distributed builds (currently icecream which is very easy to setup), with this we can have tens or processors at a given time. This also has the advantage of giving the builds to the most powerful and less loaded machines.
We use ccache so that when you do a make clean, you don't have to really recompile your sources that didn't change, it's copied from a cache.
Note also that debug builds are usually faster to compile since the compiler doesn't have to make optimisations.
We tried creating proxy classes once.
These are really a simplified version of a class that only includes the public interface, reducing the number of internal dependencies that need to be exposed in the header file. However, they came with a heavy price of spreading each class over several files that all needed to be updated when changes to the class interface were made.
In general large C++ projects that I've worked on that had slow build times were pretty messy, with lots of interdependencies scattered through the code (the same include files used in most cpps, fat interfaces instead of slim ones). In those cases, the slow build time was just a symptom of the larger problem, and a minor symptom at that. Refactoring to make clearer interfaces and break code out into libraries improved the architecture, as well as the build time. When you make a library, it forces you to think about what is an interface and what isn't, which will actually (in my experience) end up improving the code base. If there's no technical reason to have to divide the code, some programmers through the course of maintenance will just throw anything into any header file.
Cătălin Pitiș covered a lot of good things. Other ones we do:
Have a tool that generates reduced Visual Studio .sln files for people working in a specific sub-area of a very large overall project
Cache DLLs and pdbs from when they are built on CI for distribution on developer machines
For CI, make sure that the link machine in particular has lots of memory and high-end drives
Store some expensive-to-regenerate files in source control, even though they could be created as part of the build
Replace Visual Studio's checking of what needs to be relinked by our own script tailored to our circumstances
It's a pet peeve of mine, so even though you already accepted an excellent answer, I'll chime in:
In C++, it's less the language as such, but the language-mandated build model that was great back in the seventies, and the header-heavy libraries.
The only thing that is wrong about Cătălin Pitiș' reply: "buy faster machines" should go first. It is the easyest way with the least impact.
My worst was about 80 minutes on an aging build machine running VC6 on W2K Professional. The same project (with tons of new code) now takes under 6 minutes on a machine with 4 hyperthreaded cores, 8G RAM Win 7 x64 and decent disks. (A similar machine, about 10..20% less processor power, with 4G RAM and Vista x86 takes twice as long)
Strangely, incremental builds are most of the time slower than full rebuuilds now.
Full build is about 2 hours. I try to avoid making modification to the base classes and since my work is mainly on the implementation of these base classes I only need to build small components (couple of minutes).
Create some unit test projects to test individual libraries, so that if you need to edit low level classes that would cause a huge rebuild, you can use TDD to know your new code works before you rebuild the entire app. The John Lakos book as mentioned by Themis has some very practical advice for restructuring your libraries to make this possible.
I'm working on a large solution that has thousands of source files, some of them can have over a thousand includes due to the use of Boost and dependency problems. While compiling in parallel on a 12 core Xeon E5-2690 v2 machine (windows 7) it takes up to 4 hours to rebuild the solution using Waf 1.7.13. What can I do to speed things up?
A few things that come to mind:
Use forward declarations and PIMPL.
Review your code base to see if you have created unnecessary templates when normal classes or functions would have been sufficient.
Try to implement your own templates in terms of non-generic implementations operating on void* where applicable, the templates serving merely as type-safe wrappers (see "Item 42: Use private inheritance judiciously" in "More Effective C++" for a nice example).
Check your compiler's documentation for precompiled headers.
Refactor your application's entire architecture so that there are more internal libraries and a smaller application layer, the goal being that in the long run the libraries become so stable that you don't have to rebuild them all the time.
Experiment with optimisation flags. See if you can turn down optimisation in several specific compilation units where optimisation doesn't make a measurable difference in execution speed or binary size yet signficantly increases compilation times.
If refactoring to remove dependencies is not an option, and the build architecture is so unreliable that you have to clean and rebuild all from time to time (in a perfect world this should never happen), you can:
Improve your machine: use an SSD hard disk and/or lot of RAM. If the project has thousand of source files, you will need to hit the disk a lot of times, with random but cacheable accesses. For 12 core machine with hyper threading I would suggest no less than 24gb of ram, probably better going toward 32. You should measure what is your source code size and how much ram is used by the concurrent 24 compilers you can run in parallel.
Use a compiler cache. I don't know how to setup such environment from windows, and with your build system, so I can't suggest much here.
The culprit is likely the structure of your project itself: the expected time for rebuilding is roughly proportional to the number of source files times the number of headers they include. With strong template based programming, that effort grows with the square of source files.
So, you have exactly two contrary options to get your total compilation time down:
You reduce the number of headers included by each source file. That means you have to avoid any templates that use templates. Try to reduce inter-template dependencies as much as you can.
Program only in headers and only use a single .cpp file which instanciates all the different templates in your application. This avoids any recompilation of a header.
A possible variant of this is to create a precompiled header from all header files in your project, so that precompiling the header takes the bulk of the building time.
Of course, option 2 means that there is no such thing as an incremental build, you will need to recompile the entire project everytime you compile. This can make your project entirely unmaintainable. So, I would strongly suggest to go for option 1. But that requires a different programming style (one that uses templates very sparingly) than your project seems to be written in.
In either case, there is no magic bullet: It is not likely possible to make a significant change to the compilation time without restructuring the entire project.
Some suggestions you could try relatively easily:
Use distributed compilation using IncrediBuild or such
Use "unity" builds, i.e. including several cpp files in a single cpp file and compiling only the unity cpp's (you can have a simple tool building the unity cpp's with given number of cpp files in each). Even though I'm not big fan of this technique due to its shortcomings I have seen it employed quite often in projects with bad physical structure to optimize compile & link times.
Use "#pragma once" in headers
Use precompiled headers
Optimize redundant #include's in header files. I don't know if there's a tool for it, but you could easily build a tool which comments out #includes in headers and tries to compile them to see which ones are not needed.
Profile #include file compilation speeds and focus on addressing the worst performers and the ones that are deepest in the #include chain. I.e. use forward declarations instead of #include's, etc.
Buy more/better hardware
How do I find which parts of code are taking a long time to compile?
I am already using precompiled headers for all of my headers, and they definitely improve the compilation speed. Nevertheless, whenever I make a change to my C++ source file, compiling it takes a long time (this is CPU/memory-bound, not I/O-bound -- it's all cached). Furthermore, this is not related to the linking portion, just the compilation portion.
I've tried turning on /showIncludes, but of course, since I'm using precompiled headers, nothing is getting included after stdafx.h. So I know it's only the source code that takes a while to compile, but I don't know what part of it.
I've also tried doing a minimal build, but it doesn't help. Neither does /MP, because it's a single source file anyway.
I could try dissecting the source code and figuring out which part is a bottleneck by adding/removing it, but that's a pain and doesn't scale. Furthermore, it's hard to remove something and still let the code compile -- error messages, if any, come back almost immediately.
Is there a better way to figure out what's slowing down the compilation?
Or, if there isn't a way: are there any language constructs (e.g. templates?) that take a lot longer to compile?
What I have in my C++ source code:
Three (relatively large) ATL dialog classes (including the definitions/logic).
They could very well be the cause, but they are the core part of the program anyway, so obviously they need to be recompiled whenever I change them.
Random one-line (or similarly small) utility functions, e.g. a byte-array-to-hex converter
References to (inline) classes found inside my header files. (One of the header files is gigantic, but it uses templates only minimally, and of course it's precompiled. The other one is the TR1 regex -- it's huge, but it's barely used.)
Note:
I'm looking for techniques that I can apply more generally in figuring out the cause of these issues, not specific recommendations for my very particular situation. Hopefully that would be more useful to other people as well.
Two general ways to improve the compilation time :
instead of including headers in headers, use forward declare (include headers only in the source files)
minimize templated code (if you can avoid using templates)
Only these two rules will greatly improve your build time.
You can find more tricks in "Large-Scale C++ Software Design" by Lakos.
For visual studio (I am not sure if it is too old), take a look into this : How should I detect unnecessary #include files in a large C++ project?
Template code generally takes longer to compile.
You could investigate using "compiler firewalls", which reduce the frequency of a .cpp file having to build (they can reduce time to read included files as well because of the forward declarations).
You can also shift time spent doing code generation from the compiler to the linker by using Link-Time Code Generation and/or Whole Program Optimization, though generally you lose time in the long run.
Want to improve this post? Provide detailed answers to this question, including citations and an explanation of why your answer is correct. Answers without enough detail may be edited or deleted.
I recently had cause to work with some Visual Studio C++ projects with the usual Debug and Release configurations, but also 'Release All' and 'Debug All', which I had never seen before.
It turns out the author of the projects has a single ALL.cpp which #includes all other .cpp files. The *All configurations just build this one ALL.cpp file. It is of course excluded from the regular configurations, and regular configurations don't build ALL.cpp
I just wondered if this was a common practice? What benefits does it bring? (My first reaction was that it smelled bad.)
What kinds of pitfalls are you likely to encounter with this? One I can think of is if you have anonymous namespaces in your .cpps, they're no longer 'private' to that cpp but now visible in other cpps as well?
All the projects build DLLs, so having data in anonymous namespaces wouldn't be a good idea, right? But functions would be OK?
It's referred to by some (and google-able) as a "Unity Build". It links insanely fast and compiles reasonably quickly as well. It's great for builds you don't need to iterate on, like a release build from a central server, but it isn't necessarily for incremental building.
And it's a PITA to maintain.
EDIT: here's the first google link for more info: http://buffered.io/posts/the-magic-of-unity-builds/
The thing that makes it fast is that the compiler only needs to read in everything once, compile out, then link, rather than doing that for every .cpp file.
Bruce Dawson has a much better write up about this on his blog: http://randomascii.wordpress.com/2014/03/22/make-vc-compiles-fast-through-parallel-compilation/
Unity builds improved build speeds for three main reasons. The first reason is that all of the shared header files only need to be parsed once. Many C++ projects have a lot of header files that are included by most or all CPP files and the redundant parsing of these is the main cost of compilation, especially if you have many short source files. Precompiled header files can help with this cost, but usually there are a lot of header files which are not precompiled.
The next main reason that unity builds improve build speeds is because the compiler is invoked fewer times. There is some startup cost with invoking the compiler.
Finally, the reduction in redundant header parsing means a reduction in redundant code-gen for inlined functions, so the total size of object files is smaller, which makes linking faster.
Unity builds can also give better code-gen.
Unity builds are NOT faster because of reduced disk I/O. I have profiled many builds with xperf and I know what I'm talking about. If you have sufficient memory then the OS disk cache will avoid the redundant I/O - subsequent reads of a header will come from the OS disk cache. If you don't have enough memory then unity builds could even make build times worse by causing the compiler's memory footprint to exceed available memory and get paged out.
Disk I/O is expensive, which is why all operating systems aggressively cache data in order to avoid redundant disk I/O.
I wonder if that ALL.cpp is attempting to put the entire project within a single compilation unit, to improve the ability for the compiler to optimize the program for size?
Normally some optimizations are only performed within distinct compilation units, such as removal of duplicate code and inlining.
That said, I seem to remember that recent compilers (Microsoft's, Intel's, but I don't think this includes GCC) can do this optimization across multiple compilation units, so I suspect that this 'trick' is unneccessary.
That said, it would be curious to see if there is indeed any difference.
I agree with Bruce; from my experience I had tried implementing the Unity Build for one of my .dll projects which had a ton of header includes and lots of .cpps; to bring down the overall Compilation time on the VS2010(had already exhausted the Incremental Build options) but rather than cutting down the Compilation time, I ran out of memory and the Build not even able to finish the Compilation.
However to add; I did find enabling the Multi-Processor Compilation option in Visual Studio quite helps in cutting down the Compilation time; I am not sure if such an option is available across other platform compilers.
In addition to Bruce Dawson's excellent answer the following link brings more insights onto the pros and cons of unity builds - https://github.com/onqtam/ucm#unity-builds
We have a multi platform project for MacOS and Windows. We have lots of large templates and many headers to include. We halfed build time with Visual Studio 2017 msvc using precompiled headers. For MacOS we tried several days to reduce build time but precompiled headers only reduced build time to 85%. Using a single .cpp-File was the breakthrough. We simply create this ALL.cpp with a script. Our ALL.cpp contains a list of includes of all .cpp-Files of our project. We reduce build time to 30% with XCode 9.0. The only drawback was we had to give names to all private unamed compilation unit namespaces in .cpp-Files. Otherwise the local names will clash.
How do YOU reduce compile time, and linking time for VC++ projects (native C++)?
Please specify if each suggestion applies to debug, release, or both.
It may sound obvious to you, but we try to use forward declarations as much as possible, even if it requires to write out long namespace names the type(s) is/are in:
// Forward declaration stuff
namespace plotter { namespace logic { class Plotter; } }
// Real stuff
namespace plotter {
namespace samples {
class Window {
logic::Plotter * mPlotter;
// ...
};
}
}
It greatly reduces the time for compiling also on others compilers. Indeed it applies to all configurations :)
Use the Handle/Body pattern (also sometimes known as "pimpl", "adapter", "decorator", "bridge" or "wrapper"). By isolating the implementation of your classes into your .cpp files, they need only be compiled once. Most changes do not require changes to the header file so it means you can make fairly extensive changes while only requiring one file to be recompiled. This also encourages refactoring and writing of comments and unit tests since compile time is decreased. Additionally, you automatically separate the concerns of interface and implementation so the interface of your code is simplified.
If you have large complex headers that must be included by most of the .cpp files in your build process, and which are not changed very often, you can precompile them. In a Visual C++ project with a typical configuration, this is simply a matter of including them in stdafx.h. This feature has its detractors, but libraries that make full use of templates tend to have a lot of stuff in headers, and precompiled headers are the simplest way to speed up builds in that case.
These solutions apply to both debug and release, and are focused on a codebase that is already large and cumbersome.
Forward declarations are a common solution.
Distributed building, such as with Incredibuild is a win.
Pushing code from headers down into source files can work. Small classes, constants, enums and so on might start off in a header file simply because it could have been used in multiple compilation units, but in reality they are only used in one, and could be moved to the cpp file.
A solution I haven't read about but have used is to split large headers. If you have a handful of very large headers, take a look at them. They may contain related information, and may also depend on a lot of other headers. Take the elements that have no dependencies on other files...simple structs, constants, enums and forward declarations and move them from the_world.h to the_world_defs.h. You may now find that a lot of your source files can now include only the_world_defs.h and avoid including all that overhead.
Visual Studio also has a "Show Includes" option that can give you a sense of which source files include many headers and which header files are most frequently included.
For very common includes, consider putting them in a pre-compiled header.
I use Unity Builds (Screencast located here).
The compile speed question is interesting enough that Stroustrup has it in his FAQ.
We use Xoreax's Incredibuild to run compilation in parallel across multiple machines.
Also an interesting article from Ned Batchelder: http://nedbatchelder.com/blog/200401/speeding_c_links.html (about C++ on Windows).
Our development machines are all quad-core and we use Visual Studio 2008 supports parallel compiling. I am uncertain as to whether all editions of VS can do this.
We have a solution file with approximately 168 individual projects, and compile this way takes about 25 minutes on our quad-core machines, compared to about 90 minutes on the single core laptops we give to summer students. Not exactly comparable machines but you get the idea :)
With Visual C++, there is a method, some refer to as Unity, that improves link time significantly by reducing the number of object modules.
This involves concatenating the C++ code, usually in groups by library. This of course makes editing the code much more difficult, and you will run into namespace collisions unless you use them well. It keeps you from using "using namespace foo";
Several teams at our company have elaborate systems to take the normal C++ files and concatenate them at compile time as a build step. The reduction in link times can be enormous.
Another useful technique is blobbing. I think it is something similar to what was described by Matt Shaw.
Simply put, you just create one cpp file in which you include other cpp files. You may have two different project configurations, one ordinary and one blob. Of course, blobbing puts some constrains on your code, e.g. class names in unnamed namespaces may clash.
One technique to avoid recompiling the whole code in a blob (as David Rodríguez mentioned) when you change one cpp file - is to have your "working" blob which is created from files modified recently and other ordinary blobs.
We use blobbing at work most of the time, and it reduces project build time, especially link time.
Compile Time:
If you have IncrediBuild, compile time won't be a problem. If you don't have a IncrediBuild, try the "unity build" method. It combine multiple cpp files to a single cpp file so the whole compile time is reduced.
Link Time:
The "unity build" method also contribute to reduce the link time but not much. How ever, you can check if the "Whole global optimization" and "LTCG" are enabled, while these flags make the program fast, they DO make the link SLOW.
Try turning off the "Whole Global Optimization" and set LTCG to "Default" the link time might be reduced by 5/6. (LTCG stands for Link Time Code Generation)