How to reduce release build time for unmanaged code in Visual Studio? - c++

I have a console application written in C/C++. It usually takes 5-10 minutes to compile on non-Windows platforms, even with the optimization flag set to -O3. On Windows, however, it takes approximately 1-2 hours to compile when optimization is set to Full Optimization (/Ox) and Inline Function Expansion is set to Any Suitable (/Ob2) in Visual Studio. This happens in both release and debug mode.
I understand the compiler is trying to optimize the code and is therefore bound to take more time, but isn't this excessive compared to the time taken by other compilers (mainly g++) on non-Windows platforms?
So far I have tried:
Removed unnecessary headers from source and header files and introduced forward declarations wherever possible, but with no respite.
I analyzed all the header files. Templates are used in barely 2-3 of the ~50 header files in the project, and those headers are not widely included in the source files.
I have two observations from this behaviour:
There is nothing terribly wrong in the source code, otherwise the compilers on non-Windows platforms would not finish so quickly.
The VS compiler seems to genuinely take more time (1-2 hours) for what other compilers manage in about 10 minutes, but the VS compiler can't be that bad. Therefore, I must be missing some configuration change (apart from optimization).
Does anyone have an idea how to find out what is going wrong here? Maybe a starting point would be to identify the compilation time taken by each file. How do I find the compilation time of each file?
Is there anything I can still improve or try?
Here are additional details about the hardware, source code etc. as requested in some of the comments:
RAM - 8.0 GB
OS - Windows 7 64 bit
Processor - Intel Core i5 2.6 GHz
Visual Studio - 2013 Ultimate
Note - If I disable optimization (set the /Od and /Ob0 flags in VS), the program compiles in less than 5 minutes on the same machine.
Source files - approximately 55 header files and 55 source files, about 80 KLOC in total.

Does anyone have an idea where to start to find out what is going wrong here?
Not without more details.
Is there anything I can still improve or try?
Yes. In particular, consider:
removing any templated code from header files (whether included into or defined in the .h file) and accessing that code through pimpl, because templates are re-evaluated on each pass (see the pimpl sketch after this list).
optimizing your usage of precompiled headers
splitting your console application into separate modules (so your build system will only update the dirty binaries on build)
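To illustrate the pimpl point above, here is a minimal sketch with hypothetical names (not code from the question): the header exposes only a forward-declared implementation struct, so the template-heavy includes live in exactly one .cpp file.

// widget.h - hypothetical example; the header exposes only an opaque pointer,
// so heavy or templated implementation headers are not pulled into every
// translation unit that includes this file.
#include <memory>

class Widget {
public:
    Widget();
    ~Widget();                      // must be defined in the .cpp, where WidgetImpl is complete
    void process();
private:
    struct WidgetImpl;              // forward declaration only
    std::unique_ptr<WidgetImpl> pImpl;
};

// widget.cpp - the only file that pays for the expensive includes
#include "widget.h"
#include <vector>                   // template-heavy headers stay here

struct Widget::WidgetImpl {
    std::vector<int> data;
};

Widget::Widget() : pImpl(new WidgetImpl) {}
Widget::~Widget() = default;
void Widget::process() { pImpl->data.push_back(42); }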

Based on suggestions received in the comments, I started by finding out the compilation time taken by each file:
Cleaned the project to ensure all *.obj files were removed
Built the project again
Noted the time stamp of each file and found one file that was taking almost two hours to compile.
When I opened the source file, I saw, contrary to my earlier observation, something terribly wrong in the code. It is a huge monster file of 27 KLOC (oops, of course I didn't write this file!).
There are 739 instances of a class created dynamically and assigned to an array, and each instance in turn dynamically creates some of its members as well. In short, thousands of objects are being created in this file.
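A hypothetical sketch of the kind of pattern involved (not the actual source), to show why the optimizer struggles: one enormous, effectively hand-unrolled initialization routine with thousands of allocations.

// Hypothetical illustration only - not the real code from the project.
struct Field  { const char* name; };
struct Record { Field* fields; int count; };

Record* table[739];

void buildTable() {
    table[0] = new Record();
    table[0]->fields = new Field[12];
    table[0]->count  = 12;
    // ...hundreds of nearly identical blocks, one per instance...
    table[738] = new Record();
    table[738]->fields = new Field[9];
    table[738]->count  = 9;
}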
To confirm that this file was the culprit and that Visual Studio was spending far too long optimizing it, I disabled optimization in this file, as proposed by @Predelnik in the comments. Voilà! The program now compiles within a couple of minutes. This source code needs serious refactoring.
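For reference, the per-file workaround can also be expressed in the source itself with MSVC's #pragma optimize, which leaves the rest of the project fully optimized (a sketch, with a placeholder function name):

// MSVC-specific: wrap only the expensive, generated code.
#pragma optimize("", off)       // disable all optimizations from this point

void buildEverything()          // placeholder for the huge generated initialization code
{
    // ...
}

#pragma optimize("", on)        // restore the optimizations chosen on the command line (/Ox etc.)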
If someone is facing a similar problem, I would proceed as follows:
Enable the Build-And-Run option and the /MP flag, as discussed Here. If there is a problem in the code itself, parallel project and file compilation will not help.
Find out whether any single source file is the culprit, as above. I believe the link I found Here is a way to calculate the build time, not the compile time, of each file.

Related

smart, selective C++ build process (compiling/linking) in Visual Studio

I am working with a large C++ project (100K lines), and every time I change a line I need to rebuild it. Due to its size, this takes quite a while and that's hurting my productivity. So, my question is: is there any way to speed up the build time in Visual Studio by selectively recompiling only those files which actually changed, thus reusing all the other unchanged compiled files (object files), and then, obviously, linking?
I know I should be able to do this directly from the command line, as Visual Studio is actually doing it "behind the scenes". Though the "Property Pages" display the command lines for compile and link, I think it would get quite troublesome, as these commands are so long that I would probably make editing mistakes. Thanks in advance for sharing your experience on how to deal with this kind of situation.

Whole program optimization failing in VC2008

I have a reasonably large C++ program (~11 MB exe) compiled under VS2008 and was interested to see if whole program optimization would significantly affect its performance. However, turning on whole program optimization and link-time code generation causes the link to fail as follows:
1>c:\cpp\Win32\Atlas\tin\TINDoc.Cpp : fatal error C1083: Cannot open compiler intermediate file: '.\releaseopt\TINDoc.obj': Not enough space
1>LINK : fatal error LNK1257: code generation failed
Looking at Task Manager, I can see the linker using more and more memory until it runs out and bombs out. The compiler is running on XP 32-bit with 2 GB of RAM and a 2 GB page file. Is WPO limited to smaller applications and/or bigger environments, or is there any way to get the linker to be a bit more frugal in its memory usage?
N.B. I have already turned off precompiled headers, which were causing the compilation to fail before linking, and turned off output of debug info and anything else that might take extra resources. The help for C1083 suggests missing header files or inadequate file handles rather than lack of space.
Edit: I got it working under VS2010, albeit without precompiled headers, but the performance gains aren't that significant. I'll leave this option alone until I move to a beefier 64-bit platform with a more robust version of VS2010.
VC2008 is a fragile beast. The optimiser just doesn't work in some cases, and it looks like you might have one such case.
Examples of "not working" include:
De-optimised code (slow!)
Compiler or (more frequently) linker crashes
Obscure compile/link error messages
Incorrect code execution (this one is rare, but not unknown).
Actually, if you look at some of the tricks used by modern optimisers it is pretty amazing that it works as often as it does. The complexity is quite astounding.
Addressing Shane's problem specifically, some things that might cause your error:
Running out of file handles used to be a big problem in DOS and early Windows versions. You hardly ever see it in modern Windows versions. I'm not even certain there is still a limit on file handles.
A compiler bug could be doing an infinite loop, meaning that no amount of resources will be enough.
A recursive include file could also cause something similar. Do all your include files have "#pragma once" or "#if !defined(FOO_INCLUDED)"?
Is it possible you've included the file TINDoc.obj twice in the project? The 2008 compiler is aggressively multi-threaded and there might be contention between two threads trying to access the file. Actually this could happen via a compiler bug, even if you haven't included the file twice.
If the program really is just too big, consider breaking it up into one or more DLLs and building it piecemeal, or splitting the contentious source file into multiple files to compile in multiple phases.
Don't assume that, because it says "not enough space" that has to be the reason. Some programs (including some compilers) will guess a reason for an error, instead of checking.
You can monitor the use of memory, handles, etc. using Task Manager or perfmon, or (my preference) Process Explorer.
I already posted the first part of this as a comment (above), but I am making it an answer, as suggested by Ben (http://stackoverflow.com/users/587803/ben), and have expanded it somewhat.
You can try booting XP with the /3GB switch to give the linker a 3 GB address space. In Vista/7, use BCDEDIT /Set IncreaseUserVa 3072 on the command line to get the same effect.
This has fixed problems creating .ilk-files for me.

How to optimize building speed in visual studio 2008

Could someone give me tips to increase build speed in Visual Studio 2008?
I have a large project with many modules with full source. Every time it's built, every file is rebuilt, even though some of them have not changed. Can I prevent these files from being rebuilt?
I turned on the "Enable Minimal Rebuild" property (/Gm), but the compiler threw this warning:
Command line warning D9030 : '/Gm' is incompatible with multiprocessing; ignoring /MP switch
Any tips to increase build speed would help me a lot.
Thanks,
C++ compile times are a battle for every project above a certain size. Luckily, with so many people writing large C++ projects, there are a variety of solutions:
The classic cheap solution is the "unity build". This just means using #include to pull all of your .cpp files into a single file for compilation. "Unity build" has come up in a number of questions here on Stack Overflow; here is the most prominent one that I'm aware of. This screencast demonstrates how to set up such a build in Visual Studio.
My understanding is that unity builds are much faster than classic builds because they effectively cache the work done by the preprocessor and linker. One drawback of the unity build is that if you touch one .cpp file you'll have to recompile your "big" .cpp file. You can work around this by breaking the .cpp file you're iterating on out of the unity build and compiling it on its own, as shown below.
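For concreteness, the simplest form of a unity build is just one extra .cpp (file names here are hypothetical) that is compiled instead of the individual sources:

// unity.cpp - the only file handed to the compiler; the .cpp files below are
// excluded from the build so they are not compiled twice.
#include "a.cpp"
#include "b.cpp"
#include "c.cpp"
#include "d.cpp"
// ...and so on. If you are iterating on one file, pull it out of this list and
// compile it on its own so small edits don't trigger the "big" recompile.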
Beyond Unity builds, here's a list of my best practices:
Use #include only when necessary; prefer forward declarations.
Use the pimpl idiom to keep class implementations out of commonly included header files. Doing so allows you to add members to an implementation without suffering a long recompile.
Make use of precompiled headers (pch) for commonly included header files that change rarely
Make sure that your build system is using all of the cores available on the local hardware
Keep the list of directories the preprocessor has to search minimal, use precise paths in #include statements
Use #pragma once at the top of header files instead of #ifndef __FOO_H #define __FOO_H ... #endif. If you use the #ifndef trick, the compiler has to open the header file each time it is included; #pragma once allows the compiler to be more efficient.
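For comparison, the two guard styles mentioned in the last bullet (foo.h and FOO_H are placeholder names):

// foo.h, classic include guard - the compiler still has to open and scan the
// file on every #include just to find the matching #endif.
#ifndef FOO_H
#define FOO_H
class Foo { /* ... */ };
#endif

// foo.h, #pragma once - supported by MSVC (and gcc/clang); the compiler can
// skip reopening a header it has already seen in this translation unit.
#pragma once
class Foo { /* ... */ };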
If you're doing all that (the unity build will make the biggest difference, in my experience), the last option is distributed building. distcc is the best free solution I'm aware of; IncrediBuild is the proprietary industry standard. I'm of the opinion that distributed computing is going to be the only way to get great iteration times out of the messy C++ compilation process. If you have access to a reasonably large number of machines (say, 10-20) this is totally worth looking into. I should mention that unity builds and distributed builds are not totally symbiotic, because a traditional compile can be split into smaller chunks of work than a unity build. If you want to go distributed, it's probably not worth setting up a unity build.
Enable Minimal Rebuild (/Gm) is incompatible with Build with Multiple Processes (/MP{<cores>}).
Thus you can only use one of these two at a time.
/Gm > Project's Properties : Configuration Properties > C/C++ > Code Generation
or
/MP{n} > Project's Properties : Configuration Properties > C/C++ > Command Line
Also, to prevent unnecessary builds (of untouched files), structure your code properly; follow this rule:
In the [.h] header files, place only what's needed by the contents of the header file itself;
and all that you need to share across multiple execution units.
The rest goes in the implementation [.c / .cpp] files.
One simple way is to compile in debug mode (i.e. zero optimizations); this of course is only for internal testing.
You can also use precompiled headers* to speed up processing, or break off 'unchanging' segments into static libs, removing those from the recompile.
* With /MP you need to create the precompiled header before doing multi-process compilation, as /MP can read but not write it, according to MSDN.
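As a rough sketch of the precompiled-header setup (stdafx.h/stdafx.cpp are just the conventional MSVC names, and the project header name is hypothetical): one header collects the big, rarely-changing includes, it is compiled once with /Yc, and every other .cpp includes it first and is compiled with /Yu.

// stdafx.h - only large, rarely-changing headers belong here
#pragma once
#include <windows.h>
#include <vector>
#include <string>
#include <map>

// stdafx.cpp - compiled with /Yc"stdafx.h" to produce the .pch file
#include "stdafx.h"

// some_module.cpp - compiled with /Yu"stdafx.h"; the pch include must come first
#include "stdafx.h"
#include "some_module.h"   // hypothetical project header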
Can you provide more information on the structure of your project and which files are being rebuilt?
Unchanged C++ files may be rebuilt because they include header files that have been changed; in that case the /Gm option will not help.
Rebuilding all files after changing one header file is a common problem. The solution is to closely examine which #includes your header files use and remove all that can be removed. If your code only uses pointers and references to a class, you can replace
#include "foo.h"
with
class Foo;
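A slightly fuller sketch of that replacement (Foo and Bar are placeholder names): the header needs only the forward declaration because it refers to Foo exclusively through pointers and references; only the .cpp that actually calls into Foo includes foo.h.

// bar.h - no #include "foo.h" required
class Foo;                        // forward declaration is enough for pointers/references

class Bar {
public:
    void attach(Foo* foo);        // pointer parameter: Foo's layout is not needed here
    void inspect(const Foo& foo); // reference parameter: same
private:
    Foo* observed;
};

// bar.cpp - the only file that pays for foo.h
#include "bar.h"
#include "foo.h"                  // full definition needed to call members of Foo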

VS 2008 C++ build output?

Why, when I watch the build output from a VC++ project in VS, do I see:
1>Compiling...
1>a.cpp
1>b.cpp
1>c.cpp
1>d.cpp
1>e.cpp
[etc...]
1>Generating code...
1>x.cpp
1>y.cpp
[etc...]
The output looks as though several compilation units are being handled before any code is generated. Is this really going on? I'm trying to improve build times, and by using pre-compiled headers, I've gotten great speedups for each ".cpp" file, but there is a relatively long pause during the "Generating Code..." message. I do not have "Whole Program Optimization" nor "Link Time Code Generation" turned on. If this is the case, then why? Why doesn't VC++ compile each ".cpp" individually (which would include the code generation phase)? If this isn't just an illusion of the output, is there cross-compilation-unit optimization potentially going on here? There don't appear to be any compiler options to control that behavior (I know about WPO and LTCG, as mentioned above).
EDIT:
The build log just shows the ".obj" files in the output directory, one per line. There is no indication of "Compiling..." vs. "Generating code..." steps.
EDIT:
I have confirmed that this behavior has nothing to do with the "maximum number of parallel project builds" setting in Tools -> Options -> Projects and Solutions -> Build and Run. Nor is it related to the MSBuild project build output verbosity setting. Indeed if I cancel the build before the "Generating code..." step, none of the ".obj" files will exist for the most recent set of "compiled" files. This implies that the compiler truly is handling multiple translation units together. Why is this?
Compiler architecture
The compiler does not generate code from the source directly; it first compiles it into an intermediate form (see compiler front-end) and then generates code from the intermediate form, including any optimizations (see compiler back-end).
Visual Studio compiler process spawning
In a Visual Studio build, a compiler process (cl.exe) is executed to compile multiple source files sharing the same command-line options in one command. The compiler first performs "compilation" sequentially for each file (this is most likely the front-end), but "Generating code" (probably the back-end) is done together for all the files once compilation is finished for them.
You can confirm this by watching cl.exe with Process Explorer.
Why code generation for multiple files at once
My guess is that code generation is done for multiple files at once to make the build process faster, as it includes some work that can be done only once for multiple sources, such as instantiating templates: there is no point instantiating them multiple times, as all instances but one would be discarded anyway.
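A small illustration of the duplicate-instantiation point (hypothetical files): both translation units instantiate std::vector<int>::push_back, and all copies but one are discarded at link time, so generating code for the batch together can avoid some of that repeated work.

// a.cpp
#include <vector>
void fillA(std::vector<int>& v) { v.push_back(1); }  // instantiates vector<int>::push_back

// b.cpp
#include <vector>
void fillB(std::vector<int>& v) { v.push_back(2); }  // instantiates the very same code again;
                                                     // the duplicate is thrown away later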
Whole program optimization
In theory it would be possible to perform some cross-compilation-unit optimization at this point as well, but it is not done: no such optimizations are ever done unless enabled with /LTCG, and with LTCG the whole code generation is done for the whole program at once (hence the name Whole Program Optimization).
Note: it seems as if WPO is done by the linker, as it produces the exe from the obj files, but this is a kind of illusion: the obj files are not real object files, they contain the intermediate representation, and the "linker" is not a real linker, as it is not only linking the existing code, it is generating and optimizing code as well.
It is neither parallelization nor code optimization.
The long "Generating Code..." phase for multiple source files goes back to VC6. It occurs independent of optimizations settings or available CPUs, even in debug builds with optimizations disabled.
I haven't analyzed in detail, but my observations are: They occur when switching between units with different compile options, or when certain amounts of code has passed the "file-by-file" part. It's also the stage where most compiler crashes occured in VC6 .
Speculation: I've always assumed that it's the "hard part" that is improved by processing multiple items at once, maybe just the code and data loaded in cache. Another possibility is that the single step phase eats memory like crazy and "Generating code" releases that.
To improve build performance:
Buy the best machine you can afford
It is the fastest, cheapest improvement you can make. (unless you already have one).
Move to Windows 7 x64, buy loads of RAM, and an i7 860 or similar. (Moving from a Core 2 dual core gave me a factor of 6-8, building on all CPUs.)
(Don't go cheap on the disks, either.)
Split into separate projects for parallel builds
This is where 8 CPUs (even if 4 physical + HT) with loads of RAM come into play. You can enable per-project parallelization with the /MP option, but this is incompatible with many other features.
At one time, compilation meant parsing the source and generating code. Now, though, compilation means parsing the source and building up a symbolic database representing the code. The database can then be transformed to resolve references between symbols. Later on, the database is used as the source to generate code.
You haven't got optimizations switched on. That will stop the build process from optimizing the generated code (or at least hint that optimizations shouldn't be done... I wouldn't like to guarantee no optimizations are performed). However, the build process itself is still optimized. So multiple .cpp files are being batched together to do this.
I'm not sure how the decision is made as to how many .cpp files get batched together. Maybe the compiler starts processing files until it decides the memory size of the database is large enough that, if it grows any more, the system will have to start paging data to and from disk excessively and the performance gains of batching any more .cpp files would be negated.
Anyway, I don't work for the VC compiler team, so can't answer conclusively, but I always assumed it was doing it for this reason.
There's a new write-up on the Visual C++ Blog that details some undocumented switches that can be used to time/profile various stages of the build process (I'm not sure how much, if any, of the write-up applies to versions of MSVC prior to VS2010). Interesting stuff which should provide at least a little insight into what's going on behind the scenes:
http://blogs.msdn.com/vcblog/archive/2010/04/01/vc-tip-get-detailed-build-throughput-diagnostics-using-msbuild-compiler-and-linker.aspx
If nothing else, it lets you know what processes, dlls, and at least some of the phases of translation/processing correspond to which messages you see in normal build output.
It parallelizes the build (or at least the compile) if you have a multicore CPU.
Edit: I am pretty sure it parallelizes in the same way as "make -j"; it compiles multiple .cpp files at the same time (since .cpp files are generally independent), but obviously links them once.
On my Core 2 machine it is showing 2 devenv jobs while compiling a single project.

What are the pros + cons of Link-Time Code Generation? (VS 2005)

I've heard that enabling Link-Time Code Generation (the /LTCG switch) can be a major optimization for large projects with lots of libraries to link together. My team is using it in the Release configuration of our solution, but the long compile-time is a real drag. One change to one file that no other file depends on triggers another 45 seconds of "Generating code...". Release is certainly much faster than Debug, but we might achieve the same speed-up by disabling LTCG and just leaving /O2 on.
Is it worth it to leave /LTCG enabled?
It is hard to say, because that depends mostly on your project - and of course the quality of the LTCG provided by VS2005 (which I don't have enough experience with to judge). In the end, you'll have to measure.
However, I wonder why you have so much trouble with the extra duration of the release build. You should only hand out reproducible, stable, versioned binaries that have reproducible or archived sources. I've rarely seen a reason for frequent, incremental release builds.
The recommended setup for a team is this:
Developers typically create only incremental debug builds on their machines. Building a release should be a complete build from source control to redistributable (binaries or even setup), with a new version number and labeling/archiving the sources. Only these should be given to in-house testers / clients.
Ideally, you would move the complete build to a separate machine, or maybe a virtual machine on a good PC. This gives you a stable environment for your builds (includes, 3rd party libraries, environment variables, etc.).
Ideally, these builds should be automated ("one click from source control to setup"), and should run daily.
It allows the linker to do the actual compilation of the code, and therefore it can do more optimization, such as inlining.
If you don't use LTCG, the compiler is the only component in the build process that can inline a function, i.e. replace a "call" to a function with the contents of the function, which is usually a lot faster. The compiler will only do so for functions where this yields an improvement.
It can therefore only inline functions whose body it has. This means that if a function in a .cpp file calls another function which is not implemented in the same .cpp file (or in a header file that is included), the compiler doesn't have the actual body of the function and therefore cannot inline it.
But if you use LTCG, it's the linker that does the inlining, and it has all the functions in all of the .cpp files of the entire project, minus referenced .lib files that were not built with LTCG. This gives the linker (which effectively becomes the compiler) a lot more to work with.
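A minimal sketch of the cross-module case being described (file and function names hypothetical): while compiling main.cpp the compiler only sees the declaration of scale(), so it cannot inline the call; with /GL on the sources and /LTCG on the link, the code generator that runs at link time has both bodies available and can.

// math_utils.h
int scale(int x);

// math_utils.cpp - compiled separately
int scale(int x) { return x * 3; }

// main.cpp - without LTCG the call below stays a real call; with /GL + /LTCG
// the linker-stage code generator can inline scale()'s body into main().
#include "math_utils.h"
int main() { return scale(14); }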
But it also makes your build take longer, especially when doing incremental changes. You might want to turn on LTCG in your release build configuration.
Note that LTCG is not the same as profile-guided optimization.
I know the guys at Bungie used it for Halo 3; the only con they mentioned was that it sometimes messed up their deterministic replay data.
Have you profiled your code and determined the need for this? We actually run our servers almost entirely in debug mode, but special-case a few files that profiled as performance critical. That's worked great, and has kept things debuggable when there are problems.
Not sure what kind of app you're making, but breaking up data structures to correspond to the way they were processed in code (for better cache coherency) was a much bigger win for us.
I've found the downsides are longer compile times and that the .obj files produced in that mode (with LTCG turned on) can be really massive. For example, .obj files that might otherwise be 200-500 KB were about 2-3 MB. It just so happened that compiling a bunch of projects in my chain led to a 2 GB folder, the bulk of which was .obj files.
I also don't see problems with the extra compilation time when using link-time code generation with the release build. I only build my release version once per day (overnight), and use the unit-test and debug builds during the day.