Boost rocks: it is great and extremely powerful, but I hate it every time I build a solution in my Visual Studio 7.1.
It seems Boost has a (negative) impact on build time. I cannot remove all Boost usage from my project to compare build times, but I tried it on small projects and the difference is significant.
I guess the problem is that Boost consists of thousands of header files which include one another very extensively. So when I include, say, boost/function.hpp in my header file, it may lead to including hundreds of Boost headers.
Has anyone experienced the same? Any ideas how to solve it?
Rough thoughts:
Add Boost to the precompiled headers? At least they would be parsed and kept in one file
Do explicit instantiation for some Boost templates?
Prepare Boost headers somehow?
Do not include Boost in header files (sounds unrealistic)
...
PS. Yep, Boost also uses hardcore templating that is pretty hard on the compiler, I guess, so thousands of header files are not the only problem.
I also like Boost a lot.
Use precompiled headers as you suggested (that brings the most)
When using linked libraries, check whether you really need them (linking is also quite slow)
Another hint that may sound stupid, but it was the main source of performance loss on my machine:
check whether your antivirus does on-access scanning and disable it for the header and source directories (Boost and your projects)
Naturally, including Boost leads to longer compilation times, just like including any library does. Being (mostly) a template library of course carries quite a big penalty, as all of the logic is implemented in the headers.
I've had good results including (a subset of) Boost in precompiled headers. However, I believe that the gain is greatest with MSVC 9. On MSVC 7 I have seen several reports saying that precompiling template-heavy headers frequently leads to a performance penalty. Another crucial aspect determining whether you'll see a gain is including the appropriate headers in the precompiled header. Only include headers you use frequently, and make sure they never change (that is, think three times before including your own headers here).
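For what it's worth, a minimal sketch of what such a precompiled header might look like (the specific Boost headers chosen here are only examples; pick whichever ones your project actually uses everywhere):

// stdafx.h -- precompiled header; only stable, frequently used headers belong here
#pragma once

// Boost headers used by practically every translation unit (illustrative choice)
#include <boost/shared_ptr.hpp>
#include <boost/function.hpp>

// Standard headers that rarely change
#include <string>
#include <vector>

In a typical MSVC setup, stdafx.cpp is compiled with /Yc"stdafx.h", every other .cpp is compiled with /Yu"stdafx.h", and each .cpp includes "stdafx.h" as its very first line.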
I do not know whether explicit instantiation has any effect, though I doubt it. If anyone has seen any results on this (regardless of compiler), it would be very interesting.
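For reference, explicit instantiation would look roughly like the sketch below. The boost::function<void ()> specialization is an arbitrary example, the extern template half of the idiom is standard only since C++11 (so not available on MSVC 7.1), and whether this actually saves time with Boost's templates is exactly the open question.

// instantiations.cpp -- instantiate the template once, in this translation unit
#include <boost/function.hpp>

template class boost::function<void ()>;   // explicit instantiation definition

// In other files (C++11 and later only), implicit instantiation could be
// suppressed with:
// extern template class boost::function<void ()>;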
"Preparing" boost headers sounds like altering them which sounds like a very bad idea to me. You don't want to end up maintaining customized headers...
May not be as unreal as you might think. Always use as many forward declarations as possible to reduce the "footprint" of each header file. Consider using the Pimpl pattern to avoid including Boost headers that are not reflected in the public interface of your class (offtopic: I consider Pimpl to often be unnecessary. Instead I try to slice the classes into smaller pieces, achieving the same result in a "cleaner" fashion). Don't be afraid to include general, common classes (e.g. shared_ptr) as long as you're consistent in your usage of these classes (if you're using them in all your classes, you won't see much gain in hiding them away from one header).
Upgrading MSVC (to support parallel builds) will help. However, this is always an issue in C++. To minimize the problem, you need to be very strict and follow guidelines to reduce the footprint of your headers. Now and then you should look through the include clauses and make sure nothing unnecessary is included. If the list of includes in a header is getting long, you're probably doing something wrong: most includes should only be in the cpp.
Including Boost header files only when they are really necessary makes sense. Headers including other headers put stress on I/O and have a great impact on compile time. Forward declarations help to some extent, but with Boost they can be a real pain.
Using external guards in header files avoids unnecessary loading. Like this:
#ifndef BOOST_SHARED_PTR_HPP_INCLUDED
# include <boost/shared_ptr.hpp>
#endif
Another way to avoid the header cascade is to use the "pimpl" idiom, especially when dealing with complex classes. Then the complex Boost stuff can be included and used only by that compilation unit. The downside is that the interface has to be designed so that no Boost-specific stuff is required. However, breaking dependencies might be a good thing too.
As you mentioned in your post, the Boost code contains a lot of template code that requires a lot of CPU cycles for compilation. The overhead from multiple header files is very small compared to that.
The first thing you need to do is find out which header file or which line of code is responsible for the delay in compilation. Often it is not the inclusion of the header file, but the usage of one of its classes/functions in your own code, that is causing the delay. You can isolate the responsible code by commenting out pieces of your code until compilation is fast again, and then uncommenting pieces until compilation is slow again. Then you can decide whether you want to replace the slow code with something else. It's up to you to weigh the pros and cons of compilation speed vs. nifty Boost code.
There are a few other things you can do as well:
clean up unneeded include statements, especially in your headers
in your header files, replace includes with forward declarations (where possible)
get a faster computer :D
Related
How do I find which parts of code are taking a long time to compile?
I am already using precompiled headers for all of my headers, and they definitely improve the compilation speed. Nevertheless, whenever I make a change to my C++ source file, compiling it takes a long time (this is CPU/memory-bound, not I/O-bound -- it's all cached). Furthermore, this is not related to the linking portion, just the compilation portion.
I've tried turning on /showIncludes, but of course, since I'm using precompiled headers, nothing is getting included after stdafx.h. So I know it's only the source code that takes a while to compile, but I don't know what part of it.
I've also tried doing a minimal build, but it doesn't help. Neither does /MP, because it's a single source file anyway.
I could try dissecting the source code and figuring out which part is a bottleneck by adding/removing it, but that's a pain and doesn't scale. Furthermore, it's hard to remove something and still let the code compile -- error messages, if any, come back almost immediately.
Is there a better way to figure out what's slowing down the compilation?
Or, if there isn't a way: are there any language constructs (e.g. templates?) that take a lot longer to compile?
What I have in my C++ source code:
Three (relatively large) ATL dialog classes (including the definitions/logic).
They could very well be the cause, but they are the core part of the program anyway, so obviously they need to be recompiled whenever I change them.
Random one-line (or similarly small) utility functions, e.g. a byte-array-to-hex converter
References to (inline) classes found inside my header files. (One of the header files is gigantic, but it uses templates only minimally, and of course it's precompiled. The other one is the TR1 regex -- it's huge, but it's barely used.)
Note:
I'm looking for techniques that I can apply more generally in figuring out the cause of these issues, not specific recommendations for my very particular situation. Hopefully that would be more useful to other people as well.
Two general ways to improve compilation time:
instead of including headers in headers, use forward declarations (include headers only in the source files); see the sketch below
minimize templated code (if you can avoid using templates)
These two rules alone will greatly improve your build time.
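As a hedged illustration of the first rule (all class and file names here are invented), a pointer or reference member only needs a forward declaration in the header; the full include moves to the source file:

// widget.h -- no #include "renderer.h" needed here
class Renderer;               // forward declaration is enough for pointers/references

class Widget {
public:
    void draw();
private:
    Renderer* renderer_;      // pointer member: the declaration suffices
};

// widget.cpp -- the full definition is needed only where Renderer is actually used
#include "widget.h"
#include "renderer.h"

void Widget::draw() {
    // renderer_->render(*this);   // hypothetical call, shown for illustration
}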
You can find more tricks in "Large-Scale C++ Software Design" by Lakos.
For Visual Studio (I am not sure whether yours is too old), take a look at this: How should I detect unnecessary #include files in a large C++ project?
Template code generally takes longer to compile.
You could investigate using "compiler firewalls", which reduce the frequency of a .cpp file having to build (they can reduce time to read included files as well because of the forward declarations).
You can also shift time spent doing code generation from the compiler to the linker by using Link-Time Code Generation and/or Whole Program Optimization, though generally you lose time in the long run.
Just a style question...
I'm a lowly indie game dev working by myself, and I have developed what I've been told is a 'bad' habit of writing whole classes in my headers. The benefits I know .h/.cpp file combos have are that they allow code to be split into compilation chunks that won't need to be recompiled so long as they remain unchanged, and that they allow the interface to be separated from the implementation.
However, neither of those things is of any benefit to me, since I tend to favour having my implementation in a spot where I can easily improve it, change it, and read it. And my compile times are nigh instantaneous (2-4 seconds; 15 if I update SFML or Box2D to their latest versions and they need to be recompiled too).
Coding like this has been saving me a very noticeable amount of time, I think, and since there are fewer files, my code feels less 'overwhelming' to me.
But in light of that, and in general, is there any compelling reason to follow the "file.cpp" for every "file.h" setup for a small project where compile time and interface/implementation separation are not priorities?
is there any compelling reason to follow the "file.cpp" for every "file.h" setup for a small project where compile time and interface/implementation separation are not priorities?
Nope; there's nothing wrong with defining classes and functions in header files, especially not in small projects where compile times aren't a concern.
For what it's worth, my current, in-progress hobby project has 33 header files and a single .cpp file (not including unit tests). That is largely due to just about everything being a template, though.
If you have a huge software project or if you need to encapsulate your code into a library and actually need the modularity, then it might make sense to split code out of a header file. If you want to hide some implementation detail (e.g., if you have some ugly header that you don't want to include elsewhere in your project--WinAPI headers, for example), then it makes sense to split code into a separate source file to hide those details. Otherwise, it may just be a lot of work for not a lot of gain.
I think both compile time and interface/implementation separation are good reasons. Even if the former is not a problem right now, in most decent-sized projects, it does become a problem.
But, since you asked for other reasons, I think a big one is that it reduces dependencies. The implementation of your class probably requires more #includes than the interface. But if you put the implementation in the header file, you drag along those dependencies with the header. Then every other file that includes that also has those dependencies.
There is no one-size-fits-all rule, though. And some classes (especially "small" ones) are probably best placed entirely in a header.
What you're doing sounds sensible for your situation. Two other potential issues:
one definition rule
testing
If you've got just one cpp file and a couple dozen headers, then you can afford to be careless about the one definition rule (assuming you have include guards). This could bite you if you one day find a need to move to compiling/linking the project as a number of translation units. That might happen in not-so-obvious ways, such as wanting to let other people supply a library that they're to build using a couple of your headers. Implicitly inline functions (defined inside the class) or explicitly inline ones (using the keyword) will be OK, but beware of others. The usual rules for variables etc. apply.
For testing: sometimes it helps to have at least a token .cpp file that includes the .h and gets compiled, just so you get some early warning if the contents of one header can't "stand alone", probably due to a forgotten #include. If you have per-header test .cpp files, that kills two birds with one stone. If you don't want testing at that level and are happy to clean up any minor dependency bugs reactively, then you might as well forget it.
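Such a token file can be as small as this (the header name is hypothetical):

// test_window.cpp -- compiles window.h in isolation to catch missing #includes
#include "window.h"   // deliberately the first include, so it cannot lean on anything else
// (per-header unit tests could live here as well)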
I'm developing a C++ library. It got me thinking of the ways Java and C# handle including different components of the libraries. For example, Java uses "import" to allow use of classes from other packages, while C# simply uses "using" to import entire modules.
My question is, would it be a good idea to #include everything in the library in one massive include and then just use the using directive to import specific classes and modules? Or would this just be downright crazy?
EDIT:
Good responses so far, here are a few mitigating factors which I feel add to this idea:
1) Internal #includes are kept as normal (short and to the point)
2) The file which includes everything is optionally supplied with the library to those who wish to use it
3) You could optionally make the big include file part of the precompiled header
You're confusing the purpose of #include statements in C++. They do not behave like import statements in Java or using statements in C#. #include does what it says; namely, loads and parses the entire indicated file as part of the current translation unit. The reason for the separate includes is to not have to spend compilation time parsing the entire standard library in every file. In contrast, the statements you're trying to make #include behave like are merely for programmer organization purposes.
#include is for management of the compilation process, not for separating uses. (In fact, you cannot use separate headers to enforce separate uses, because to do so would violate the one definition rule.)
tl;dr -> No, you shouldn't do that. #include as little as possible. When your project becomes large, you'll thank yourself when you're not waiting many hours to compile your project.
I would personally recommend only including the headers when you need them, to explicitly show which functionality your file requires. At the same time, doing so will prevent you from gaining access to functionality you might not necessarily want, e.g. functions unrelated to the goal of the file. Sure, this is no big deal, but I think that it's easier to maintain and change code when you don't have access to unnecessary functions/classes; it just makes it more straightforward.
I might be downvoted for this, but I think you bring up an interesting idea. It would probably slow down compilation a bit, but I think the concept is neat.
As long as you used using sparingly — only for the namespaces you need — other developers would be able to get an idea of what classes were used in a file by glancing at the top. It wouldn't be as granular as seeing a list of #included files, but is seeing a list of included header files really very useful? I don't think so.
Just make sure that all of the header files use inclusion guards, of course. :)
As @Billy ONeal said, the main thing is that #include is a preprocessor directive that causes a copy-paste (^C, ^V) of code, which leads to an increase in compile time.
The policy generally considered best in C++ is to forward declare all possible classes in ".h" files and include the headers only in the ".cpp" file. It isolates dependencies, since a C/C++ project will be rebuilt in cascade if a dependent include file is changed.
Of course, M$ compilers and their precompiled headers tend to push in the opposite direction, toward something close to what you suggest. But anyone who has tried to port code across compilers is well aware of how badly that can go.
Some libraries, like Qt, make extensive use of forward declarations. Take a look at it to see if you like its taste.
I think it will be confusing. When you write C++ you should avoid making it look like Java or C# (or C :-). I for one would really wonder why you did that.
Supplying an include-all file isn't really that helpful either, since a user could easily create one herself with the parts of the library actually used. It could then be added to a precompiled header, if one is used.
I have heard some people complain about including the Windows header file in a C++ application and using it. They mentioned that it is inefficient. Is this just an urban legend, or are there real, hard facts behind it? In other words, if you believe it is efficient or inefficient, please explain how this can be, with facts.
I am no C++ Windows programmer guru. Detailed explanations would really be appreciated.
Edit: I want to know about both compile time and execution. Sorry for not mentioning it.
windows.h is not a "code library". It's a header file, and doesn't contain any executable code as such (save for macro definitions, but those still aren't compiled - their expansions are, if and when you use them).
As such, looking at it strictly from a performance perspective, merely including it has an effect solely on compilation time. That effect is rather significant, though: for example, with the Platform SDK headers that come with VS2010, #include <windows.h> expands to ~2.4 MB of code, and all that code has to be parsed and processed by the compiler.
Then again, if you use precompiled headers (and you probably should in this scenario), it wouldn't affect you.
If you precompile it, then the compilation speed difference is barely noticeable. The downside to precompiling is that you can only have one precompiled header per project, so people tend to make a single "precompiled.h" (or "stdafx.h") and include windows.h, Boost, the STL and everything else they need in there. Of course, that means you end up including windows.h stuff in every .cpp file, not just the ones that need it. That can be a problem in cross-platform applications, but you can get around it by doing all your Win32-specific stuff in a static library (that has windows.h precompiled) and linking to that in your main executable.
At runtime, the stuff in windows.h is about as bare-metal as you can get in Windows. So there's really no "inefficiencies" in that respect.
I would say that most people doing serious Windows GUI stuff would be using a 3rd-party library (Qt, wxWidgets, MFC, etc) which is typically layered on top of the Win32 stuff defined in windows.h (for the most part), so as I said, on Windows, the stuff in windows.h is basically the bare metal.
There are multiple places where efficiency comes into play.
Including <windows.h> will substantially increase compile times and bring in many symbols and macros. Some of these symbols or macros may conflict with your code. So from this perspective, if you don't need <windows.h> it would be inefficient at compile time to bring it in.
The increased compile time can be mitigated somewhat by using precompiled headers, but this also brings with it a little more codebase complexity (you need at least 2 more files for the PCH) and some headaches unique to PCHs. Nonetheless, for large Windows projects, I usually use a PCH. For toy or utility projects, I typically don't, because it's more trouble than it's worth.
Efficiency also comes into play at runtime. As far as I know, if you #include <windows.h> but don't use any of those facilities, it will have no effect on the runtime behavior of your program, at least as far as calling extra code and that kind of thing goes. There may be other runtime effects, however, that I'm not aware of.
As far as the big white elephant question, "Is Windows efficient?", I won't go into that here other than to say this: using Windows is much like anything else in that how efficient or inefficient it is depends mostly on you and how well you know how to use it. You'll get as many different opinions on this as people you ask, ranging from "Winblowz sucks" to "I love Windows, it's awesome." Ignore them all. Learn to code in Windows if you need and want to, and then make up your own mind.
As has been noted, #including windows.h slows down compile time. You can use precompiled headers or do a good job of isolating the windows calls only to modules that need them to help with that.
Also, you can add these preproc definitions before the windows.h include like so:
#define WIN32_LEAN_AND_MEAN
#define VC_EXTRALEAN
#include <windows.h>
It will reduce the number of definitions from windows.h and sub-included header files. You may find later on that you need to remove the lean-and-mean, but try it first and wait until the compiler complains about a missing def.
The namespace conflicts are a legitimate gripe, but technically they have nothing to do with efficiency, unless you count the efficiency of your personal use of time. Considering how many thousands of definitions will be thrown into your namespace, conflicts are bound to occur at some point, and that can be severely irritating. Just use the practice of isolating your Windows calls into modules, and you will be fine. For this, put #include <windows.h> in the .cpp file, not the .h file.
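A small sketch of that isolation, with invented names: the header exposes only plain C++ types, and the .cpp is the one place that pays for windows.h.

// clipboard.h -- no windows.h here, so headers that include this stay lightweight
#include <string>

bool CopyTextToClipboard(const std::string& text);

// clipboard.cpp -- the only file that includes windows.h
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#include "clipboard.h"

bool CopyTextToClipboard(const std::string& text)
{
    // OpenClipboard / EmptyClipboard / SetClipboardData calls would go here;
    // the body is omitted in this sketch.
    return false;
}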
I see no basis for thinking that the runtime performance of the executable will be impacted by including windows.h. You are only adding a large number of definitions to the context used by the compiler. You aren't even putting all the definitions into your compiled code--just allocations, function calls, and referencing based on any definitions used in your source code (.cpp).
Another argument could be made that the Windows API types and functions are inherently wasteful of resources or perform inefficiently. For example, if you want to create a file, there is some monstrous structure to pass to the Windows API. Still, I think most of this is penny-wise/pound-foolish thinking. Evaluate Windows API performance problems case by case and make replacements for inefficient code where possible and valuable.
In general, including windows.h is a necessity: if you need Windows functions, you have to include it. I think what you're referring to is (among other things) nested inclusion of windows.h. That is, you include a .h that itself includes windows.h, and you also include windows.h in your .cpp file. This leads to inefficiencies, of course, so you have to study carefully which .h files are included in each .h file, and avoid including, say, windows.h n times indirectly.
Just including the header without using it will not have any effect on runtime efficiency.
It will affect compilation time, though.
How do YOU reduce compile time, and linking time for VC++ projects (native C++)?
Please specify if each suggestion applies to debug, release, or both.
It may sound obvious to you, but we try to use forward declarations as much as possible, even if it requires writing out the long namespace names the type(s) live in:
// Forward declaration stuff
namespace plotter { namespace logic { class Plotter; } }

// Real stuff
namespace plotter {
    namespace samples {
        class Window {
            logic::Plotter* mPlotter;
            // ...
        };
    }
}
It greatly reduces compile time on other compilers as well. Indeed, it applies to all configurations. :)
Use the Handle/Body pattern (also sometimes known as "pimpl", "adapter", "decorator", "bridge" or "wrapper"). By isolating the implementation of your classes into your .cpp files, they need only be compiled once. Most changes do not require changes to the header file so it means you can make fairly extensive changes while only requiring one file to be recompiled. This also encourages refactoring and writing of comments and unit tests since compile time is decreased. Additionally, you automatically separate the concerns of interface and implementation so the interface of your code is simplified.
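A minimal Handle/Body sketch (the class and member names are made up, a raw pointer is used to keep it portable to older compilers, and copying is deliberately left out):

// engine.h -- clients see only this; no heavy or volatile includes required
class Engine {
public:
    Engine();
    ~Engine();                 // defined in the .cpp, where Impl is complete
    void run();
private:
    class Impl;                // the body, defined only in engine.cpp
    Impl* impl_;
};

// engine.cpp -- the only file that must be recompiled when the implementation changes
#include "engine.h"
#include <vector>              // heavy includes live here, not in the header

class Engine::Impl {
public:
    std::vector<double> samples;
};

Engine::Engine() : impl_(new Impl) {}
Engine::~Engine() { delete impl_; }
void Engine::run() { /* work with impl_->samples here */ }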
If you have large complex headers that must be included by most of the .cpp files in your build process, and which are not changed very often, you can precompile them. In a Visual C++ project with a typical configuration, this is simply a matter of including them in stdafx.h. This feature has its detractors, but libraries that make full use of templates tend to have a lot of stuff in headers, and precompiled headers are the simplest way to speed up builds in that case.
These solutions apply to both debug and release, and are focused on a codebase that is already large and cumbersome.
Forward declarations are a common solution.
Distributed building, such as with Incredibuild is a win.
Pushing code from headers down into source files can work. Small classes, constants, enums and so on might start off in a header file simply because they could have been used in multiple compilation units, but in reality they are only used in one and could be moved to the cpp file.
A solution I haven't read about but have used is to split large headers. If you have a handful of very large headers, take a look at them. They may contain related information, and may also depend on a lot of other headers. Take the elements that have no dependencies on other files (simple structs, constants, enums and forward declarations) and move them from the_world.h to the_world_defs.h. You may now find that a lot of your source files can include only the_world_defs.h and avoid pulling in all that overhead.
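A sketch of that split, with illustrative contents:

// the_world_defs.h -- no dependencies: enums, constants, forward declarations only
enum Terrain { Grass, Water, Rock };
const int kMaxRegions = 64;
class Region;
class World;

// the_world.h -- full definitions; pulls in whatever it needs
#include "the_world_defs.h"
#include <vector>

class World {
public:
    Region* FindRegion(int id);
private:
    std::vector<Region*> mRegions;
};

Source files that only need the enums and forward declarations can then include the_world_defs.h instead of the_world.h.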
Visual Studio also has a "Show Includes" option that can give you a sense of which source files include many headers and which header files are most frequently included.
For very common includes, consider putting them in a pre-compiled header.
I use Unity Builds (Screencast located here).
The compile speed question is interesting enough that Stroustrup has it in his FAQ.
We use Xoreax's Incredibuild to run compilation in parallel across multiple machines.
Also an interesting article from Ned Batchelder: http://nedbatchelder.com/blog/200401/speeding_c_links.html (about C++ on Windows).
Our development machines are all quad-core, and we use Visual Studio 2008, which supports parallel compiling. I am uncertain as to whether all editions of VS can do this.
We have a solution file with approximately 168 individual projects, and compiling this way takes about 25 minutes on our quad-core machines, compared to about 90 minutes on the single-core laptops we give to summer students. Not exactly comparable machines, but you get the idea :)
With Visual C++, there is a method, which some refer to as Unity, that improves link time significantly by reducing the number of object modules.
This involves concatenating the C++ code, usually in groups by library. This of course makes editing the code much more difficult, and you will run into namespace collisions unless you manage them well; it keeps you from using "using namespace foo;".
Several teams at our company have elaborate systems to take the normal C++ files and concatenate them at compile time as a build step. The reduction in link times can be enormous.
Another useful technique is blobbing. I think it is something similar to what was described by Matt Shaw.
Simply put, you just create one cpp file in which you include other cpp files. You may have two different project configurations, one ordinary and one blob. Of course, blobbing puts some constraints on your code; e.g., class names in unnamed namespaces may clash.
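A blob file of that sort can be nothing more than a list of includes (the file names here are invented); the individual .cpp files are excluded from the build in the blob configuration, so every symbol is still defined exactly once:

// graphics_blob.cpp -- compiled instead of the individual files in the blob configuration
#include "renderer.cpp"
#include "texture.cpp"
#include "mesh.cpp"
#include "shader.cpp"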
One technique to avoid recompiling the whole blob when you change one cpp file (as David Rodríguez mentioned) is to have a "working" blob, created from recently modified files, alongside the other, ordinary blobs.
We use blobbing at work most of the time, and it reduces project build time, especially link time.
Compile Time:
If you have IncrediBuild, compile time won't be a problem. If you don't have IncrediBuild, try the "unity build" method. It combines multiple cpp files into a single cpp file, so the overall compile time is reduced.
Link Time:
The "unity build" method also contribute to reduce the link time but not much. How ever, you can check if the "Whole global optimization" and "LTCG" are enabled, while these flags make the program fast, they DO make the link SLOW.
Try turning off the "Whole Global Optimization" and set LTCG to "Default" the link time might be reduced by 5/6. (LTCG stands for Link Time Code Generation)