Compiling Boost.Test tests faster - c++

I am using Xcode (GCC) to compile my Boost test suite and it takes too long.
The tests are minimal dummy tests, yet it takes several seconds (about 20) to compile them:
#include "boost/test/included/unit_test.hpp"
BOOST_AUTO_TEST_CASE(dummy)
{
BOOST_CHECK_EQUAL(2+2, 4);
}
BOOST_AUTO_TEST_CASE(dummyFail)
{
BOOST_CHECK_EQUAL(2+3, 4);
}
The manual suggests using the library version to speed up compilation. However, I am concerned this might not help: Xcode already rebuilds only my tests, and the framework itself isn't compiled again since its object files already exist.
I guess it's the amount of header files and templates in Boost.Test that are responsible for most of the compilation time.
Do you have an idea of how to compile significantly faster? Would using it as a library work? Would including only parts of Boost.Test work?
Any help is greatly appreciated!

The reason it's slow to compile is because boost/test/included/unit_test.hpp is huge. Using a library makes it faster because the huge header is compiled when the library is built and not thereafter. Your tests then include a smaller set of headers, leading to shorter build times.
Because I'm too lazy to build the library, an alternative I've used is to have one source file (which never changes, and so is rarely rebuilt) include the full boost test, and then have all the real test sources include just boost/test/unit_test.hpp. That gives most of the benefits of using the library.
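In concrete terms, the split looks something like this (file names are illustrative):
// test_main.cpp - never changes, so it is almost never rebuilt
#define BOOST_TEST_MODULE MyTests
#include <boost/test/included/unit_test.hpp>

// dummy_tests.cpp - a real test source, includes only the light header
#include <boost/test/unit_test.hpp>

BOOST_AUTO_TEST_CASE(dummy)
{
    BOOST_CHECK_EQUAL(2+2, 4);
}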

Try using precompiled headers; this should reduce compilation time. Details can be found here:
http://www.boost.org/boost-build2/doc/html/bbv2/reference/precompiled_headers.html

I believe all the options are now described in the official documentation (see Usage variants).
The Static library usage variant is very convenient, and greatly reduces compilation times.
As described there, one can create a single source file including just two lines, compile that separately and link that in with the other tests.
A comment regarding the linked docs: I believe there is an error on that page, namely here:
One and only one translation unit should include following lines:
#define BOOST_TEST_MODULE test module name
#include <boost/test/unit_test.hpp>
This leads to "undefined reference" errors in the linking phase.
I believe it should be instead:
#define BOOST_TEST_MODULE test module name
#include <boost/test/included/unit_test.hpp>
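With that fix, the expensive translation unit is compiled once and merely linked afterwards; a build could look roughly like this (file names hypothetical):
g++ -c test_main.cpp                     # slow: compiles the whole framework, but only once
g++ -c my_tests.cpp                      # fast: sees only the lightweight headers
g++ test_main.o my_tests.o -o run_tests  # link everything into one test binary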

Related

How to structure a "library" of C++ source?

I'm developing a collection of C++ classes and am struggling with how to share the code in a way that maintains organization without compromising ease of compilation for a user of the collection.
Options that I have seen include:
Distribute compiled library file
Put the source in the header file (with implicit inline as discussed in this answer)
Use symbolic links to allow the compiler to find the files.
I'm currently using the third option: for each class that I want to include, I symbolically link the class's header and source files (e.g. ln -s <path_to_class folder>/myclass.cpp). This works well except that I can't move the project folder location (it breaks all the symlinks), and I have to have all those symlinked files hanging around.
I like the second option (it has the appearance of Java), but I'm worried about code size bloat if everything is declared inline.
A user of the collection will create a project folder somewhere, and somehow include the collection into their compilation process.
I'd like a few things to be possible:
Easy compilation (something like gcc *.cpp from the project folder)
Easy distribution of library in uncompiled form.
Library organization by module.
Compiled code size is not bloated.
I'm not worried about documentation (Doxygen takes care of that) or compile time: the overall modules are small and even the largest projects on the slowest machines won't take more than a few seconds to compile.
I'm using the GCC compiler, if it makes any difference.
A library is the best option (in my opinion) of the three you raised. Then provide the header file(s) in the include path and the library in the linker path.
Since you also want to distribute the library in source code form, I would be inclined to provide a compressed archive (gzip, 7-zip, tarball, or other preferred format) in a central repository.
If I understand correctly, you do not want users to have to include the .cpp files in their build, but instead want them to either (i) use the headers directly, or (ii) use a compiled form of the lib.
Your requirements are a bit unusual, but they can be achieved. It seems to me like you could organize your code in the following manner. First, have a global define that dictates whether or not you are compiling the library:
// global.h
// ...
#define LIB_SOURCE
// ...
Then in every header file, you check whether that define is set: if the library is distributed as a static/shared lib, the definitions are not included, otherwise, the '.cpp' file is included from the header file.
// A.h
#ifndef _A_H
#define _A_H
#include "global.h"
#ifdef LIB_SOURCE
#include "A.cpp"
#endif
// ...
#endif
where 'A.cpp' would contain the actual implementation.
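For illustration, the matching implementation file might look like this sketch (the class and member names are made up):
// A.cpp - compiled on its own only when building the static/shared
// library; in source-only distribution it is pulled in via A.h instead
#include "A.h"   // safe: the include guard in A.h prevents recursion

void A::doSomething()   // hypothetical member function of A
{
    // actual implementation goes here
}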
Again, this is a very strange way of doing things and I would actually advise against such practice. A better way (but one which requires more work) is to always distribute a shared library. But to keep things independent of the compiler, write a C layer around it. This way, you have a portable, maintainable library.
As for some of the other requirements:
Keep the build process simple by providing a Makefile
If you worry about the code size of the compiled library, look into gcc's optimization options (-Os). If you worry about the code size of the library when distributed in source form in the headers, it is trickier: since the (inlined) code will actually be in the headers, the code will obviously grow with each inclusion in a .cpp file by the user.
I ended up using inline headers for all of the code. You can see the library here:
https://github.com/libpropeller/libpropeller/tree/master/libpropeller
The library is structured as:
library folder
    class A
        classA.h
        classA.test.h
    class B
        classB.h
        classB.test.h
    class C
    ...
With this structure I can distribute the library as source, and all the user has to do is add -I/path/to/library to their makefile and #include "library/classA/classA.h" in their source files.
And, as it turns out, having inline headers actually reduces the code size. I've done a full analysis of this, and it turns out that inline code in the headers allows the compiler to make the final binary roughly 5% smaller.
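From the consumer's side, usage would look roughly like this (ClassA is a stand-in for one of the library classes):
// user's main.cpp, compiled with: g++ -I/path/to/library main.cpp
#include "library/classA/classA.h"

int main()
{
    ClassA a;   // hypothetical class defined inline in classA.h
    return 0;
}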

Can I link multiple BOOST unit tests into a single test binary?

I've recently started trying to put a venerable and large (>1 million lines) program under test. There are currently no unit tests. Also, the program is built by linking all the individual object files together - there are no component libraries. Furthermore, the objects are highly interdependent, and it is difficult (impossible?) to link to any object files without linking to at least half of them.
Yes, I know, my life sucks.
I'd like to do some refactoring (obviously), but I'd like to have some tests in place before I start moving things around. My current idea is to compile a single "test program" which runs all of the tests I create. This would drastically simplify the linking issues that I have and let me focus on the real problems. So I have two questions:
Is it possible to link multiple BOOST unit test files into one test executable?
Is there a better solution?
This is precisely how Boost.Test is meant to be used.
I would keep one short main.cpp file consisting of literally two lines:
#define BOOST_TEST_MODULE "C++ Unit Tests for MyTangledLibrary"
#include <boost/test/included/unit_test.hpp>
And then I would keep adding test module *.cpp files, compiled together into one executable:
#include <boost/test/unit_test.hpp>
<< your include files >>
BOOST_AUTO_TEST_SUITE(FancyShmancyLogic)
BOOST_AUTO_TEST_CASE(TestingIf2x3equals6)
{
...
}
BOOST_AUTO_TEST_CASE(TestingIf2x2equals4)
{
...
}
BOOST_AUTO_TEST_SUITE_END()
Yes, you will be able to compile that main.cpp and all of your modules into one large executable.

precompiled header files usage for library builders

According to this answer, Boost and STL headers belong in the precompiled header file (stdafx.h in the MSVC world). So I changed the headers of my dynamic link library project and moved all STL/Boost headers into the stdafx.h of my project.
Before
#include <boost/smart_ptr.hpp>

namespace XXX
{
    class CLASS_DECL_BK CExampleClass // CLASS_DECL_BK is just a standard dll import/export macro
    {
    private:
        boost::scoped_ptr<Replica> m_replica;
    };
}
After
namespace XXX
{
    class CLASS_DECL_BK CExampleClass
    {
    private:
        boost::scoped_ptr<Replica> m_replica;
    };
}
Now I have the advantage of decreased compile times, but all the users of my library are getting build errors (e.g. unknown boost::scoped_ptr...) because of the missing includes (which are now moved to my stdafx.h).
What could be a solution for this dilemma?
I want reduced compile times, and compile errors after including my header files are not acceptable for any user of the DLL.
Could this help?
Leave all include directives as they are, but duplicate them in my 'stdafx.h'? Since stdafx.h is always included first inside any cpp file of my project, I should be fine and the users won't get any errors. Or do I lose the speed advantage if multiple includes of the same header occur in one translation unit (they have header guards)?
Thanks for any hints!
You should get nearly the same speed increase when you leave the header-includes in place in the library headers and just additionally put them into stdafx.h.
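In other words, something like this (the headers shown are just examples):
// stdafx.h - duplicates whatever the library headers already include
#include <boost/smart_ptr.hpp>
#include <string>

// library_file.h - keeps its own includes, so users who don't
// use your PCH still compile cleanly
#include <boost/smart_ptr.hpp>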
Alternatively, you could add an additional define (a bulk external include guard)
// stdafx.h
#define MY_LIB_STD_HEADERS_ALREADY_INCLUDED
// library_file.h
#ifndef MY_LIB_STD_HEADERS_ALREADY_INCLUDED
#include <boost/smart_ptr.hpp>
...
#endif
But I would only do that if you are sure it helps. Just take a stopwatch and run a few re-compilations. (No need to link.) Then you'll see if there's any difference.
Aside
I'm not sure if adding all Boost headers that are needed somewhere in the project is such a good idea. I'd say shared_ptr and friends, boost/foreach, maybe Boost.Format, ... are a good idea, but I'd already think twice about the Boost.Regex headers. Note: I did not do any speed measurements, but I dimly remember a problem with the size of the pch file and some compiler hiccup. I really should do some tests.
Also check whether the Boost library in question provides forwarding headers and whether you should include them instead. Bloating the precompiled header file can have its downsides.
You could create a build configuration for this purpose (Debug, Release, CheckDependencies). A simple way to achieve this is to use the preprocessor to include/exclude the includes based on the current configuration. Using this, you can test and build using Debug or Release (which contain the larger set of includes), then build all configurations before distribution.
To clarify, the conditional MON_LIBRARY_VALIDATE_DEPENDENCIES is not to be used in the library headers or sources, only in the precompiled header:
// my pch:
#include <file1.hpp>
#include <file2.hpp>
// ...
#if !defined(MON_LIBRARY_VALIDATE_DEPENDENCIES)
#include <boost/stuff.hpp>
// ...
#endif
Then you would append MON_LIBRARY_VALIDATE_DEPENDENCIES to the list of preprocessor definitions in the CheckDependencies configuration.
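On the compiler command line that corresponds to something like the following sketch:
cl /DMON_LIBRARY_VALIDATE_DEPENDENCIES /c library_file.cpp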
Regarding guards: they should not be a problem under normal circumstances - compilers use optimizations to detect these cases, which means they can avoid reopening a file that has been multiply included and guarded correctly. In fact, attempts to outsmart the compiler in this arena can actually slow down the process. I'd say just leave it as is unless your library/dependencies are huge and you really have noticeable problems.
Making your compilation units self-contained (letting them include everything they use) is very desirable. It prevents compilation errors for users who do not use precompiled headers, and, as you assume, header guards will keep the cost of these extra includes minimal.
It also has the desirable side effect that a glance at the headers tells users which other headers are in use in the unit, and gives the option of compiling a single unit without any fuss.

Handling stdafx.h in cross-platform code

I have a Visual Studio C++ based program that uses pre-compiled headers (stdafx.h). Now we are porting the application to Linux using gcc 4.x.
The question is how to handle pre-compiled header in both environments.
I've googled but can not come to a conclusion.
Obviously I want to leave stdafx.h in Visual Studio, since the code base is pretty big and pre-compiled headers speed up its compilation considerably.
But the question is what to do in Linux. This is what I found:
Leave the stdafx.h as is. gcc compiles code considerably faster than VC++ (or maybe it's just that my Linux machine is faster... :) ), so I may be happy with this option.
Use approach from here - make stdafx.h look like (set USE_PRECOMPILED_HEADER for VS only):
#ifdef USE_PRECOMPILED_HEADER
... my stuff
#endif
Use the approach from here - compile VC++ with /FI to implicitly include stdafx.h in each cpp file. Therefore in VS your code can be switched easily to be compiled without pre-compiled headers and no code will have to be changed.
I personally dislike the dependencies and the mess that stdafx.h pushes a big code base towards. Therefore this option is appealing to me - on Linux you don't have stdafx.h at all, while still being able to turn on pre-compiled headers in VS with /FI alone.
On Linux compile stdafx.h only as a precompiled header (mimic Visual Studio)
Your opinion? Are there other approaches to treat the issue?
You're best off using precompiled headers still for fastest compilation.
You can use precompiled headers in gcc as well. See here.
The compiled precompiled header will have the extension .gch instead of .pch.
So, for example, if you precompile stdafx.h, you will have a precompiled header named stdafx.h.gch that is automatically searched for any time you include stdafx.h.
Example:
stdafx.h:
#include <string>
#include <stdio.h>
a.cpp:
#include "stdafx.h"
int main(int argc, char** argv)
{
    std::string s = "Hi";
    return 0;
}
Then compile as:
> g++ -c stdafx.h -o stdafx.h.gch
> g++ a.cpp
> ./a.out
Your compilation will work even if you remove stdafx.h after step 1.
I used option 3 last time I needed to do this same thing. My project was pretty small but this worked wonderfully.
I'd either go for option 4 or option 2. I've experimented with precompiled headers on various VS versions and on GCC on Linux (blog posts about this here and here). In my experience, VS is a lot more sensitive than G++ to the length of the include paths, the number of directories in the include path, and the number of include files. When I measured build times, properly arranged precompiled headers made a massive difference to compile times under VS, whereas G++ was pretty much unimpressed.
Actually, based on the above, the last time I worked on a project where this was necessary to rein in the compile time, I precompiled the equivalent of stdafx.h under Windows, where it made sense, and simply used it as a regular file under Linux.
Very simple solution.
Add a dummy file entry for "stdafx.h" in Linux environment.
I would only use option 1 in a big team of developers.
Options 2, 3, and 4 will often halt the productivity of other members of your team, so you can save a few minutes a day in compile time.
Here's why:
Let's assume that half of your developers use VS and half use gcc.
Every now and then some VS developer will forget to include a header in a .cpp file.
He won't notice, because stdafx.h implicitly includes it. So he pushes his changes to version control, and then a few other members of the gcc team get compiler errors.
So, for every 5 minutes a day you gain by using precompiled headers, 5 other people waste time fixing your missing headers.
If you don't share the same code across all of your compilers, you will run into problems like that every day. If you force your VS developers to check for compilation on gcc before pushing changes, then you will throw away all your productivity gains from using precompiled headers.
Option 4 sounds appealing, but what if you want to use another compiler at some point in time ? Option 4 only works if you only use VS and gcc.
Notice that option 1 may slow gcc compilation down by a few seconds, although that may not be noticeable.
It's simple, really:
Project->Project Settings (Alt + F7)
Project-Settings-Dialog:
C++ -> Category: Precompiled Headers -> Precompiled Headers radio buttons --> disable
Since stdafx.h is by default all the Windows-specific stuff, I've put an empty stdafx.h on my other platform. That way your source code stays identical, while effectively disabling stdafx on Linux without having to remove all the #include "stdafx.h" lines from your code.
If you are using CMake in your project, then there are modules which automate this for you, which is very convenient; for example, see cmake-precompiled-header here. To use it, just include the module and call:
include( cmake-precompiled-header/PrecompiledHeader.cmake )
add_precompiled_header( ${target} ${header} FORCEINCLUDE SOURCE_CXX ${source} )
Another module called Cotire creates the header file to be precompiled (no need to manually write StdAfx.h) and speeds up builds in other ways - see here.
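Basic Cotire usage is roughly the following sketch (assuming cotire.cmake is on your CMake module path; the target name is made up):
include(cotire)
add_executable(my_app main.cpp other.cpp)
cotire(my_app)   # generates and applies a precompiled header for my_app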
I've done both option 2 (#ifdef) and option 4 (PCH for gcc) for cross platform code with no issues.
I find gcc compiles much faster than VS so the precompiled headers are generally not that important, unless you are referencing some huge header file.
I have a situation where #2 in particular didn't work for me (there are numerous VS build configs where an #ifdef around #include "stdafx.h" does not work). Other solutions were suboptimal because the files themselves were cross-project as well as cross-platform. I did not want to force preprocessor macros to be set, or force Linux or even Windows builds to use (or not use) pch, so...
What I did, given a file named notificationEngine.cpp, for example, was remove the #include "stdafx.h" line entirely and create a new file in the same directory called pchNotificationEngine.cpp with the following contents:
#include "stdafx.h"
#include "notificationEngine.cpp"
Any given project can just include the correct version of the file. This admittedly is probably not the best option for cpp files that are only used by a single project.

How should I detect unnecessary #include files in a large C++ project?

I am working on a large C++ project in Visual Studio 2008, and there are a lot of files with unnecessary #include directives. Sometimes the #includes are just artifacts and everything will compile fine with them removed, and in other cases classes could be forward declared and the #include could be moved to the .cpp file. Are there any good tools for detecting both of these cases?
While it won't reveal unneeded include files, Visual Studio has a setting, /showIncludes (right-click on a .cpp file, Properties -> C/C++ -> Advanced), that will output a tree of all included files at compile time. This can help in identifying files that shouldn't need to be included.
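The output looks roughly like this (paths illustrative); each extra space of indentation is one level of nesting, so you can see which header drags in which:
Note: including file: c:\project\include\foo.h
Note: including file:  c:\project\include\bar.h
Note: including file:   c:\project\include\baz.h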
You can also take a look at the pimpl idiom to let you get away with fewer header file dependencies to make it easier to see the cruft that you can remove.
PC Lint works quite well for this, and it finds all sorts of other goofy problems for you too. It has command line options that can be used to create External Tools in Visual Studio, but I've found that the Visual Lint addin is easier to work with. Even the free version of Visual Lint helps. But give PC-Lint a shot. Configuring it so it doesn't give you too many warnings takes a bit of time, but you'll be amazed at what it turns up.
There's a new Clang-based tool, include-what-you-use, that aims to do this.
!!DISCLAIMER!! I work on a commercial static analysis tool (not PC Lint). !!DISCLAIMER!!
There are several issues with a simple non-parsing approach:
1) Overload Sets:
It's possible that an overloaded function has declarations that come from different files. It might be that removing one header file results in a different overload being chosen rather than a compile error! The result will be a silent change in semantics that may be very difficult to track down afterwards.
2) Template specializations:
Similar to the overload example, if you have partial or explicit specializations for a template you want them all to be visible when the template is used. It might be that specializations for the primary template are in different header files. Removing the header with the specialization will not cause a compile error, but may result in undefined behaviour if that specialization would have been selected. (See: Visibility of template specialization of C++ function)
As pointed out by 'msalters', performing a full analysis of the code also allows for analysis of class usage. By checking how a class is used through a specific path of files, it is possible that the definition of the class (and therefore all of its dependencies) can be removed completely, or at least moved to a level closer to the main source in the include tree.
I don't know of any such tools, and I have thought about writing one in the past, but it turns out that this is a difficult problem to solve.
Say your source file includes a.h and b.h; a.h contains #define USE_FEATURE_X and b.h uses #ifdef USE_FEATURE_X. If #include "a.h" is commented out, your file may still compile, but may not do what you expect. Detecting this programmatically is non-trivial.
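A minimal sketch of that scenario:
// a.h
#define USE_FEATURE_X

// b.h
#ifdef USE_FEATURE_X
typedef int feature_handle_t;   // hypothetical declaration
#endif
Commenting out #include "a.h" silently removes the guarded part of b.h instead of producing an error.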
Whatever tool does this would need to know your build environment as well. If a.h looks like:
#if defined( WINNT )
#define USE_FEATURE_X
#endif
Then USE_FEATURE_X is only defined if WINNT is defined, so the tool would need to know what directives are generated by the compiler itself as well as which ones are specified in the compile command rather than in a header file.
Like Timmermans, I'm not familiar with any tools for this. But I have known programmers who wrote a Perl (or Python) script to try commenting out each include line one at a time and then compile each file.
It appears that now Eric Raymond has a tool for this.
Google's cpplint.py has an "include what you use" rule (among many others), but as far as I can tell, no "include only what you use." Even so, it can be useful.
If you're interested in this topic in general, you might want to check out Lakos' Large Scale C++ Software Design. It's a bit dated, but goes into lots of "physical design" issues like finding the absolute minimum of headers that need to be included. I haven't really seen this sort of thing discussed anywhere else.
Give Include Manager a try. It integrates easily in Visual Studio and visualizes your include paths which helps you to find unnecessary stuff.
Internally it uses Graphviz but there are many more cool features. And although it is a commercial product it has a very low price.
You can build an include graph using C/C++ Include File Dependencies Watcher, and find unneeded includes visually.
If your header files generally start with
#ifndef __SOMEHEADER_H__
#define __SOMEHEADER_H__
// header contents
#endif
(as opposed to using #pragma once) you could change that to:
#ifndef __SOMEHEADER_H__
#define __SOMEHEADER_H__
// header contents
#else
#pragma message("Someheader.h superfluously included")
#endif
And since the compiler outputs the name of the cpp file being compiled, that would let you know at least which cpp file is causing the header to be brought in multiple times.
PC-Lint can indeed do this. One easy way to do this is to configure it to detect just unused include files and ignore all other issues. This is pretty straightforward - to enable just message 766 ("Header file not used in module"), just include the options -w0 +e766 on the command line.
The same approach can also be used with related messages such as 964 ("Header file not directly used in module") and 966 ("Indirectly included header file not used in module").
FWIW I wrote about this in more detail in a blog post last week at http://www.riverblade.co.uk/blog.php?archive=2008_09_01_archive.xml#3575027665614976318.
Adding one or both of the following #defines will exclude often-unnecessary header files and may substantially improve compile times, especially if the code is not using Windows API functions.
#define WIN32_LEAN_AND_MEAN
#define VC_EXTRALEAN
See http://support.microsoft.com/kb/166474
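Note that the defines must appear before <windows.h> is pulled in, e.g. at the top of stdafx.h (a sketch):
// stdafx.h
#define WIN32_LEAN_AND_MEAN   // excludes rarely-used Windows headers
#define VC_EXTRALEAN          // excludes even more (mainly relevant for MFC)
#include <windows.h>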
If you are looking to remove unnecessary #include files in order to decrease build times, your time and money might be better spent parallelizing your build process using cl.exe /MP, make -j, Xoreax IncrediBuild, distcc/icecream, etc.
Of course, if you already have a parallel build process and you're still trying to speed it up, then by all means clean up your #include directives and remove those unnecessary dependencies.
Start with each include file and ensure that it only includes what is necessary to compile itself. Any include files that are then missing from the C++ files can be added to the C++ files themselves.
For each include and source file, comment out each include file one at a time and see if it compiles.
It is also a good idea to sort the include files alphabetically, and where this is not possible, add a comment.
If you aren't already doing so, using a precompiled header to include everything that you're not going to change (platform headers, external SDK headers, or static, already-completed pieces of your project) will make a huge difference in build times.
http://msdn.microsoft.com/en-us/library/szfdksca(VS.71).aspx
Also, although it may be too late for your project, organizing your project into sections and not lumping all local headers to one big main header is a good practice, although it takes a little extra work.
If you work with Eclipse CDT, you could try out http://includator.com to optimize your include structure. However, Includator might not know enough about VC++'s predefined includes, and setting up CDT to use VC++ with the correct includes is not built into CDT yet.
The latest JetBrains IDE, CLion, automatically shows (in gray) the includes that are not used in the current file.
It is also possible to get a list of all the unused includes (and also functions, methods, etc.) from the IDE.
Some of the existing answers state that it's hard. That's indeed true, because you need a full compiler to detect the cases in which a forward declaration would be appropriate. You can't parse C++ without knowing what the symbols mean; the grammar is simply too ambiguous for that. You must know whether a certain name names a class (which could be forward-declared) or a variable (which can't be). Also, you need to be namespace-aware.
Maybe a little late, but I once found a WebKit perl script that did just what you wanted. It'll need some adapting I believe (I'm not well versed in perl), but it should do the trick:
http://trac.webkit.org/browser/branches/old/safari-3-2-branch/WebKitTools/Scripts/find-extra-includes
(this is an old branch because trunk doesn't have the file anymore)
If there's a particular header that you think isn't needed anymore (say string.h), you can comment out that include, then put this below all the includes:
#ifdef _STRING_H_
# error string.h is included indirectly
#endif
Of course, your interface headers might use a different #define convention to record their inclusion in CPP memory. Or no convention at all, in which case this approach won't work.
Then rebuild. There are three possibilities:
It builds OK. string.h wasn't compile-critical, and the include for it can be removed.
The #error trips. string.h was included indirectly somehow. You still don't know whether string.h is required; if it is, you should #include it directly (see below).
You get some other compilation error. string.h was needed and isn't being included indirectly, so the include was correct to begin with.
Note that depending on indirect inclusion when your .h or .c directly uses another .h is almost certainly a bug: you are in effect promising that your code will only require that header as long as some other header you're using requires it, which probably isn't what you meant.
The caveats mentioned in other answers, about headers that modify behavior rather than declaring things that cause build failures, apply here as well.