I am trying to monitor the code coverage of my C++ project. As I stated in a previous question, I need to use coroutines and other advanced C++2a features, so I am using clang++ to compile it. I found out here that one can use the -coverage flag when compiling with clang++ (along, obviously, with -O0 and -g).
Along with the executable file, this produces a .gcno file containing a map of the executable. When the executable is run, an additional .gcda file is generated, containing the actual profiling data.
I noticed that if I run the executable multiple times, the coverage outputs are properly merged into the .gcda file, which is very good.
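For reference, a minimal sketch of that workflow (the file name is arbitrary, and the exact .gcno/.gcda naming depends on the toolchain):

    clang++ -std=c++2a -coverage -O0 -g tests.cpp -o tests   # also emits a .gcno map file
    ./tests   # first run creates the .gcda profiling data
    ./tests   # later runs merge their counters into the same .gcda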
Now, I'm wondering if it's safe to run simultaneously multiple instances of the executable.
Before anyone suggests running the tests sequentially: I am running them sequentially, but my application uses a lot of networking, and some of the tests require multiple instances to communicate with each other (I am using Docker to simulate a network, and netem to get kind-of-realistic link scenarios).
Will running many instances of the same executable together cause any problem? I can imagine that if some locking mechanism is implemented, the coverage data will be safely and atomically written to the .gcda file, and if other executables need to perform the dump they'll wait until the lock is released. However, I couldn't find a guarantee anywhere that this actually happens.
GCOV profiling should be multi-process safe since Clang 7.
In Clang 6 there were two bugs preventing it from working, https://bugs.llvm.org/show_bug.cgi?id=34923 and https://bugs.llvm.org/show_bug.cgi?id=35464, but they are now fixed.
In fact, we are currently using it to collect coverage data for Firefox, which is multi-process: there are multiple processes in Firefox itself, and we also run tests (for some specific test suites) in parallel.
I ran into an "interesting" issue. I am doing build automation for our project in Bamboo. I was quite sure I was done, and then someone asked: why do the builds from Friday night and Saturday night (no code changes) differ?
I have 2 remote builders. Both have the same OS installed (Ubuntu) and were updated to the latest at the same time, and both have the same libraries installed. The compilation happens inside Docker (image imported from the same source).
So I started looking into this and got to the point where I can observe the following:
Running the compilation twice from the same source on the same machine, starting a new Docker container for each compilation, produces identical (as per sha1sum) binaries.
Running the compilation twice from the same source on two different machines produces different binaries.
The source is in a folder of the same name on both machines, and it is mounted into Docker at a path of the same name as well.
Running the compilation on one PC with some HW spec, then taking the disk, connecting it to a different PC (with a different HW spec) and running the compilation from the same source again: the resulting binaries were identical.
Is it possible that the code somehow depends on the exact OS image, or on some other OS attribute? Any other ideas?
It feels like chasing ghosts ...
EDIT: After trying to go step by step, I narrowed this down to cmake. That means: the cmake output generated on two different machines differs (I'm going to start diffing it 1-by-1 now). If I take the cmake results from one machine to several machines and compile with make from there, I always get the same binaries. So I believe the problem is in the way the cmake-generated files are written, not in the compilation itself.
EDIT2: I now know that this is an issue of Qt 5.2.1's rcc being non-deterministic. During cmake, rcc is run and, among other things, calculates some hashes. The difference in those hashes is what makes the whole thing non-deterministic.
If I run cmake once and take its output to 3 different machines and run the compilation (make) there, I get 100% identical (as per sha1sum) results.
I think that settles it here. Now I either need to convince the project to upgrade to a newer version of Qt, where I know how to make rcc deterministic, or I need to find out how to make rcc deterministic in 5.2.1.
The results could be a coincidence. Linking requires that all the required functions go into the executable, but their order is generally not constrained. It could very well depend on the order in which the object files appear on disk.
If you started with identical OS images, then your tests suggest that a post-image configuration step introduced a change. Review that configuration; a trivial example would be the hostname. I would diff any intermediate build artifacts, for instance config.h or wherever cmake writes its output. Try running strings on two binaries that differ, then diff that output to see if you learn something.
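For example, a quick comparison could look like this (paths are hypothetical):

    strings machine-a/app > a.txt
    strings machine-b/app > b.txt
    diff a.txt b.txt                          # embedded paths, hostnames or hashes tend to show up here
    diff -r machine-a/build machine-b/build   # same idea for the intermediate artifacts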
https://tests.reproducible-builds.org/debian/index_issues.html might be very instructive, too.
I need to do unit testing for drivers on an ARM-based board with the help of the gcov tool. When gcov is used on an x86 architecture, it creates a .gcda file after executing the program. But on an ARM-based board, the .gcda files are not getting created, so without them I can't use the gcov tool. My question is: how do I use gcov with cross-compilation? Thanks in advance.
gcov's code/data structures are tied to the host filesystem, and cross-compiler toolchains do not have any port or configuration to change this behavior. If your object file is ~/my-project/abc.o, then the gcov in-memory data structures created/updated by the instrumented code point to ~/my-project/abc.gcda, and all these paths are on your host machine. The instrumented code running on the remote system (in your case the ARM board) cannot access these paths, and this is the main reason you don't see the .gcda files in the ARM board case.
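A minimal sketch of why this happens (the toolchain name and paths are only examples):

    arm-none-eabi-gcc --coverage -O0 -g -c ~/my-project/abc.c -o ~/my-project/abc.o
    # abc.gcno is written on the host next to abc.o, and the absolute path
    # ~/my-project/abc.gcda is recorded in the instrumented object, so the code
    # running on the ARM board tries to open a path that only exists on the host.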
For a general method of getting the .gcda files that works around the above issue, see https://mcuoneclipse.com/2014/12/26/code-coverage-for-embedded-target-with-eclipse-gcc-and-gcov/. This article presents a hacky method of breaking into the gcov functions and manually dumping the gcov data structures into on-host .gcda files.
I used the above-mentioned blog to do code coverage for my ARM project. However, I faced another issue: a gcc bug in my version of the toolchain (the GNU ARM toolchain version available in October/November 2016), where you cannot break into the gcov functions and complete the process described in the blog above, because the relevant gcov functions hang in an infinite loop. You may or may not face this issue, because I am not sure whether the bug has been fixed. In case you face it, a solution is available in my blog https://technfoblog.wordpress.com/2016/11/05/code-coverage-using-eclipse-gnu-arm-toolchain-and-gcov-for-embedded-systems/.
I've been using Boost Test for a long time now, and I've ended up with my tests running too slowly. As each test is highly parallelizable, I want them to run concurrently on all my cores.
Is there a way to do that using the Boost Test Library? I didn't find any solution. I tried to look at how to write a custom test runner, but I didn't find much documentation on that point :(
If there is no way, does someone know a good C++ test framework to achieve that goal? I was thinking that Google Test would do the job, but apparently it cannot run tests in parallel either. Even if the framework has fewer features than other better-known frameworks, that is not a problem; I just need simple assertions and multi-threaded execution.
Thanks
You could use CTest for this.
CTest is the test driver which accompanies CMake (the build system generator), so you'd need to use CMake to create the build system using your existing files and tests, and in doing so you would then be able to use CTest to run the test executables.
I haven't personally used Boost.Test with CMake (we use GoogleTest), but this question goes into a little more detail on the process.
Once you have the tests added in your CMakeLists file, you can make use of CTest's -j argument to specify how many jobs to run in parallel.
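A minimal sketch of that workflow, assuming the test executables have been registered with add_test() in the CMakeLists file (directory names are arbitrary):

    mkdir build && cd build
    cmake ..
    cmake --build .
    ctest -j 8    # runs up to 8 of the registered tests in parallel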
What Google is hinting at in the gtest documentation is test sharding: letting multiple machines run the tests by just using command-line parameters and environment variables. You could run them all on one machine in separate processes, setting the GTEST_SHARD_INDEX and GTEST_TOTAL_SHARDS environment variables appropriately.
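A minimal sketch of sharding on a single machine (the binary name is hypothetical):

    GTEST_TOTAL_SHARDS=4 GTEST_SHARD_INDEX=0 ./my_tests &   # each process runs a disjoint subset
    GTEST_TOTAL_SHARDS=4 GTEST_SHARD_INDEX=1 ./my_tests &
    GTEST_TOTAL_SHARDS=4 GTEST_SHARD_INDEX=2 ./my_tests &
    GTEST_TOTAL_SHARDS=4 GTEST_SHARD_INDEX=3 ./my_tests &
    wait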
In principle, nothing prevents you from starting multiple processes of the test executable, each with a different filtering parameter (Boost.Test, gtest).
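For example (the suite and test names are hypothetical):

    ./boost_tests --run_test=network_suite &     # Boost.Test filter
    ./boost_tests --run_test=parser_suite &
    ./gtest_tests --gtest_filter='Network*' &     # gtest filter
    ./gtest_tests --gtest_filter='-Network*' &    # everything else
    wait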
Update 2014: https://github.com/google/gtest-parallel
Split the suite into multiple smaller sets, each launched by an individual binary, and add a .PHONY target test to your build system that depends on all of them, as sketched below. Run it as (assuming you are using make) make -jN test.
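A minimal sketch of such a setup, assuming GNU make (the binary names are hypothetical):

    .PHONY: test test_a test_b test_c
    test: test_a test_b test_c
    test_a: ; ./tests_a
    test_b: ; ./tests_b
    test_c: ; ./tests_c

With this, make -j3 test runs the three test binaries in parallel.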
Given that the third bullet point on the open issues list is currently thread safety, I don't believe there's a way to tell Boost test to run tests in multiple threads.
Instead, you will need to find an external test runner that supports running tests in parallel (I would expect this to work by forking off new processes).
I always use the terms compile and build interchangeably.
What exactly do these terms stand for?
Compiling is the act of turning source code into object code.
Linking is the act of combining object code with libraries into a raw executable.
Building is the sequence composed of compiling and linking, with possibly other tasks such as installer creation.
Many compilers handle the linking step automatically after compiling source code.
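For example, the two steps can be run separately or left to the compiler driver (file names are arbitrary):

    g++ -c main.cpp -o main.o        # compile: source -> object code
    g++ -c util.cpp -o util.o
    g++ main.o util.o -o app         # link: object files (+ libraries) -> executable
    g++ main.cpp util.cpp -o app     # or let the driver compile and link in one command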
From Wikipedia:
In the field of computer software, the term software build refers either to the process of converting source code files into standalone software artifact(s) that can be run on a computer, or the result of doing so. One of the most important steps of a software build is the compilation process where source code files are converted into executable code.
While for simple programs the process consists of a single file being compiled, for complex software the source code may consist of many files and may be combined in different ways to produce many different versions.
A build could be seen as a script comprising many steps, the primary one of which would be compiling the code.
Others could be
running tests
reporting (e.g. coverage)
static analysis
pre and post-build steps
running custom tools over certain files
creating installs
labelling them and deploying/copying them to a repository
They often are used to mean the same thing. However, "build" may also mean the full process of compiling and linking a whole application (in the case of e.g. C and C++), or even more, including, among others
packaging
automatic (unit and/or integration) testing
installer generation
installation/deployment
documentation/site generation
report generation (e.g. test results, coverage).
There are systems like Maven, which generalize this with the concept of lifecycle, which consists of several stages, producing different artifacts, possibly using results and artifacts from previous stages.
From my experience, I would say that "compiling" refers to the conversion of one or several human-readable source files into object code (object files in C), while "building" denotes the whole process of compiling, linking and whatever else needs to be done for an entire package or project.
Most people would probably use the terms interchangeably.
You could see one nuance: compiling is only the step where you pass some source file through the compiler (gcc, javac, whatever).
Building could be understood as the more general process of checking out the source, creating a target folder for the compiled artifacts, checking dependencies, choosing what has to be compiled, running automated tests, creating tar/zip distributions, pushing to an FTP server, etc.
Why when I watch the build output from a VC++ project in VS do I see:
1>Compiling...
1>a.cpp
1>b.cpp
1>c.cpp
1>d.cpp
1>e.cpp
[etc...]
1>Generating code...
1>x.cpp
1>y.cpp
[etc...]
The output looks as though several compilation units are being handled before any code is generated. Is this really going on? I'm trying to improve build times, and by using pre-compiled headers, I've gotten great speedups for each ".cpp" file, but there is a relatively long pause during the "Generating Code..." message. I do not have "Whole Program Optimization" nor "Link Time Code Generation" turned on. If this is the case, then why? Why doesn't VC++ compile each ".cpp" individually (which would include the code generation phase)? If this isn't just an illusion of the output, is there cross-compilation-unit optimization potentially going on here? There don't appear to be any compiler options to control that behavior (I know about WPO and LTCG, as mentioned above).
EDIT:
The build log just shows the ".obj" files in the output directory, one per line. There is no indication of "Compiling..." vs. "Generating code..." steps.
EDIT:
I have confirmed that this behavior has nothing to do with the "maximum number of parallel project builds" setting in Tools -> Options -> Projects and Solutions -> Build and Run. Nor is it related to the MSBuild project build output verbosity setting. Indeed if I cancel the build before the "Generating code..." step, none of the ".obj" files will exist for the most recent set of "compiled" files. This implies that the compiler truly is handling multiple translation units together. Why is this?
Compiler architecture
The compiler does not generate code from the source directly; it first compiles it into an intermediate form (see compiler front-end) and then generates code from that intermediate form, including any optimizations (see compiler back-end).
Visual Studio compiler process spawning
In a Visual Studio build, the compiler process (cl.exe) is executed to compile multiple source files that share the same command-line options in one command. The compiler first performs the "compilation" sequentially for each file (this is most likely the front-end), but "Generating code" (probably the back-end) is done together for all the files once the compilation step is done with them.
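A rough sketch of such an invocation (the options shown are illustrative, not the exact ones the IDE passes):

    cl.exe /c /EHsc /Zi /Od a.cpp b.cpp c.cpp d.cpp e.cpp

A single cl.exe process is given several sources; both the per-file "Compiling..." lines and the batched "Generating code..." phase come from that one invocation.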
You can confirm this by watching cl.exe with Process Explorer.
Why code generation for multiple files at once
My guess is that code generation is done for multiple files at once to make the build process faster, as it includes some work which can be done only once for multiple sources, like instantiating templates; there is no point in instantiating them multiple times, as all instances but one would be discarded anyway.
Whole program optimization
In theory it would be possible to perform some cross-compilation-unit optimization at this point as well, but it is not done - no such optimizations are ever done unless enabled with /LTCG, and with LTCG the whole code generation is done for the whole program at once (hence the name Whole Program Optimization).
Note: it seems as if WPO is done by the linker, as it produces the exe from the obj files, but this is a kind of illusion - the obj files are not real object files, they contain the intermediate representation, and the "linker" is not a real linker: it is not only linking the existing code, it is generating and optimizing code as well.
It is neither parallelization nor code optimization.
The long "Generating Code..." phase for multiple source files goes back to VC6. It occurs independent of optimizations settings or available CPUs, even in debug builds with optimizations disabled.
I haven't analyzed in detail, but my observations are: They occur when switching between units with different compile options, or when certain amounts of code has passed the "file-by-file" part. It's also the stage where most compiler crashes occured in VC6 .
Speculation: I've always assumed that it's the "hard part" that is improved by processing multiple items at once, maybe just the code and data loaded in cache. Another possibility is that the single step phase eats memory like crazy and "Generating code" releases that.
To improve build performance:
Buy the best machine you can afford
It is the fastest, cheapest improvement you can make (unless you already have one).
Move to Windows 7 x64, buy loads of RAM, and an i7 860 or similar. (Moving from a Core2 dual core gave me a factor of 6..8, building on all CPUs.)
(Don't go cheap on the disks, either.)
Split into separate projects for parallel builds
This is where 8 CPUs (even if only 4 physical + HT) with loads of RAM come into play. You can enable per-project parallelization with the /MP option, but this is incompatible with many other features.
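A rough command-line equivalent (options and file names are illustrative):

    cl /MP4 /c a.cpp b.cpp c.cpp d.cpp    # up to 4 source files compiled concurrently

The incompatibilities mentioned above include, for example, minimal rebuild (/Gm).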
At one time, compilation meant parsing the source and generating code. Now, though, compilation means parsing the source and building up a symbolic database representing the code. The database can then be transformed to resolve references between symbols. Later on, the database is used as the source to generate code.
You haven't got optimizations switched on. That will stop the build process from optimizing the generated code (or at least hint that optimizations shouldn't be done... I wouldn't like to guarantee no optimizations are performed). However, the build process is still optimized. So, multiple .cpp files are being batched together to do this.
I'm not sure how the decision is made as to how many .cpp files get batched together. Maybe the compiler keeps processing files until it decides the database has grown large enough that, if it grew any more, the system would have to start paging data in and out to disk excessively, and the performance gains of batching any more .cpp files would be negated.
Anyway, I don't work for the VC compiler team, so can't answer conclusively, but I always assumed it was doing it for this reason.
There's a new write-up on the Visual C++ Blog that details some undocumented switches that can be used to time/profile various stages of the build process (I'm not sure how much, if any, of the write-up applies to versions of MSVC prior to VS2010). Interesting stuff which should provide at least a little insight into what's going on behind the scenes:
http://blogs.msdn.com/vcblog/archive/2010/04/01/vc-tip-get-detailed-build-throughput-diagnostics-using-msbuild-compiler-and-linker.aspx
If nothing else, it lets you know what processes, dlls, and at least some of the phases of translation/processing correspond to which messages you see in normal build output.
It parallelizes the build (or at least the compile) if you have a multicore CPU
edit: I am pretty sure it parallelizes in the same way as "make -j": it compiles multiple .cpp files at the same time (since .cpp files are generally independent), but obviously links them only once.
On my core-2 machine it is showing 2 devenv jobs while compiling a single project.