CMake + Ninja build does not parallelize across libraries

We have a (C++) program that's set up as a series of shared libraries with a nested hierarchy. That is, libB.so uses functions from and thus links against libA.so, libC.so uses functions from and links against both libB.so and libA.so, etc.
For our current CMake+Ninja build system I've noticed that parallel building does not seem to happen across libraries. That is, while Ninja will normally use 12 cores to build, if I've touched a single source file from libA but multiple in libC, Ninja will use only a single processor to build just the libA source file until libA.so is linked, at which point it will use 12 processors to compile the libC source files. And if there's a compilation error in the libA source, it won't even try to compile the libC files into object files, even if I pass -k to ninja.
Certainly, the linking of libC.so needs to be delayed until libA.so is linked, but compiling the libC sources into object files shouldn't need to wait on the linking of libA.
Is there something I'm missing about the optimal way to set up our CMake files to express the dependencies between the libraries? Or is this an insurmountable limitation of how Ninja works?

This was asked recently on the CMake mailing list. The response from one of the devs confirms that the behaviour you report is intentional:
This is unfortunately necessary to get correct builds for CMake projects in general, because we support cases where add_custom_command is used in library "foo" to generate a header file that is included during compilation in library "bar" that links to "foo", but we have no good way to express this dependency besides the ordering dependency of bar on foo.
While it may be possible for CMake to be improved to relax that constraint in some circumstances, this doesn't appear to have been done yet (as of CMake 3.6). There's already an open ticket for this in the Kitware issue tracker.
Update: This appears to have been fixed in CMake 3.9.0, although that change introduced a regression which was subsequently fixed in 3.12.0, or possibly 3.11.2.
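If you are stuck on an older CMake in the meantime, one hedged workaround is to route sources through OBJECT libraries: an object library has no link step, so (as long as its sources don't include headers generated by the other target) its compile jobs carry no ordering dependency on libA.so. A sketch using the question's library names (the source file names are made up):
# libA as before
add_library(A SHARED a1.cpp a2.cpp)
# compile libC's sources without a link step, so they can build alongside libA's
add_library(C_objects OBJECT c1.cpp c2.cpp)
# only the final link of libC.so has to wait for libA.so
add_library(C SHARED $<TARGET_OBJECTS:C_objects>)
target_link_libraries(C PRIVATE A)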

Related

Is there a way to interleave static library and dependent dll compilation (MSVC)?

When compiling multiple libraries with MSVC, the compilation process can execute in parallel for each library.
On the other hand, if I have a DLL that links together a handful of static libraries, the DLL compilation process must wait until all the libraries finish their compilation, at least with the workflow I'm currently using.
I know of 2 possible options to specify this type of link dependency with MSVC:
1. Add the libraries as references in the DLL project.
2. Link the libraries manually, either via Property Pages -> Input -> Additional Dependencies or via a pragma statement such as #pragma comment(lib, "mylib.lib").
When using option 1, Visual C++ waits until all the libraries are compiled before it initiates the dll compilation. In my view this is a waste since these are only link time dependencies.
When using the second option, I am not sure that Visual C++ will wait until the dependencies compilation is complete before attempting to link them.
So is it possible to specify that a static library dependency is a link time dependency only and interleave static libraries and dependent dll compilation?
Unless you have edited the library's source code, MSVC doesn't compile it again. But if you have edited it, then obviously it has to be rebuilt one way or the other (well, if you want the changes to be in the build). If you edit just the dependent, the dependencies won't be recompiled (unless we're talking about circular dependencies).
When using option 1, Visual C++ waits until all the libraries are compiled before it initiates the dll compilation. In my view this is a waste since these are only link time dependencies.
Yes, you have to build the dependencies before compiling the dependent. What if the dependent uses something you added to a dependency recently, after the prior build? You'd have to compile that dependency again first, right?
When using the second option, I am not sure that Visual C++ will wait until the dependencies compilation is complete before attempting to link them.
No, Visual Studio doesn't check whether the library exists before compiling the dependent. In that case you have to make sure yourself that the dependencies are compiled before the dependent. And again, if you have something new in a dependency, you'll have to compile it first.
So is it possible to specify that a static library dependency is a link time dependency only and interleave static libraries and dependent dll compilation?
That's how static libraries work. Libraries are linked at link time, not compile time. The way Visual Studio builds the dependencies is by maintaining a dependency graph. When building, it checks whether a library needs to be recompiled and, if not, leaves it alone.
MSVC also doesn't recompile .obj files whose sources haven't changed. So if you want to speed up build times, move (movable) definitions from headers into source files, so that only that specific source file is recompiled rather than every source file that includes the header.
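For instance (file and function names here are made up for illustration):
// widget.h - declaration only; editing the function body no longer touches this header
int compute_total(int a, int b);

// widget.cpp - the definition lives here, so a change to it recompiles
// widget.cpp alone, not every file that includes widget.h
#include "widget.h"
int compute_total(int a, int b) { return a + b; }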
The bottom line is, if you're working with a lot of libraries, or even one big library, all the edited sources will be recompiled. Running the build in parallel is the only real way to speed up compilation, or you could invest in a dedicated machine that compiles for you (a build server or similar).
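For reference, the usual MSVC parallelism knobs (hedged, but both have existed for many Visual Studio versions): msbuild /m builds independent projects in parallel, and the compiler's /MP switch compiles the sources within one project in parallel (set via the project's C/C++ -> Command Line -> Additional Options). The solution name below is a placeholder:
msbuild MySolution.sln /m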

Using serial HDF5 C++ with CMake

I want to use the HDF5 C++ bindings in a project built with CMake. So I do the usual:
find_package (HDF5 REQUIRED COMPONENTS CXX)
target_link_libraries(foo PUBLIC ${HDF5_LIBRARIES})
target_include_directories(foo PUBLIC ${HDF5_INCLUDE_DIRS})
This used to work till our cluster (HPC) was upgraded.
Now I get errors during linking:
function MPI::Win::Set_name(char const*): error: undefined reference to 'MPI_Win_set_name'
function MPI::Win::Set_attr(int, void const*): error: undefined reference to 'MPI_Win_set_attr'
Although the versions of HDF5 did not change, the new one seems to require linking against MPI, which CMake neither tells me about nor does automatically.
Am I missing anything? Is the CMake FindHDF5 module flawed, or am I required to link against MPI manually when HDF5_IS_PARALLEL is set? How is it possible that I now need to link MY application against MPI?
Some checks I did:
ldd on both hdf5 libraries shows libmpi
there is no -lmpi on either system for my app
HDF5 1.10.1 is used on both, each built against OpenMPI 2.1.2 with GCC 6.4.0
mpicxx -show shows different output: the new one includes -lmpi_cxx, the old one does not.
h5c++ -show seems to be the same (some other paths of course)
TL;DR: When HDF5_IS_PARALLEL is true, one needs to link against MPI even when not using it directly.
The HDF5 compiler wrapper calls the MPI compiler wrapper, which would add that automatically, but the CMake module does not follow this path. Read on for how I found that, which might help with similar issues.
I found the solution by isolating the compiler commands invoked down to the bare minimum. Using grep I found references to MPI_* already in the object file compiled from the C++ source. This eliminated any relation to the libraries being linked, so only differences in the includes were possible. I compared the new and old HDF5 include directories with diff -qr and found them to be the same.
Being sure it was the headers, I examined the preprocessed files (g++ -E). I compared the new one against the old with vimdiff after some extra steps (replacing the header include paths that changed from the old to the new system, to keep the clutter to a minimum). Searching for mpi, I found the only difference to be the inclusion of mpi_cxx. This is done by mpi.h, which was also easily visible from the preprocessed output.
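Roughly, the commands involved in that narrowing-down (foo.cpp/foo.o and the include paths are placeholders, not the actual project files):
# undefined MPI symbols already present in the object file
nm foo.o | grep ' U MPI_'
# preprocess against each system's HDF5 includes and diff the results
g++ -E -I/path/to/new/hdf5/include foo.cpp > new.ii
g++ -E -I/path/to/old/hdf5/include foo.cpp > old.ii
vimdiff old.ii new.ii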
Checking the MPI installation on both systems confirmed that the new one was built with mpi_cxx support (the C++ bindings for MPI were added in MPI-2 and removed in MPI-3, but still seem to be optionally available) and the old one without it.
As the C headers contain only declarations, no references to them end up in the object file, but the C++ bindings contain definitions. This caused the references to land in the object file and later fail to resolve during linking.
Searching around all I found was "Parallel HDF5 IO requires MPI" but nothing related to CMake. https://www.hdfgroup.org/HDF5/release/cmakebuild.html is also pretty sparse about this and does not mention HDF5_IS_PARALLEL (they do mention that they don't provide that find module).
Given that, I ended up adding an interface target, setting the includes and libraries from HDF5 on it, checking HDF5_IS_PARALLEL, and adding the MPI includes and libraries to that target when it is set.
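A minimal sketch of that interface target (target names are placeholders; assumes CMake 3.9+, where FindMPI provides the MPI::MPI_CXX imported target carrying both the MPI includes and libraries):
find_package(HDF5 REQUIRED COMPONENTS CXX)
add_library(hdf5_interface INTERFACE)
target_include_directories(hdf5_interface INTERFACE ${HDF5_INCLUDE_DIRS})
target_link_libraries(hdf5_interface INTERFACE ${HDF5_LIBRARIES})
if(HDF5_IS_PARALLEL)
  # parallel HDF5 headers pull in mpi.h (and possibly the MPI C++ bindings),
  # so consumers must see the MPI includes and link MPI even if they never call it
  find_package(MPI REQUIRED)
  target_link_libraries(hdf5_interface INTERFACE MPI::MPI_CXX)
endif()
target_link_libraries(foo PUBLIC hdf5_interface)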

How to package C++ dependencies on Linux

I'm developing a c++ program on Ubuntu 16.04 using cmake, compiling with g++5 and clang++-3.8.
Now I'd like to make this program available for 14.04 too, but as I'm using a lot of C++14 features I can't just recompile it on that system. Instead, I wanted to ask if/how it is possible to package all dependencies (in particular the C++ standard library) in a way that I can just unpack a folder on the target system and run the app.
Ideally I'm looking for some automated/scripted solution that I can add to my cmake build.
Bonus Question:
For now, this is just a simple command line program for which I can easily recompile all 3rd party dependencies (and in fact I do). In the long run however, I'd also like to port a QT application. Ideally the solution would also work for that scenario.
The worst part of your conditions is the incompatible standard library.
You have to link it statically anyway (see comments to your answer).
A number of options:
Completely static linking:
I think it's the easiest way for you, but it requires that you can build (or obtain by some means) all third-party libs as static libraries. If you can't for some reason, this option is out.
You just build your app as usual and then link it with all the libs you need statically (see the documentation for your compiler). Thus you get a completely dependency-free executable; it will work on any ABI-compatible system (you may need to check whether an x86 executable works on x86_64).
Partially static linking
You link statically everything you can and link the rest dynamically. You then distribute all the dynamic libs (*.so) along with your app (in a path/to/app/lib or path/to/app/ folder), so you don't depend on the system's libraries. Create your deb package so it brings all the files into an /opt or $HOME/appname folder. You have to make the dynamic libs load either "by hand" or by asking the linker to record their location at link stage (see the documentation).
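A hedged CMake sketch of both options above (the target name myapp is a placeholder; the flags assume GCC or Clang, and target_link_options needs CMake 3.13+, otherwise append the flags to CMAKE_EXE_LINKER_FLAGS):
# statically link just the language runtime, so the old system's libstdc++ no longer matters
target_link_options(myapp PRIVATE -static-libstdc++ -static-libgcc)
# for the remaining shared libs shipped alongside the app, have the loader
# search <directory of the executable>/lib at runtime
set_target_properties(myapp PROPERTIES
  BUILD_WITH_INSTALL_RPATH TRUE
  INSTALL_RPATH "$ORIGIN/lib")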
Docker container
I don't know much about it, but I know it requires that Docker be installed on the target system (so it's probably not your option).
Useful links:
g++ link options
static linking manual
Finding Dynamic or Shared Libraries
There are similar docs for clang, google it.

Linking libraries in C++

I have a C++ file a.cpp with the library dependency in the path /home/name/lib and the name of the library abc.so.
I do the compilation as follows:
g++ a.cpp -L/home/name/lib -labc
This compiles the program with no errors.
However while running the program, I get the ERROR:
./a.out: error while loading shared libraries: libabc.so.0: cannot open shared object file: No such file or directory
However if before running the program, I add the library path as
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/name/lib;
and compile and run now, it works fine.
Why am I not able to link the library by giving it from the g++ command?
Because shared object libraries are loaded at runtime - you either need to add the location to the library search path (as you've done), place the library somewhere within an already-searched path (like /usr/lib), or add a symbolic link from an existing library path to the location where the library exists.
If you want to compile the library in so that there is no runtime dependency, you'll need a static library (which would be abc.a). Note that this has numerous downsides, however: if the library is updated, you'll need to recompile your executable to incorporate that update.
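For example, the symbolic-link route might look like this (path and soname taken from the question; ldconfig rebuilds the loader's cache):
sudo ln -s /home/name/lib/libabc.so.0 /usr/lib/libabc.so.0
sudo ldconfig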
Why am I not able to link the library by giving it from the g++ command?
You are able to link, and you did link the library successfully; otherwise you would not have been able to build the executable (in your case a.out). The problem is that you are mixing two different things: linking against shared libraries and loading them at runtime. Loading shared libraries is a fairly complex topic and is described pretty well in the Program-Library-HOWTO; read from section 3.2.
You are linking dynamically, which is the default behavior with GCC. LD_LIBRARY_PATH is used to specify directories where to look for libraries (it is a way of enforcing the use of a specific library); read the Program-Library-HOWTO for more info. There is also the ld option -rpath, which specifies a library search path for the binary being compiled (this info is written into the binary and only used for that binary, whereas LD_LIBRARY_PATH affects other apps using the same library, which may be expecting a newer or older version).
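A hedged example of the -rpath variant, which bakes the search path into a.out so that no LD_LIBRARY_PATH export is needed (same paths as in the question; -Wl, forwards the option through g++ to the linker):
g++ a.cpp -L/home/name/lib -labc -Wl,-rpath,/home/name/lib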
Linking statically is possible too (but a little tricky), and then no runtime dependency is required. However, it is sometimes not recommended, because it prevents updating the dependent libraries, e.g. for security reasons: with static linking you are always using the versions of the libraries that existed when you compiled the binary.

g++ linking .so libraries that may not be compiled yet

I'm helping with a C++ application. The application is very large and is spread out across different subdirectories. It uses a script to auto-generate Qt .pro files for each project directory and then uses qmake to generate makefiles. Currently the libraries are being compiled in alphabetical order, which is obviously causing linking errors when a library being linked against isn't built yet. Is there some kind of g++ flag I can set so it won't error out if a library it's trying to link hasn't been built yet? Or a way to make it build dependencies first through the Qt .pro files?
NOTE:
This script works fine on Ubuntu 10.10 because there the statements to build the shared libraries didn't require that I use -l(libraryname) to link to my other libraries, but Ubuntu 11.10 does require it, so it was giving me undefined reference errors when compiling on 11.10.
Have you looked into using Qt Creator as a build environment and IDE? I've personally never used it for development on Ubuntu, but I have used it on Windows with g++, and it works great there. And it appears it's already available as a package in the repository.
Some of the advantages you get by using it are:
Qt Creator will (generally) manage the .pro files for you. (If you're like me, you can still add lots of extra stuff here, but it will automatically add .cpp, .h, and .ui files as they are added to the project.)
You can set up inter-project dependencies that will build projects in whatever order they need to link.
You can use its integration with gdb to step through and debug code, as well as jump to the code.
You get autocomplete on Qt signals and slots, as well as inline syntax highlighting and some error checking.
If you're doing GUIs, you can use the integrated designer to visually layout and design your forms.
Referring back to your actual question, I don't think a flag can tell gcc not to error when a link fails, simply because there is no way for the linker to lazily link libraries. If it's linking against static libraries (.a), it needs to actually copy the implementation of that code into the executable/library. If it's linking dynamically (.so), it still needs to verify that the required functions actually exist in the library. If it can't link during the link step, when could it?
As a bit of an afterthought: if there are cyclic dependencies in your compile process (A depends on B, B on C, and C on A), then you might need to have a fake version of one library built first, one that has only empty stubs for the implementation of each function but the full definition of each class or object. Then build everything else while linking against that, and at the end build the real version of the fake library and link it against all the other libraries that were already linked. I think this would only work with dynamic linking, though.
You could use a subdirs project to have control over the build order (no matter whether the other dev wants it or not :) ).
E.g.
build_all.pro
TEMPLATE=subdirs
CONFIG+=ordered
SUBDIRS=lib2/lib2.pro lib1/lib1.pro app/app.pro
The lib1.pro, lib2.pro, ... are your generated .pro files.
Then run qmake once for build_all.pro and also run make in that directory. This will build lib2 before lib1, and then app.
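If the full serialization from CONFIG+=ordered is too coarse (it forbids any cross-library parallelism, which is the same problem as in the first question above), the subdirs template also supports explicit per-subproject dependencies; a sketch with the same layout (the .file lines map each name to its generated .pro):
TEMPLATE = subdirs
SUBDIRS = lib2 lib1 app
lib2.file = lib2/lib2.pro
lib1.file = lib1/lib1.pro
app.file = app/app.pro
lib1.depends = lib2
app.depends = lib1 lib2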