I am trying to come up with a build system for a large project that has a structure like this:
├── deploy
├── docs
├── libs
│   ├── lib1
│   └── lib2
├── scripts
└── tools
    ├── prog1
    └── prog2
My problem is that the libs and programs may depend on each other in an order that may change very often. I have looked for information on the internet and I get mixed messages: some people say that I should just use makefiles, and others say that recursive makefiles are evil.
Besides that, I don't know how to manage dependencies with makefiles (if lib2 needs lib1 to build, how do I tell make to build lib1 first?). Someone told me that this is easily achievable using the VPATH variable, but I have read the documentation and I think it has nothing to do with this.
So my questions are:
Should I really avoid makefiles in this case? If so, what are the alternatives?
If I should use makefiles, how can I manage dependencies automatically and efficiently?
I am in the "don't lie to make" camp. Here, fine-grained dependencies are your friend.
Get away from thinking that you need to express that prog1 depends on lib2. What you really want to say to make is something like: "To link prog1 you need all the .o files, and all the appropriate .a files." If you can express this, it will improve a number of very important parameters for your build:
Parallel builds work properly. The .o files for prog1 can compile at the same time as those for lib2.a. This matters even more if you run tests as part of the build (you do do that, don't you?).
Time-to-do-nothing is very close to zero. There is nothing worse than issuing a build and, several minutes later, getting a "build was already up-to-date" (WinCE, I'm looking at you).
Work is properly culled. In other words, you can rely on your dependencies and never have to do a make clean because you don't trust them.
After all, these goals are really the whole point of using make. Get them wrong and you might as well use a batch file. (It's pretty tricky getting them right in a recursive make system, by the way.)
Of course, expressing this cleanly in make does take a bit of effort, but you did say you had a large project.
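To make this concrete, here is a minimal sketch of what those fine-grained rules can look like in a single top-level Makefile; the file names and directory layout are assumptions based on the tree in the question, and recipe lines must be indented with a tab:

# Sketch only: one top-level Makefile, fine-grained prerequisites, no recursion.
LIB2_SRCS  := $(wildcard libs/lib2/*.cpp)
LIB2_OBJS  := $(LIB2_SRCS:.cpp=.o)
PROG1_SRCS := $(wildcard tools/prog1/*.cpp)
PROG1_OBJS := $(PROG1_SRCS:.cpp=.o)

libs/lib2/lib2.a: $(LIB2_OBJS)
	$(AR) rcs $@ $^

# prog1 depends on its own objects and on the archives it actually links,
# not on a phony "lib2" target.
tools/prog1/prog1: $(PROG1_OBJS) libs/lib2/lib2.a
	$(CXX) -o $@ $^ $(LDFLAGS)

%.o: %.cpp
	$(CXX) $(CXXFLAGS) -c -o $@ $<

Because prog1's prerequisites are the actual files it links against, make can schedule lib2's objects and prog1's objects in parallel, and an up-to-date tree gives an immediate "nothing to be done".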
I'm not in the "recursive makefiles are evil" camp. While it's true that recursive makefiles are not as efficient (they don't maximize concurrency, etc.), the difference is usually negligible. On the flip side, recursive makefiles make your project more scalable: it's simpler to move modules in and out, share them with other groups, and so on.
In your case, if lib2 depends on lib1, you can just do:
lib2: lib1
	$(MAKE) -C libs/lib2
Now make will build lib1 before lib2 (even if you have -j specified).
Now if you have cross-dependencies (something in lib1 depends on something in lib2, which in turn depends on something else in lib1), then you might want to consider either restructuring your directories to avoid the cross-dependencies, or using a single makefile (there are lots of sharp sticks exposed if you try to use recursive makefiles in this case).
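For completeness, a hypothetical top-level Makefile for the layout in the question might look like this; each sub-make is invoked via $(MAKE) so that -j and other flags propagate, and the phony targets encode the build order (recipe lines need a tab):

.PHONY: all lib1 lib2 prog1 prog2

all: prog1 prog2

lib1:
	$(MAKE) -C libs/lib1

lib2: lib1
	$(MAKE) -C libs/lib2

prog1: lib1 lib2
	$(MAKE) -C tools/prog1

prog2: lib2
	$(MAKE) -C tools/prog2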
Disclaimer: I'm rather new to C++ development and handling bigger projects, so this might be the wrong approach or tool; I am very open to different ideas.
I want to be able to provide a sort of package that is a collection of prebuilt libraries / binaries for different platforms, to be used with our other software.
I need to be able to add or remove targets independently without breaking anything. (as in, adding a new library should be as simple as creating a new directory libname and configuring a CMakePresets.json inside)
Ideally, my thoughts were:
create a repository with build instructions for each dependency
have a CI/CD pipeline building all the different versions we need (linux x64, linux ARM, windows)
provide a platform to download specific versions
So, what I had in mind was something like this:
├── A
│   └── CMakePresets.json
├── B
│   └── CMakePresets.json
├── C
│   └── CMakePresets.json
└── CMakeLists.txt (or something like a Python script)
A, B and C are my dependencies, and I want to build each of them the way I need via a script in the root directory.
I have spent a bit of time trying to figure out a way of doing this cleanly and cross-platform, but to no avail.
I thought of using a CMakeLists.txt because of the FetchContent / ExternalProject_Add commands but haven't really found a way to pass variables that is not tedious.
This is very frustrating, because this seems like something that should be relatively common, but I feel like I'm missing something...
Perhaps I should be using something like a Python script for some of the tasks (for example cloning the sources, copying the presets into the new directories and building from there), but I really liked the idea of doing everything with CMake, considering it handles a lot of the things I want (cloning a specific git tag, etc.).
Thank you
You have just described the goal and approach of Conan. It interfaces well with CMake, and uses your "build recipe" approach. You, and Conan, recognize that C++ packages are inherently different from, say, Python or Javascript, in that they have endless variations due to compiler version, libc version, build configuration, etc. The solution is to provide the build instructions, and cache the built result at multiple layers: local machine, private server, public server.
The result is that you specify the packages with the versions you want. Each package is simply downloaded if a prebuilt binary matches the configuration you have specified; otherwise it is built and cached. With the right CI setup, you can upload the results to your internal server so that most of the time the necessary packages are all pre-built.
Last time I tried it, I struggled with a few things like transitive dependencies, and you might find yourself maintaining your own internal branches of the build recipes for all the packages you need, so that you can control those transitive dependencies.
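For illustration only, the consumer side can be as small as a conanfile.txt next to your CMakeLists.txt; the package names and versions below are placeholders, and this assumes Conan 2 with its CMake generators:

[requires]
fmt/10.2.1
zlib/1.3.1

[generators]
CMakeDeps
CMakeToolchain

After conan install . --build=missing, a plain find_package(fmt REQUIRED) plus target_link_libraries(app PRIVATE fmt::fmt) works in the consuming CMakeLists.txt; --build=missing is what gives the build-and-cache behaviour when no prebuilt binary matches your configuration.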
In my team we make applications using C++ as the main language, and when a new project arrives we always end up copy-pasting files from other projects as needed. It happens frequently, and we had a discussion about how to improve this.
So, in order to change that, we decided to make a single library (or several small libraries) that contains everything that is not part of the business logic itself. And we decided to use CMake for that.
But my question is whether there is a way to import this library (or these small libraries) without compiling them every time we commit a change.
For example if we have two libraries and two projects, where:
Project A depends on -> Library A and Library B
Project B depends on -> library B only
Having our source directory like this:
LIB A
    include
    src
    CMakeLists.txt
LIB B
    include
    src
    CMakeLists.txt
Project A
    include
    src
    CMakeLists.txt
Project B
    include
    src
    CMakeLists.txt
How can we set up the CMakeLists in Projects A and B so that, when we change something in Library A or B and then re-run cmake and make in Project B (for example), all of the changes appear in it? And the same for the other project?
Is it possible?
I had the same issue in the past while working on some personal project.
I can offer some suggestions, some of which use your approach towards solving the problem, and some of which don't:
Method 1 (A different approach, multiple source control repositories)
Don't split the code using different CMake files.
Instead, use a single CMake file and split the codebase into smaller repositories.
For example, all the shared utility libraries together could be a single repository, while applications A and B get a repository each.
(You could of course split the utility libraries into multiple repositories as well).
This makes sure that you don't have to hold/update/work with all the projects at once, but only those you actually need. The only downside is that it constrains the way you check out these projects, but I don't think that's an issue.
Method 2 (Same approach, using CMake's add_dependencies)
You could define dependencies on the compilation of applications A and B so that the relevant libraries are automatically built if they were updated.
Here is a link to CMake's add_dependencies manual.
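A rough sketch of this setup (target and directory names are assumptions based on the layout above):

# Top-level CMakeLists.txt tying everything together
cmake_minimum_required(VERSION 3.16)
project(workspace LANGUAGES CXX)

add_subdirectory(libA)      # defines library target libA
add_subdirectory(libB)      # defines library target libB
add_subdirectory(projectA)  # links libA and libB
add_subdirectory(projectB)  # links libB only

# projectB/CMakeLists.txt
# add_executable(projectB src/main.cpp)
# target_link_libraries(projectB PRIVATE libB)
# Because projectB links the libB target, running make in the build tree
# recompiles libB first whenever its sources have changed; add_dependencies()
# is only needed for ordering that is not already implied by linking.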
I think this question may violate some of the Q&A standards for the site, as the answer(s) I may receive could be regarded as opinion-driven. Nevertheless, here it goes...
Suppose we're working on a C++ project, using CMake to drive the build/testing/packaging process, and GTest and GMock for testing. Further suppose the structure of our project looks like this:
cool_project
|
|-- source
|   |
|   |-- module_foo
|   |   |
|   |   |-- (bunch of source files)
|   |
|   |-- module_bar
|       |
|       |-- (yet more source files)
|
|-- tests
    |
    |-- module_foo
    |   |
    |   |-- (tests for module_foo)
    |
    |-- module_bar
        |
        |-- (tests for module_bar)
This is, of course, an oversimplified situation, but you get the idea.
Now then, if these modules are libraries and every test (i.e. every directory under tests) is an executable, we need to link the latter with the former. The thing is, if these libraries are shared, the loader of course needs to find them. An obvious solution is to set the test's working directory to the library's directory, using CMake's set_property. However, if both GTest and GMock were also built as shared libraries, this won't work, as they also need to be loaded.
The solutions I came up with were:
Copy both libraries (i.e. GTest and GMock) to the module's build directory. This feels kind of stupid as the main benefit of shared libraries (i.e. to share code among programs) gets completely bypassed and we end up with several copies of these all over the build directory.
Build both GTest and GMock as static libraries instead. This means that we now end up with a copy of both libraries into every executable, which increases its size. Even though we don't have 1000 tests, this feels somehow awkward.
So, given this situation, I would like to know if anyone has ever run into it, and what path they took. (If the solution was other than the ones I mentioned, I would be happy to hear all about it.) Ideally, I'd like to be in a position in which I could run make && make test and have all the tests run, without having to run any extra script to accommodate things. Having all libraries built as static libraries does the job, but what if I'm building them as shared libraries instead? Must I build them twice? That's silly.
The other problem also runs along these lines, but I think its solution involves a redesign or a similar artifact. Let's suppose module_foo depends on a third-party library, e.g. library_baz. If module_foo links directly to library_baz, then any test on the former would need to load library_baz, even though it may be testing an unrelated functionality. Same issue arises.
Mocking seems like the right thing to do here, but somehow I feel it doesn't make much sense to refactor module_foo in order for it to talk to an interface (be it through dynamic or static polymorphism), as it doesn't need such flexibility: library_baz does the job. I suppose some people would say something like "Sure, you don't need the flexibility today, but who knows about tomorrow?". Trying to foresee every possible scenario a system may run into seems counter-intuitive to me, but then again, there are people out there with far more experience than me.
Any thoughts?
It seems I was trying to kill a mosquito by using a nuclear missile.
The solution I came up with was to simply build all libraries as static objects when testing. True, I end up with pretty big binaries, but it's not the case that I'll be distributing those.
So, to summarize:
Both GTest and GMock are built as static libraries.
Same goes for the libraries that contain the functionality I'm testing.
Tests then link against those, and thus can be run without messing with the working directory at all.
There are no significant drawbacks to this setup. Whenever I want to give the entire system a try, I simply switch to shared libraries.
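A small sketch of that switch, assuming the add_library() calls don't hard-code STATIC or SHARED (names are placeholders):

# module_foo/CMakeLists.txt - the library type follows BUILD_SHARED_LIBS
add_library(module_foo src/foo.cpp)
target_include_directories(module_foo PUBLIC include)

# Test configuration: everything static, nothing for the loader to find
#   cmake -S . -B build-test -DBUILD_SHARED_LIBS=OFF
# Regular configuration: shared libraries as usual
#   cmake -S . -B build -DBUILD_SHARED_LIBS=ON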
The way I see this done (at least on Windows; I don't develop on *nix) is quite independent of any testing:
Simply put, all binary build artifacts and dependencies that are required at run time are copied into (or created directly in) a ./bin directory.
Then you can execute any executable from this ./bin directory and all shared libraries are in place there.
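In CMake, one way to get that layout is to route all runtime output into a common bin directory near the top of the top-level CMakeLists.txt; a small sketch:

# Executables and DLLs (and, on *nix, shared libraries) end up side by side,
# so programs and tests can be run straight from <build>/bin.
set(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/bin)
set(CMAKE_LIBRARY_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/bin)
set(CMAKE_ARCHIVE_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/lib)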
Imagine an overall project with several components:
basic
io
web
app-a
app-b
app-c
Now, let's say web depends on io which depends on basic, and all those things are in one repo and have a CMakeLists.txt to build them as shared libraries.
How should I set things up so that I can build the three apps, if each of them is optional and may not be present at build time?
One idea is to have an empty "apps" directory in the main repo and we can clone whichever app repos we want into that. Our main CMakeLists.txt file can use GLOB to find all the app directories and build them (not knowing in advance how many there will be). Issues with this approach include:
Apparently CMake doesn't re-glob when you just say make, so if you add a new app you must run cmake again.
It imposes a specific structure on the person doing the build.
It's not obvious how one could make two clones of a single app and build them both separately against the same library build.
The general concept is like a traditional recursive CMake project, but where the lower-level modules don't necessarily know in advance which higher-level ones will be using them. Yet, I don't want to require the user to install the lower-level libraries in a fixed location (e.g. /usr/local/lib). I do however want a single invocation of make to notice changed dependencies across the entire project, so that if I'm building an app but have changed one of the low-level libraries, everything will recompile appropriately.
My first thought was to use the CMake import/export target feature.
Have a CMakeLists.txt for basic, io and web and one CMakeLists.txt that references those. You could then use the CMake export feature to export those targets and the application projects could then import the CMake targets.
When you build the library project first the application projects should be able to find the compiled libraries automatically (without the libraries having to be installed to /usr/local/lib) otherwise one can always set up the proper CMake variable to indicate the correct directory.
When doing it this way, a make in the application project won't trigger a make in the library project; you would have to take care of that yourself.
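A rough sketch of the export/import side (target names, file names and paths are placeholders):

# In web/CMakeLists.txt, after creating the target, write an import file
# into the build tree (no installation step needed):
add_library(web src/web.cpp)               # sources are placeholders
target_link_libraries(web PUBLIC io)
export(TARGETS web FILE ${CMAKE_BINARY_DIR}/WebTargets.cmake)
# (basic and io are exported the same way, since web's import file refers
# to them by name.)

# In an application project's CMakeLists.txt:
# include(/path/to/libs-build/BasicTargets.cmake)
# include(/path/to/libs-build/IoTargets.cmake)
# include(/path/to/libs-build/WebTargets.cmake)
# add_executable(app-a main.cpp)
# target_link_libraries(app-a PRIVATE web)  # imported target from the export

As noted above, rebuilding the libraries remains a separate step; the application project only consumes the already-built binaries.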
Have multiple CMakeLists.txt.
Many open-source projects take this approach (LibOpenJPEG, LibPNG, Poppler, etc.). Take a look at their CMakeLists.txt to find out how they've done this.
Basically allowing you to just toggle features as required.
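For example, a sketch of such toggles using the component names from the question, so each app can be switched on or off at configure time:

option(BUILD_APP_A "Build app-a" ON)
option(BUILD_APP_B "Build app-b" OFF)

add_subdirectory(basic)
add_subdirectory(io)
add_subdirectory(web)

if(BUILD_APP_A)
    add_subdirectory(app-a)
endif()
if(BUILD_APP_B)
    add_subdirectory(app-b)
endif()

Toggling a feature is then just a matter of re-running cmake with, say, -DBUILD_APP_A=OFF.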
I see two additional approaches. One is to simply have basic, io, and web be submodules of each app. Yes, there is duplication of code and wasted disk space, but it is very simple to implement and guarantees that different compiler settings for each app will not interfere with each other across the shared libraries. I suppose this makes the libraries not be shared anymore, but maybe that doesn't need to be a big deal in 2011. RAM and disk have gotten cheaper, but engineering time has not, and sharing of source is arguably more portable than sharing of binaries.
Another approach is to have the layout specified in the question, and have CMakeLists.txt files in each subdirectory. The CMakeLists.txt files in basic, io, and web generate standalone shared libraries. The CMakeLists.txt files in each app directory pull in each shared library with the add_subdirectory() command. You could then pull down all the library directories and whichever app(s) you wanted and initiate the build from within each app directory.
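A sketch of that second approach, assuming each app directory sits next to basic, io and web (the second argument to add_subdirectory() is required because those source directories are outside the app's own tree):

# app-a/CMakeLists.txt
cmake_minimum_required(VERSION 3.16)
project(app-a LANGUAGES CXX)

add_subdirectory(${CMAKE_CURRENT_SOURCE_DIR}/../basic basic)
add_subdirectory(${CMAKE_CURRENT_SOURCE_DIR}/../io    io)
add_subdirectory(${CMAKE_CURRENT_SOURCE_DIR}/../web   web)

add_executable(app-a main.cpp)
target_link_libraries(app-a PRIVATE web)   # assumes web links io and basic publicly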
You can use ADD_SUBDIRECTORY for this!
https://cmake.org/cmake/help/v3.11/command/add_subdirectory.html
I ended up doing what I outlined in my question, which is to check in an empty directory (containing a .gitignore file which ignores everything) and tell CMake to GLOB any directories (which are put in there by the user). Then I can just say cmake myrootdir and it finds all the various components. This works more or less OK. It does have some drawbacks, though, such as that some third-party tools like BuildBot expect a more traditional project structure, which makes integrating other tools with this sort of arrangement a little more work.
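For reference, a sketch of that arrangement (directory names follow the question; note that new app directories are only picked up when cmake is re-run):

# Top-level CMakeLists.txt: build whatever has been cloned into apps/
file(GLOB app_dirs LIST_DIRECTORIES true "${CMAKE_CURRENT_SOURCE_DIR}/apps/*")
foreach(app_dir IN LISTS app_dirs)
    if(IS_DIRECTORY "${app_dir}" AND EXISTS "${app_dir}/CMakeLists.txt")
        add_subdirectory("${app_dir}")
    endif()
endforeach()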
The CMake BASIS tool provides utilities where you can create independent modules of a project and selectively enable and disable them using the ccmake command.
Full disclosure: I'm a developer for the project.
I need some pointers/advice on how to automatically generate CMakeLists.txt files for CMake. Does anyone know of any existing generators? I've checked the ones listed in the CMake Wiki but unfortunately they are not suitable for me.
I already have a basic Python script which traverses my project's directory structure and generates the required files, but it's really "dumb" right now. I would like to augment it to take into account, for example, the different platforms I'm building for, the compiler/cross-compiler I'm using, or the different versions of the library dependencies I might have. I don't have much expert experience with CMake, and an example I could base my work on, or an already working generator, would be of great help.
I am of the opinion that you need not use an automated script for generating CMakeLists.txt, as it is a very simple task to write one once you have understood the basic procedure. I do agree that the procedure as described in the CMake Wiki is hard to follow, as it is overly detailed.
A very basic example showing how to write CMakeLists.txt is shown here, which I think will be of use to everyone, even someone who is going to write CMakeLists.txt for the first time.
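For instance, something along these lines is enough to get started (all names are placeholders):

cmake_minimum_required(VERSION 3.16)
project(hello LANGUAGES CXX)

add_executable(hello main.cpp)

# A library plus a program that uses it would only add:
# add_library(mylib src/mylib.cpp)
# target_include_directories(mylib PUBLIC include)
# target_link_libraries(hello PRIVATE mylib)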
Well, I don't have much experience with CMake either, but to perform a cross-platform build, a lot of files need to be written and modified, including the CMakeLists.txt file. I suggest you use this new tool called the ProjectGenerator Tool; it's pretty cool, it does all the extra work needed, and it makes it easy to generate such files for third-party sources with little effort.
Just read the README carefully before using it.
Link:
http://www.ogre3d.org/forums/viewtopic.php?f=1&t=54842
I think that you are doing this upside down.
When using CMake, you are supposed to write the CMakeLists.txt yourself. Typically, you don't need to handle different compilers, as CMake has knowledge about them. However, if you must, you can add code in the CMake files to do different things depending on the tool you are using.
CLion is an integrated development environment that is fully based on CMake project files.
It is able to generate the CMakeLists.txt file itself when you use "import project from sources".
However, it is quite probable that you will have to edit this file manually as your project grows and when adding external dependencies.
I'm maintaining a C++ software environment that has more than 1000 modules (shared, static libraries, programs) and uses more than 20 third parties (boost, openCV, Qt, Qwt...). This software environment hosts many programs (~50), each one picking up some libraries, programs and third parties. I use CMake to generate the makefiles and that's really great.
However, if you write your CMakeLists.txt as recommended (declaring the module as a library/program, importing source files, adding dependencies...), I agree with celavek: maintaining those CMakeLists.txt files is a real pain:
When you add a new file to a module, you need to update its CMakeLists.txt
When you upgrade a third party, you need to update the CMakeLists.txt of all modules using it
When you add a new dependency (library A now needs library B), you may need to update the CMakeLists.txt of all programs using A
When you want a global setting to be changed (a compiler setting, a predefined variable, the C++ standard used), you need to update all your CMakeLists.txt files
Then, I see two strategies to address those issues, the second likely being the one the OP has in mind.
1- Have the CMakeLists.txt files be well written and smart enough not to have a frozen behaviour, but to update themselves on the fly. That's what we have in our software environment. Each module has a standardized file organization (sources are in the src folder, includes are in the inc folder...) and simple text files to specify its dependencies (with keywords we defined, like QT to say the module needs to link with Qt). Then, our CMakeLists.txt is a two-line file that simply calls a CMake macro we wrote to automatically set up the module. As an MCVE that would be:
CMakeLists.txt:
include( utl.cmake )
add_module( "mylib", lib )
utl.cmake:
macro( add_module name what )
file(GLOB_RECURSE source_files "${CMAKE_CURRENT_SOURCE_DIR}/src/*.cpp")
include_directories(${CMAKE_CURRENT_SOURCE_DIR}/inc)
if ( what STREQUAL "lib" )
add_library( ${name} SHARED ${source_files} )
elseif ( what STREQUAL "prg" )
add_executable( ${name} ${source_files} )
endif()
# TODO: Parse the simple text files to add target_link_libraries accordingly
endmacro()
Then, for all the situations exposed above, you simply need to update utl.cmake, not the thousands of CMakeLists.txt files you have...
Honestly, we are very happy with this approach, the system becomes very easy to maintain and we can easily add new dependencies, upgrade third parties, change some build/dependency strategies...
However, there remains a lot of CMake script to be written. And the CMake script language sucks... the tool is very powerful, right, but the language's variable scoping, the cache, the painful and not-so-well-documented syntax (just to check whether a list is empty you must ask for its size and store this in a variable!), and the fact that it's not object-oriented... make it a real pain to maintain.
So, I'm now convinced the real good approach may be to:
2- Completely generate the CMakeLists.txt from a more powerful language like Python. The Python script would do things similar to what our utl.cmake does, except that it would generate a CMakeLists.txt ready to be passed to the CMake tool (with a format as proposed in HelloWorld: no variables, no functions... it would only call standard CMake commands).
I doubt such a generic tool exists, because it's hard to produce CMakeLists.txt files that will make everyone happy; you'll have to write it yourself. Note that gen-cmake does this (generates a CMakeLists.txt), but in a very primitive way, and it apparently only supports Linux; still, it may be a good starting point.
This is likely to be the v2 of our software environment...one day.
Note: Additionally, if you want to support both qmake and CMake, for instance, a well-written Python script could generate both CMakeLists and .pro files on demand!
Not sure whether this is a problem the original poster faced, but as I see plenty of "just write the CMakeLists.txt" answers above, let me briefly explain why generating CMake files may make sense:
a) I have another build system I am fairly happy with (and which covers a large multiplatform build of a big collection of interconnected shared and static libraries, programs, scripting language extensions, and tools, with various internal and external dependencies, quirks and variants)
b) Even if I were to replace it, I would not consider CMake. I took a look at CMake files and I am not happy with the syntax and not happy with the semantics.
c) CLion uses CMake files, and CMake files only (and seems somewhat interesting)
So, to give CLion a chance (I love PyCharm, so it's tempting), but to keep using my build system, I would gladly use some tool which would let me
implement
make generate_cmake
and have all necessary CMakefiles generated on the fly according to the current
info extracted from my build system. I can gladly feed the tool/script with information about which sources and headers my app consists of, which libraries and programs it is expected to build, which -I, -L, -D, etc. are expected to be set for which component, and so on.
Well, of course I would be much happier if JetBrains would allow to provide some direct protocol of feeding the IDE with the information it needs
(say, allowed me to provide my own command to compile, to run, and to
emit whatever metadata they really need - I suppose they mainly need include dirs and defines to implement on-the-fly code analysis, and library paths to set up LD_LIBRARY_PATH for the debugger), without referring to CMake. CMake files as a protocol are somewhat complicated.
Maybe this could be helpful:
https://conan.io/
The author has given some talks at CppCon about CMake and how to create modular projects using CMake. As far as I know, this tool requires CMake, so I suppose it generates the CMake files when you integrate new packages or create new packages. Recently I read something about writing a higher-level description of a C/C++ project using a YAML file, but I am not sure whether that is part of Conan or not (what I read was from the author of Conan). I have never used it, and it is still pending for me, so if you use it and it fits your needs, please comment with your opinion about it and how it fits your scenario.
I was looking for such a generator, but in the end I decided to write my own (partly because I wanted to understand how CMake works):
https://github.com/Aenteas/cmake-generator
It has a couple of additional features, such as creating Python wrappers (SWIG).
Writing a generator that suits everyone is impossible but I hope it will give you an idea in case you want to make your customized version.