compile concurrently, link serially - c++

I have a big CMake/C++/Linux project: a lot of small static libraries, a few big and interdependent static libraries, and a few big executable binaries. A single binary with debug symbols is several GB, and there are about 10 such binaries (worker, testA, testB, testC...). Compilation usually takes more time than we would like, but we have a fast build server and we use make -j20. The worst part, though, is linking. A single link takes about 60 seconds and 4 GB of RAM. But when all final binaries are linked at the same time (which happens often: when one small sublibrary is modified, there is little to recompile but a lot to relink), 10 linkers use 40 GB of RAM (for one developer; there may be more) and take a very long time. I/O is most likely the bottleneck.
We have many developers on one powerful server, and everybody uses make -j20 -l30 so that we don't overload the CPU. But we have no way to limit the number of concurrent linkers. It would be great to limit the number of running linkers globally on the server, but a per-make-invocation limit would help as well. Ideally: make -j20 -l30 --concurrent-linkers=2. Is that possible?
We use the gold linker. We are in the process of separating out smaller, independent modules, but this will take a long time.

You could try something like:
$ cat Makefile
OBJS := foo bar baz...
EXES := qux quux quuz...
.PHONY: all
all: $(OBJS)
	$(MAKE) -j $(concurrent-linkers) $(EXES)
$(OBJS): ...
	<compile>
$(EXES): ...
	<link>
And call it with:
$ make -j20 -l30 concurrent-linkers=2
Basically, it splits the build into two make invocations, one for compilation and one for linking, with different -j options. The main drawback is that all compilations must finish before the first link starts. A better solution would be to design a simple link job server (a simple shell script with a bit of flock and tag files would do) and delegate the link jobs to it. But if you can live with this...
Demo with a dummy Makefile:
$ cat Makefile
OBJS := a b c d e f
EXES := u v w x y z
.PHONY: all
all: $(OBJS)
	$(MAKE) -j $(concurrent-linkers) $(EXES)
$(OBJS) $(EXES):
	@printf 'building $@...\n' && sleep 2 && printf 'done\n' && touch $@
$ make -j20 -l30 concurrent-linkers=2
building a...
building d...
building b...
building c...
building e...
building f...
done
done
done
done
done
done
make -j 2 u v w x y z
make[1]: warning: -jN forced in submake: disabling jobserver mode.
make[1]: Entering directory 'foobar'
building u...
building v...
done
done
building w...
building x...
done
done
building y...
building z...
done
done
make[1]: Leaving directory 'foobar'
As you can see all $(OBJS) targets are built in parallel while the $(EXES) targets are built 2 (maximum) at a time.
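The "link job server" idea mentioned above could be approximated without a real server, by wrapping each link command in a function that takes one of a fixed number of lock files. This is only a sketch under assumptions: the function name with_link_slot, the variables LINK_SLOTS and LINK_LOCKDIR and the default lock directory are invented for illustration, and it relies on the flock utility from util-linux being installed.

```shell
#!/bin/sh
# with_link_slot CMD [ARGS...]
# Run CMD while holding one of $LINK_SLOTS lock files, so that at most that
# many wrapped commands run at once, even across separate make invocations.
# LINK_SLOTS and LINK_LOCKDIR are hypothetical names invented for this sketch.
with_link_slot() {
    slots=${LINK_SLOTS:-2}
    lockdir=${LINK_LOCKDIR:-/tmp/link-slots-$(id -u)}
    mkdir -p "$lockdir" || return 1

    # First pass: take any slot that is free right now, without blocking.
    i=1
    while [ "$i" -le "$slots" ]; do
        exec 9>"$lockdir/slot.$i"
        if flock -n 9; then
            "$@"
            status=$?
            exec 9>&-    # closing the descriptor releases the lock
            return "$status"
        fi
        exec 9>&-
        i=$((i + 1))
    done

    # Every slot is busy: block on one slot picked from this shell's PID.
    exec 9>"$lockdir/slot.$(( $$ % slots + 1 ))"
    flock 9
    "$@"
    status=$?
    exec 9>&-
    return "$status"
}
```

A link recipe could then run something like with_link_slot $(CXX) ... through a tiny script that sources this file and ends with with_link_slot "$@". With CMake's Makefile generators, the RULE_LAUNCH_LINK property prefixes link commands with a launcher and looks like a natural hook, although that combination is untested here. The default lock directory is per user; a server-wide limit would need one shared directory that every developer can write to.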
EDIT: If your makefile is generated by CMake, there are at least two options:
Tune your CMake files so that CMake generates two different makefiles: one for compilation and one for linking. Then write a simple wrapper makefile like:
.PHONY: myAll myCompile
myAll: myCompile
	$(MAKE) -j $(concurrent-linkers) -f Makefile.link
myCompile:
	$(MAKE) -f Makefile.compilation
Convince CMake (if it is not already the case) to generate a makefile that defines two make variables: one (OBJS) set to the list of all object files and one (EXES) set to the list of all executables. Then write a simple wrapper makefile like:
.DEFAULT_GOAL := myAll
include CMake.generated.Makefile
.PHONY: myAll
myAll: $(OBJS)
	$(MAKE) -j $(concurrent-linkers) $(EXES)
A very similar solution exists if, instead, CMake generates two phony targets, one for all object files and the other for all executables:
.DEFAULT_GOAL := myAll
include CMake.generated.Makefile
.PHONY: myAll
myAll: cmake-target-for-compilation
	$(MAKE) -j $(concurrent-linkers) cmake-target-for-link

Related

Call gnumake on all subdirs in parallel (-j) and only then run the linker-rule last (i.e. order important)

I have a c++ makefile project. It works great for non-parallel building. It works 99% for parallel building... the only problem I have is that I can't get my final executable link-line to run last (it must be the last thing that happens).
I have some constraints: I don't want to have any PHONY dependencies on my link line because this causes it to re-link every time. I.e. once my target is built, when I re-build it should not be re-linked.
Here is a (slightly contrived) minimal example. Please don't try to pick holes in it; it's really here just to show the problem. It's not real, but the problem I am showing is. You should be able to just run this and see the same issue that I do.
# Set the default goal to build.
.DEFAULT_GOAL = build
# Pretend subdirs (these don't really exist but it does not matter so long as they always try to be built)
MAKE_SUB_DIRS = 1 2 3
# Pretend shared objects that are created by the pretend makefile sub directories (above)
OUTPUTS = out1.so out2.so out3.so
# Top level build goal - depends on all of the subdir makes and the target.out
.PHONY: build
build: $(MAKE_SUB_DIRS) target.out
	@echo build finished
# Takes 1 second to build each of these pretend sub make directories. PHONY so always runs
.PHONY: $(MAKE_SUB_DIRS)
$(MAKE_SUB_DIRS):
	@if [ ! -f out$@.so ] ; then echo making $@... ; sleep 1 ; echo a > out$@.so ; fi
# The main target, pretending that it needs out1,2 and 3 to link
# Should only run when target.out does not exist
# No PHONY deps allowed here
target.out:
	@echo linking $@...
	@ls $(OUTPUTS) > /dev/null
	@cat $(OUTPUTS) > target.out
# Clean for convenience
clean:
	@rm -rf *.so target.out
Now, I don't really care about make working, what I want is make -j to work. Here is me trying to run it:
admin@osboxes:~/sandbox$ make clean
admin@osboxes:~/sandbox$
admin@osboxes:~/sandbox$ make -j - 1st attempt
making 1...
making 2...
linking target.out...
making 3...
ls: cannot access 'out1.so': No such file or directory
ls: cannot access 'out2.so': No such file or directory
ls: cannot access 'out3.so': No such file or directory
makefile:24: recipe for target 'target.out' failed
make: *** [target.out] Error 2
make: *** Waiting for unfinished jobs....
admin@osboxes:~/sandbox$
admin@osboxes:~/sandbox$ make -j - 2nd attempt
linking target.out...
build finished
admin@osboxes:~/sandbox$
admin@osboxes:~/sandbox$ make -j - 3rd attempt
build finished
admin@osboxes:~/sandbox$
So I highlighted my three attempts to run it.
Attempt 1: you can see that all 4 dependencies of build are started at (approximately) the same time. Since each of the making x... steps takes 1 second and the linking is nearly instant, we see my error. However, all three "libraries" are built correctly.
Attempt 2: The libraries only get created if they don't already exist (that's bash code, pretending to do what a makefile might have done). In this case they have already been created, so the linking now passes, since it just requires the libraries to exist.
Attempt 3: nothing happens because nothing needs to :)
So you can see all the steps are there; it's simply a matter of ordering them. I would like the make sub dirs 1, 2, 3 to build in any order in parallel, and only once they have all completed do I want target.out (i.e. the linker) to run.
I don't want to call it like this though: $(MAKE) target.out, because in my real makefile I have lots of variables all set up...
I have tried (from other answers) .NOTPARALLEL and the order-only operator | (pipe), and I have tried ordering a load of rules to get target.out to be last... but the -j option just ploughs through all of these and ruins my ordering :( ... there must be some simple way to do this?
EDIT: add an example of ways to pass variables to sub-makes. Optimized a bit by adding $(SUBDIRS) to the prerequisites of build instead of making them in its recipe.
I am not sure I fully understand your organization but one solution to deal with sub-directories is as follows. I assume, a bit like in your example, that building sub-directory foo produces foo.o in the top directory. I assume also that your top Makefile defines variables (VAR1, VAR2...) that you want to pass to the sub-makes when building your sub-directories.
VAR1 := some-value
VAR2 := some-other-value
...
SUBDIRS := foo bar baz
SUBOBJS := $(patsubst %,%.o,$(SUBDIRS))
.PHONY: build clean $(SUBDIRS)
build: $(SUBDIRS)
	$(MAKE) top
$(SUBDIRS):
	$(MAKE) -C $@ VAR1=$(VAR1) VAR2=$(VAR2) ...
top: top.o $(SUBOBJS)
	$(CXX) $(LDFLAGS) -o $@ $^ $(LDLIBS)
top.o: top.cc
	$(CXX) $(CXXFLAGS) -c $< -o $@
clean:
	rm -f top top.o $(SUBOBJS)
	for d in $(SUBDIRS); do $(MAKE) -C $$d clean; done
This is parallel safe and guarantees that the link will take place only after all sub-builds complete. Note that you can also export the variables you want to pass to sub-makes, instead of passing them on the command line:
VAR1 := some-value
VAR2 := some-other-value
...
export VAR1 VAR2 ...
Normally you would just add the lib files as prerequisites of target.out:
target.out: $(OUTPUTS)
	@echo linking $@...
The thing is, this will relink target.out if any of the output lib files are newer. Normally this is what you want (if the lib has changed, you need to relink target), but you specifically say you do not.
GNU make provides an extension called "order-only prerequisites", which you put after a |:
target.out: | $(OUTPUTS)
	@echo linking $@...
Now target.out will only be relinked if it does not exist, but in that case it will still wait until $(OUTPUTS) have finished being built.
If your $(OUTPUTS) files are built by subdirectory makes, you may find you need a rule like:
.PHONY: $(OUTPUTS)
$(OUTPUTS):
	$(MAKE) -C $$(dirname $@) $@
to invoke the recursive make, unless you have other rules that will invoke make in the subdirectories.
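The effect of the order-only prerequisite can be checked with a throwaway project. The sketch below assumes GNU make is installed as make; the function name demo_order_only and the temporary directory are illustrative, while the file names are borrowed from the question. It builds once in parallel, makes one library newer than target.out, and builds again.

```shell
#!/bin/sh
# Sketch: an order-only prerequisite delays the link until the libraries
# exist, but does not force a relink when a library becomes newer.
# demo_order_only is a hypothetical name; file names mirror the question.
demo_order_only() (
    dir=$(mktemp -d) || exit 1
    cd "$dir" || exit 1
    # "target: prereqs ; recipe" keeps each rule on one line, so no tabs are needed.
    cat > Makefile <<'EOF'
OUTPUTS := out1.so out2.so out3.so
target.out: | $(OUTPUTS) ; @echo linking $@ && cat $(OUTPUTS) > $@
$(OUTPUTS): ; @sleep 1 && echo a > $@
EOF
    make -j4 target.out    # libraries are built in parallel, then the link runs
    sleep 1
    touch out1.so          # the library is now newer than target.out
    make -j4 target.out    # nothing is relinked on this second run
    cd / && rm -rf "$dir"
)
```

Calling demo_order_only should print the "linking target.out" line only during the first make run.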
Ok, so I have found "a" solution... but it goes a little bit against what I wanted and is therefore ugly (but not that ugly):
The only way I can fathom to ensure order in a parallel build (again from other answers I read) is like this:
rule: un ordered deps
rule:
	@echo this will happen last
Here the three deps will be made (or maked?) in any order and then finally the echo line will be run.
However, the thing that I want to run last is itself a rule, and specifically so: it should check whether anything has changed or whether the file does not exist, and then, and only then, run its recipe.
The only way I know of to run a rule from within the body of another rule is to recursively call make on it. However, I get the following issues when just calling make recursively on the same makefile:
- Variables are not passed in by default
- Many of the same rules will be re-defined (not allowed or wanted)
So I came up with this:
makefile:
# Set the default goal to build.
.DEFAULT_GOAL = build
# Pretend subdirs (these don't really exist but it does not matter so long as they always try to be built)
MAKE_SUB_DIRS = 1 2 3
# Pretend shared objects that are created by the pretend makefile sub directories (above)
OUTPUTS = out1.so out2.so out3.so
# Top level build goal - depends on all of the subdir makes and the target.out
export OUTPUTS
.PHONY: build
build: $(MAKE_SUB_DIRS)
	@$(MAKE) -f link.mk target.out --no-print-directory
	@echo build finished
# Takes 1 second to build each of these pretend sub make directories. PHONY so always runs
.PHONY: $(MAKE_SUB_DIRS)
$(MAKE_SUB_DIRS):
	@if [ ! -f out$@.so ] ; then echo making $@... ; sleep 1 ; echo a > out$@.so ; fi
# Clean for convenience
clean:
	@rm -rf *.so target.out
link.mk:
# The main target, pretending that it needs out1,2 and 3 to link
# Should only run when target.out does not exist
# No PHONY deps allowed here
target.out:
	@echo linking $@...
	@ls $(OUTPUTS) > /dev/null
	@cat $(OUTPUTS) > target.out
So here I put the linker rule into a separate makefile called link.mk, this avoids recursive make calling on the same file (and therefore with re-defined rules). But I have to export all the variables I need to pass through... which is ugly and adds a bit of a maintenance overhead if those variables change.
... but... it works :)
I will not mark this any time soon, because I am hopeful some genius will point out a neater/better way to do this...
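One possible way to cut that maintenance overhead, assuming GNU make: an export directive with no variable names exports every variable (with an ordinary name) to sub-makes, so link.mk sees them without listing each one. The sketch below demonstrates this in a scratch directory; demo_export_all and the show target are hypothetical names, and OUTPUTS is taken from the makefile above.

```shell
#!/bin/sh
# Sketch: a bare "export" line passes all make variables to sub-makes, so
# link.mk sees OUTPUTS although it is never exported by name.
# demo_export_all and the "show" target are hypothetical names for this demo.
demo_export_all() (
    dir=$(mktemp -d) || exit 1
    cd "$dir" || exit 1
    # One-line "target: ; recipe" rules avoid the need for tab characters.
    cat > Makefile <<'EOF'
OUTPUTS = out1.so out2.so out3.so
export
build: ; @$(MAKE) --no-print-directory -f link.mk show
EOF
    cat > link.mk <<'EOF'
show: ; @echo "link.mk sees OUTPUTS=$(OUTPUTS)"
EOF
    make build
    cd / && rm -rf "$dir"
)
```

The trade-off is that everything is exported, including variables the sub-make does not need, so this suits small makefiles better than large ones.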

How to use makedepend in a non-standard makefile name

I am trying to use makedepend in a makefile named Makefile_abc.
Normally when I have to build a target trg, I say
make -f Makefile_abc trg
and this works beautifully.
I have added the following lines to this makefile:
dep:
	makedepend main.c
Now, when I do,
make -f Makefile_abc dep
I get the error,
makedepend: error: [mM]akefile is not present
make: *** [depend] Error 1
If I rename my makefile as Makefile, then following command works fine,
make depend
So, I am looking for a way to use makedepend on non-standard makefile names.
This is a basic 'read the manual' question.
Looking at makedepend(1), you need -fMakefile_abc in the recipe for the target dep (optionally with a space between -f and Makefile_abc):
dep:
	makedepend -fMakefile_abc main.c
To update the dependencies, you'd run:
$ make -f Makefile_abc dep
This would cause make to run:
makedepend -fMakefile_abc main.c
(Note that the 'standard' — most common — name for the target is depend rather than dep, so you'd normally run make -fMakefile_abc depend or, with a plain makefile file, make depend.)
If you're using GNU Make, you might also add another line to Makefile_abc:
.PHONY: dep # Or depend, depending…
This tells make that there won't be a file dep created by the rule.
You can often get information about how to run a command by using makedepend --help or makedepend -: — the first may (or may not) give a useful help message outlining options, and the second is very unlikely to be a valid option which should generate a 'usage' message that summarizes the options.

ARM GNU Compiler -j[jobs] option exist

I cannot find an option for the ARM GNU toolchain to compile multiple C files at the same time. I use make -j5 all the time when compiling with gcc; it speeds up compile time dramatically. It would be nice if ARM GNU had a similar option.
Here is my setup:
--Fedora 20
--Core i5
--Eclipse with ARM GNU plugin
--ARM GNU 4.8-2014-q1-update (from here: https://launchpad.net/gcc-arm-embedded)
--Target uP: STM32F205RB
I've tried to get CodeSourcery GCC working, unsuccessfully. ARM GNU seemed to work well after little setup. CodeSourcery GCC should have a -j option, as we cross compile all the time for embedded linux.
GCC is not multi-threaded. The -j<n> switch is specific to the make build system, not the compiler. It tells make how many tasks it can run in parallel.
If you run make -j4 you can observe in your task manager/top/process list that it tries to run 4 instances of GCC compiling 4 independent *.c files at the same time.
To make use of the -j switch you must have a Makefile in your project that can benefit from it. It should have multiple independent targets, so that they can be launched in parallel.
If you are lost in the terminology, I advise you to look at a make tutorial, such as this one:
http://mrbook.org/tutorials/make/
The usual strategy here is to have a separate target for every c or cpp file in your project. That way make can easily spawn a separate compiler process for each compilation unit. Once all *.o files are generated, they are linked.
Let's look at this example snippet:
SRCS := main.c func.c other.c another_file.c ...
OBJS := $(SRCS:.c=.o)
objects: $(OBJS)
%.o: %.c
	gcc -o $(@) -c $(<)
We pass a list of c files, change them to the corresponding o files using suffix substitution, and treat the list of *.o files as targets. Now make can compile each c file in parallel.
In contrast, if we do something like this:
SRCS := main.c func.c other.c another_file.c ...
all:
	gcc $(SRCS) -o a.out
...we won't benefit from -j switch at all, because there is only one target.
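Since -j belongs to make rather than to the compiler, the same switch applies to a cross toolchain as long as the makefile has per-file targets like the first snippet. A small helper for choosing the job count might look like the sketch below; job_count is a hypothetical name, and the fallback chain is just one reasonable choice.

```shell
#!/bin/sh
# Print a job count for "make -j": the number of online CPUs, or 1 if it
# cannot be determined. job_count is a hypothetical helper name.
job_count() {
    nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
}

# Example (not executed here): make -j"$(job_count)" objects
```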

Feedback about using make on a project with many subdirectories

For my Object Oriented Programming course, I must do a final project (for academic purposes). I want to do the project "the right way" (i.e. makefile, modular, DRY, easily scalable, etc.) in order to better understand classes, makefiles and C++.
The idea I've got is to have a source tree in which each subfolder holds the source files with their headers, test files and a single makefile.
So if I want to work on the interface, I go to the interface subfolder, edit the files and run the tests; if everything is OK, I simply link the objects together in my root directory. Same thing if I want to work on my data structures, and so on. The nice feature is that the object files live alongside the source code in every subfolder, so the linker in my root directory would look for the already compiled object files in the subfolders and link them together.
I've been searching on the internet, and I have seen many different solutions:
- Doing make recursively, e.g.:
SUBDIRS=eda
.PHONY: subdirs $(SUBDIRS)
$(SUBDIRS):
	$(MAKE) -C $@
The problem I found is that my prerequisites in the "eda" folder would be "quirky".
- Using the automatic variable $(@D), but I didn't quite understand how it works.
- Maybe using the wildcard function, but I am a little confused about this option.
Anyway, the most tempting solution for me was the first one (using make recursively), but I found a lot of comments saying that recursive make is not recommended (interesting article).
So I want to ask you for some advice: how can I accomplish my objectives and have every important module in a separate folder? Is recursive make the best solution? Should I dive into "automake"? Or would it be better to move all the object files to a new "object" subfolder in the root directory and then link them together?
By the way, I took the inspiration for structuring my project as a tree from browsing the Amarok source code: it has a subfolder called "src", and inside it you can see a lot of subfolders: equalizer, playlist, dynamic, statusbar, core, playlistgenerator, playlistmanager, etc. And many subfolders have their own subdirectories... and the result is an incredible music player. If this method works fine for the Amarok team... I should be able to do something similar!
Any comments, feedback, suggestions and others are welcome, thanks in advance!
EDIT #1
Beta, I have some implicit (suffix) rules and a target for the linker that needs an object from my eda folder. Every other prerequisite of this target is built in the current folder.
The problem I have is that when I run make to build that target, it takes the name of my prerequisite in the "eda" folder as a target to build with the implicit rule. That's the tricky/unclean part of recursive make in my project: I guess I must create a special implicit rule for every object file that make must look for in a subfolder.
That's why I want some feedback: are there better alternatives? Or do the advantages of using recursive make in my project outweigh the other alternatives?
Anyway, in case it gives you a better understanding, here is my draft Makefile (it is in Spanish-English :P)
# Makefile for testing the files in this folder
FLAGS=-g -DDEBUG
OUT_TI=TIndividuo
OUT_TP=TProfesor
OUT_TA=TAula
.SUFFIXES: .cpp .c .h .o
.c.o: ; cc $(FLAGS) -c $*.c
.cc.o: ; gcc $(FLAGS) -c $*.cc
.cpp.o: ; g++ $(FLAGS) -c $*.cpp
SUBDIRS=eda
.PHONY: subdirs $(SUBDIRS)
$(OUT_TI): eda/TAula.o CandidatoHorario.o TIndividuo.o TIndividuoTest.o TGen.o
	g++ CandidatoHorario.o TIndividuo.o TIndividuoTest.o TGen.o eda/TAula.o -o $@
CandidatoHorario.o: CandidatoHorario.cpp CandidatoHorario.h
TIndividuoTest.o: TIndividuoTest.cpp TIndividuo.h
TIndividuo.o: TIndividuo.cpp TIndividuo.h
TGen.o: TGen.cpp
#eda/TAula.o: eda/TAula.cpp eda/TAula.h
#	g++ -c eda/TAula.cpp -o $@
$(SUBDIRS):
	$(MAKE) -C $@
clean:
	rm -f *.o $(OUT_TI) $(OUT_TA) eda/TAula.o
The "Recursive Make Considered Harmful" is certainly a paper to read and to understand. Afterwards, your selection of tools should really be tailored to your specific projects.
For small projects that you initiate (or where you have the influence to guide high-level decisions), I would recommend spending a bit of time identifying your preferences (project layout, directory structure, unit test framework, etc.) and writing a generic set of makefiles that you will use for all your projects. You could easily end up with a generic master makefile, possibly a few more generic included makefiles for modularity (e.g. to build libraries, or unit tests or automatic dependency detection). You could also provide some extra flexibility with optional included configuration makefiles (e.g. specifying the order of your libraries). Most of the DAG construction would rely heavily on the content of your project directories. An example could look like:
include config.mk
sources := $(wildcard *.cpp)
programs := $(sources:%.cpp=%)
lib_sources := $(wildcard lib/*/*.cpp)
lib_dirs := $(sort $(patsubst %/, %, $(dir $(lib_sources:lib/%=%))))
lib_objects := $(lib_sources:lib/%.cpp=$(BUILD)/%.o)
all: $(programs:%=$(BUILD)/%)
.PRECIOUS: %/.sentinel %.d
# for dependencies on directories in build tree
%/.sentinel:
	@mkdir -p $* && touch $@
clean:
	$(RM) -r $(BUILD)
programs_aux:=$(programs)
include $(foreach program, $(programs), program.mk)
lib_dirs_aux:=$(lib_dirs)
include $(foreach lib_dir, $(lib_dirs), lib.mk)
# this should probably be in lib.mk
-include $(lib_objects:%.o=%.d)
The included program.mk (and lib.mk) would contain some boilerplate code to iterate over the lists of programs (and lists of libraries) and would factor out the specific parts of the makefile to build programs (and libraries).
To help with the implementation of such makefiles, you could use some standard library like http://gmsl.sourceforge.net.
This approach has several issues:
* it leads to makefiles that require strong skills
* it doesn't always scale very well to very large projects
* it relies heavily on "convention instead of configuration" and requires a clear upfront definition of the conventions that you will use (IMO this is good; others might think that it lacks flexibility)
* life is too short to mess around with makefiles
Otherwise, I would suggest using higher-level configuration tools such as SCons or CMake as they tend to be conceptually simpler and they also allow other flavours of generators.

Which build system will do this the most 'naturally'?

Instead of the flat structure my code currently has, I want to organize it into modules contained in sub-folders (and perhaps sub-sub folders if the modules get big enough).
Each module will have one or more translation units which will each produce a .o file.
The final target would be to mash up all these object files into a static library (for now).
I am using plain 'make' and it is already complicated enough.
Is there a system in which the specified model comes naturally or with much less effort compared to writing makefiles by hand ?
(If you are going to recommend cmake, I need some hints as I have already tried and could not come up with a good solution.)
Some paraphrased bits from my current project's makefile that may help you out with good old fashioned GNU make:
SOURCEDIR := dir1 dir2/subdir1 dir3 dir4 dir5/subdir1 dir6/subdir1
SOURCES := $(foreach srcdir,$(SOURCEDIR),$(wildcard $(srcdir)/*.c))
OBJECTS := $(patsubst %.c,build/%.o,$(SOURCES))
OBJDIRS := $(addprefix build/,$(SOURCEDIR))
MAKEDEPS := $(patsubst %.c,build/%.d,$(SOURCES))
all: example
$(OBJDIRS):
	-mkdir -p $@
build: $(OBJDIRS)
build/%.o : %.c | build
	cc -MMD -c -o $@ $<
example: $(OBJECTS)
	cc -o $@ $(OBJECTS)
-include $(MAKEDEPS)
In essence, it builds all of the source files found in the designated directories into object files located in subdirectories of the build directory in a hierarchy that parallels their source directory layout (important if multiple source files have the same name) and then links the results into an executable example.
As a bonus, dynamic dependency generation and inclusion via the MAKEDEPS variable and clang's -MMD flag.
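To see what the -MMD flag contributes, it can help to look at the file it writes next to the object file. The sketch below assumes a C compiler is available as cc; demo_mmd, hello.c and hello.h are throwaway names created only for this demonstration.

```shell
#!/bin/sh
# Sketch: show the dependency file that "cc -MMD" writes beside the object.
# demo_mmd, hello.c and hello.h are hypothetical names used only here.
demo_mmd() (
    dir=$(mktemp -d) || exit 1
    cd "$dir" || exit 1
    mkdir build
    printf '#define GREETING "hi"\n' > hello.h
    printf '#include "hello.h"\nconst char *greet(void) { return GREETING; }\n' > hello.c
    cc -MMD -c -o build/hello.o hello.c || exit 1
    cat build/hello.d    # a make rule naming hello.c and hello.h as prerequisites
    cd / && rm -rf "$dir"
)
```

The printed rule is what the -include $(MAKEDEPS) line picks up on the next run, so editing hello.h would trigger a rebuild of build/hello.o.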
It really depends on your purposes: build packages are generally intended for the audience rather than the performer. Often, they take into consideration the disparate environments into which people deploy. I played around with 'tup', which seemed more a way of generating an executable as quickly as possible after an edit. 'Premake' seems to aim at multiple platforms, but I found specifying compiler options no more enlightened than with CMake.
It looks as though you've found a good Makefile tutor, so I'll leave my observations at that.