Is there a build tool based on an inotify-like mechanism?

In relatively big projects that use plain old make, even a build in which nothing has changed takes a few tens of seconds, especially when there are many executions of make -C, each of which pays the cost of starting a new process.
The obvious solution to this problem is a build tool based on an inotify-like feature of the OS. It would watch for file changes and, based on the resulting list of changed files, recompile only those files.
Is there such machinery out there? Bonus points for open source projects.

You mean like Tup:
From the home page:
"Tup is a file-based build system - it inputs a list of file changes and a directed acyclic graph (DAG), then processes the DAG to execute the appropriate commands required to update dependent files. The DAG is stored in an SQLite database. By default, the list of file changes is generated by scanning the filesystem. Alternatively, the list can be provided up front by running the included file monitor daemon."

I am just wondering if it is stat()ing the files that takes so long. To check this here is a small systemtap script I wrote to measure the time it takes to stat() files:
# count-calls.stp
global calls, times
probe kernel.function(@1) {
    times[probefunc()] = gettimeofday_ns()
}
probe kernel.function(@1).return {
    now = gettimeofday_ns()
    delta = now - times[probefunc()]
    calls[probefunc()] <<< delta
}
And then use it like this:
$ stap -c "make -rC ~/src/prj -j8 -k" ~/tmp/count-calls.stp sys_newstat
make: Entering directory `/home/user/src/prj'
make: Nothing to be done for `all'.
make: Leaving directory `/home/user/src/prj'
calls["sys_newstat"] #count=8318 #min=684 #max=910667 #sum=26952500 #avg=3240
The project I ran it upon has 4593 source files and it takes ~27msec (26952500nsec above) for make to stat all the files along with the corresponding .d files. I am using non-recursive make though.

If you're using OSX, you can use fswatch
https://github.com/alandipert/fswatch
Here's how to use fswatch to watch for changes to a file and run make whenever it detects one:
fswatch -o anyFile | xargs -n1 -I{} make
You can run fswatch from inside a makefile like this:
watch: $(FILE)
	fswatch -o $^ | xargs -n1 -I{} make
(Of course, $(FILE) is defined inside the makefile.)
make can now watch for changes in the file like this:
> make watch
You can watch another file by overriding FILE on the command line:
> make watch FILE=anotherFile

Install inotify-tools and write a few lines of bash to invoke make when certain directories are updated.
As a side note, recursive make scales badly and is error prone. Prefer non-recursive make.
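A minimal sketch of those few lines, assuming inotify-tools is installed and provides inotifywait. The function name rebuild_on_change and the src directory in the usage comment are made up for illustration. inotifywait exits after the first matching event, so the loop runs the build once per detected change.

```shell
#!/bin/sh
# Sketch only: rerun a build command whenever something under a directory changes.
# Assumes inotifywait (from inotify-tools) is on PATH.
# rebuild_on_change is an illustrative name, not part of any tool.
rebuild_on_change() {
    dir=$1
    shift
    # inotifywait blocks until one event arrives and then exits 0;
    # -r watches recursively, -q suppresses the startup messages.
    while inotifywait -q -r -e close_write,create,delete,move "$dir" >/dev/null; do
        "$@" || echo "build failed" >&2
    done
}

# Example (blocks until interrupted):
#   rebuild_on_change src make -j8
```

Because the watches are set up again on every iteration, a change made while the build is running can be missed; inotifywait's monitor mode (-m) is one way around that, at the cost of a slightly longer script.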

The change-dependency you describe is already part of Make, but Make is flexible enough that it can be used in an inefficient way. If the slowness really is caused by the recursion (make -C commands) -- which it probably is -- then you should reduce the recursion. (You could try putting in your own conditional logic to decide whether to execute make -C, but that would be a very inelegant solution.)
Roughly speaking, if your makefiles look like this
# main makefile
foo:
	make -C bar baz
and this
# makefile in bar/
baz: quartz
	do something
you can change them to this:
# main makefile
foo: bar/quartz
	cd bar && do something
There are many details to get right, but now if bar/quartz has not been changed, the foo rule will not run.

Related

GNU recursive make - how to capture make variables to execute nested makefile

I have a bunch of qmake-generated makefiles each of which call make inside them (recursive make) in a chained manner.
After my build is over, the qmake-generated makefiles are all on disk so you'd think I could just call make on one of them if I wanted to 'replay' one particular makefile. Wrong.
When I try make-ing one, it fails, probably because there's a bunch of (environment) variables that it normally inherits from the calling makefile during the normal build.
Except for the variables, each qmake-generated makefile is pretty self-contained.
QUESTION
How can I simulate the 'normal' environment for a given recursive make so that I can call it in isolation?
I'm thinking I'd have to do something with the --print-data-base output: parse it and then call make with the same vars and values it had during the normal build.
WHY
I'm doing this because I need to modify the compile commands for ONE makefile but it's all controlled by the top-level .conf and I'm getting in way too deep.
I assume the problem is that you need to find this information before you get a chance to make any changes to the generated makefiles. Therefore, this solution is focused on shell commands (I'm assuming you're on Linux, since you don't say).
After starting your build, the first time, use something like:
ps -ef | head -1
ps -ef | grep make
to find make processes involved in the build. The PID column lists the Process ID of the make process, while the PPID column lists that of its parent. Use this information to find the top-level make process. Then, run:
strings /proc/<pid>/environ | sort > /tmp/make_env
env | sort > /tmp/normal_env
diff /tmp/normal_env /tmp/make_env
This will show you how the make process' environment differs from that of your current shell.
Now, that might not solve your problem, because GNU Make allows variables to be specified as commandline parameters. So, you should also check how it's being run:
strings /proc/<pid>/cmdline
That will print each commandline argument of <pid>, on a separate line.
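Both /proc/<pid>/environ and /proc/<pid>/cmdline are NUL-separated, and strings skips runs shorter than four characters by default, so a very short entry such as A=1 would be dropped. The sketch below splits on NUL bytes instead. It assumes a Linux /proc; the helper names nul_to_lines and env_diff are invented for this example, and values that contain newlines are not handled.

```shell
#!/bin/sh
# Print a NUL-separated file (for example /proc/<pid>/environ or
# /proc/<pid>/cmdline on Linux) with one entry per line.
nul_to_lines() {
    tr '\000' '\n' < "$1"
}

# List the variables that process $1 has but the current shell does not have
# (or has with a different value). Illustrative helper, Linux only.
env_diff() {
    pid=$1
    work=$(mktemp -d) || return 1
    nul_to_lines "/proc/$pid/environ" | sort > "$work/theirs"
    env | sort > "$work/mine"
    # comm -13 keeps only the lines unique to the second file.
    comm -13 "$work/mine" "$work/theirs"
    rm -rf "$work"
}

# Example, with the PID of the make process found via ps:
#   env_diff 12345
#   nul_to_lines /proc/12345/cmdline
```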
BTW, variables passed to GNU Make as commandline arguments override ordinary assignments to the same variable inside the makefile.
Within a makefile, you can see its environment using:
$(info My environment is $(shell env))
You can see its commandline options using:
$(info MAKEFLAGS = $(MAKEFLAGS))
If you only want to see the overrides, use MAKEOVERRIDES:
$(info MAKEOVERRIDES = $(MAKEOVERRIDES))
Finally, the targets can be seen using:
$(info MAKECMDGOALS = $(MAKECMDGOALS))

Dynamically-created 'zip' command not excluding directories properly

I'm the author of a utility that makes compressing projects using zip a bit easier, especially when you have to compress regularly, such as for updating projects submitted to an application store (like Chrome's Web Store).
I'm attempting to make quite a few improvements, but have run into an issue, described below.
A Quick Overview
My utility's command format is similar to command OPTIONS DEST DIR1 {DIR2 DIR3 DIR4...}. It works by running zip -r DEST.zip DIR1; a fairly simple process. The benefit to my utility, however, is the ability to use a predetermined file (think .gitignore) to ignore specific files/directories, or files/directories which match a pattern.
It's pretty simple -- if the "ignorefile" exists in a target directory (DIR1, DIR2, DIR3, etc), my utility will add exclusions to the zip -r DEST.zip DIR1 command using the pattern -x some_file or -x some_dir/*.
The Issue
I am running into an issue with directory exclusion, however, and I can't quite figure out why (this is probably because I am still quite the sh novice). I'll run through some examples:
Let's say that I want to ignore two things in my project directory: .git/* and .gitignore. Running command foo.zip project_dir builds the following command:
zip -r foo.zip project -x project/.git/\* -x project/.gitignore
Woohoo! Success! Well... not quite.
In this example, .gitignore is not added to the compressed output file, foo.zip. The directory, .git/*, and all of its subdirectories (and files) are added to the compressed output file.
Manually running the command:
zip -r foo.zip project_dir -x project/.git/\* -x project/.gitignore
Works as expected, of course, so naturally I am pretty puzzled as to why my identical, but dynamically-built command, does not work.
Attempted Resolutions
I have attempted a few different methods of resolving this to no avail:
Removing -x project/.git/\* from the command, and instead adding each subdirectory and file within that directory, such as -x project/.git/config -x project/.git/HEAD, etc (including children of subdirectories)
Removing the backslash before the asterisk, so that the resulting exclusion option within the command is -x project/.git/*
Bashing my head on the keyboard in angst (I'm really surprised this didn't work, it usually does)
Some notes
My utility uses /bin/sh; I would prefer to keep it that way for maximum compatibility.
I am aware of the git archive feature -- my use of .git/* and .gitignore in the above example is simply as an example; my utility is not dependent on git nor is used exclusively for projects which are git repositories.
I suspected the problem would be in the evaluation of the generated command, since you said the same command when executed directly did right.
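So, as the comment section says, I think you already found the correct solution. When the variable is expanded directly ($COMMAND), the shell only splits the result into words and applies pathname expansion; quotes and backslashes inside the string are not re-parsed, so an argument such as project/.git/\* reaches zip with the backslash still in it. Depending on the contents, arguments can also end up split or expanded in ways you did not intend.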
Yes, in that case:
eval $COMMAND
is the way to go.
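The difference can be seen in a small sketch in which show_args, a made-up stand-in for zip, prints each argument it receives. The last function shows one possible alternative that avoids eval by accumulating arguments in "$@", which works in plain /bin/sh; it is an illustration under those assumptions, not the utility's actual code.

```shell
#!/bin/sh
# Print each argument on its own line, bracketed, so that stray
# backslashes and unexpected word splits are visible.
show_args() {
    for arg in "$@"; do
        printf '<%s>\n' "$arg"
    done
}

# A command stored as a string, as in the question (show_args replaces zip).
COMMAND='show_args -x project/.git/\* -x project/.gitignore'

# Plain expansion: the string is split on whitespace, but the backslash is
# not re-parsed as quoting, so it is passed to the program literally.
$COMMAND

# eval re-parses the string as shell source, so \* becomes a literal *.
eval "$COMMAND"

# Alternative without eval: build the argument list in the positional
# parameters of a function, one exclusion at a time.
zip_args() {
    set --
    for pattern in 'project/.git/*' 'project/.gitignore'; do
        set -- "$@" -x "$pattern"
    done
    show_args "$@"
}
zip_args
```

With eval, anything in the ignore file that looks like shell syntax is executed as such, so the positional-parameter form is the more defensive choice when patterns come from user-editable files.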

How do I compile multi-file C++ programs in all subdirectories?

I have a bunch of C++ programs each in its own sub-directory. Each sub-directory has a single C++ program in several files -- a .h and a .cpp file for each class plus a main .cpp program. I want to compile each program placing the executable in the corresponding sub-directory. (I also want to run each program and redirect its output to a file that is placed in the corresponding sub-directory but if I can get the compilation to work, I shouldn't have a problem figuring out this part.)
I'm using the bash shell on a UNIX system (actually the UNIX emulator Cygwin that runs on top of Windows).
I've managed to find on the web a short script for compiling one-file programs in the current directory, but that's as far as I've gotten. That script is as follows.
for f in *.cpp;
    do g++ -Wall -O2 "$f" -o "${f/.cpp/}";
done;
I would really appreciate it if someone could help me out. I need to do this task on average once every two weeks (more like 8 weeks in a row, then not for 8 weeks, etc.)
Unless you're masochistic, use makefiles instead of shell scripts.
Since (apparently) each executable depends on all the .h and .cpp files in the same directory, the makefiles will be easy to write -- each will have something like:
whatever.exe: x.obj y.obj z.obj
	g++ -o whatever.exe x.obj y.obj z.obj
You can also add a target in each to run the resulting executable:
run:
	whatever.exe
With that you'll use make run to run the executable.
Then you'll (probably) want a makefile in the root directory that recursively makes the target in each subdirectory, then runs each (as described above).
This has a couple of good points -- primarily that it's actually built for this kind of task, so it actually does it well. Another is that it takes note of the timestamps on the files, so it only rebuilds the executables that actually need it (i.e., where at least one of the files that executable depends on has been modified since the executable itself was built).
Assuming you have a directory all of whose immediate subdirectories are all c++ programs, then use some variation on this...
for D in */; do
    cd "$D";
    # then either call make or call your g++
    # with whatever arguments in here
    # or nest that script you found online if it seems to
    # be doing the trick for you.
    cd ../;
done;
That will move in to each directory, do its thing (whatever you want that to be) and then move back out.
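One variation, sketched below, runs the per-directory work in a subshell, so a failed cd or a failed build cannot leave the loop in the wrong directory. The function name for_each_subdir, and the prog and output.txt names in the usage comment, are invented for this example.

```shell
#!/bin/sh
# Run the given command inside every immediate subdirectory of the
# current directory. Illustrative helper, not from the original answer.
for_each_subdir() {
    for d in */; do
        # With no subdirectories the pattern stays literal; skip it.
        [ -d "$d" ] || continue
        # The parentheses start a subshell: its cd does not affect the loop.
        ( cd "$d" && "$@" ) || echo "failed in $d" >&2
    done
}

# Examples:
#   for_each_subdir make
#   for_each_subdir sh -c 'g++ -Wall -O2 *.cpp -o prog && ./prog > output.txt'
```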

Best practice for dependencies on #defines?

Is there a best practice for supporting dependencies on C/C++ preprocessor flags like -DCOMPILE_WITHOUT_FOO? Here's my problem:
> setenv COMPILE_WITHOUT_FOO
> make <Make system reads environment, sets -DCOMPILE_WITHOUT_FOO>
<Compiles nothing, since no source file has changed>
What I would like to do is have all files that rely on #ifdef statements get recompiled:
> setenv COMPILE_WITHOUT_FOO
> make
g++ FileWithIfdefFoo.cpp
What I do not want is to have to recompile everything if the value of COMPILE_WITHOUT_FOO has not changed.
I have a primitive Python script working (see below) that basically writes a header file FooDefines.h and then diffs it to see if anything is different. If it is, it replaces FooDefines.h and then the conventional source file dependency takes over. The define is not passed on the command line with -D. The disadvantage is that I now have to include FooDefines.h in any source file that uses the #ifdef, and also I have a new, dynamically generated header file for every #ifdef. If there's a tool to do this, or a way to avoid using the preprocessor, I'm all ears.
import os, sys

def makeDefineFile(filename, text):
    tmpDefineFile = "/tmp/%s%s"%(os.getenv("USER"),filename) #Use os.tempnam?
    existingDefineFile = filename
    output = open(tmpDefineFile,'w')
    output.write(text)
    output.close()
    status = os.system("diff -q %s %s"%(tmpDefineFile, existingDefineFile))

    def checkStatus(status):
        failed = False
        if os.WIFEXITED(status):
            #Check return code
            returnCode = os.WEXITSTATUS(status)
            failed = returnCode != 0
        else:
            #Caught a signal, coredump, etc.
            failed = True
        return failed,status

    #If we failed for any reason (file didn't exist, different, etc.)
    if checkStatus(status)[0]:
        #Copy our tmp into the new file
        status = os.system("cp %s %s"%(tmpDefineFile, existingDefineFile))
        failed,status = checkStatus(status)
        print(failed, status)
        if failed:
            print("ERROR: Could not update define in makeDefine.py")
            sys.exit(status)
This is certainly not the nicest approach, but it would work:
find . \( -name '*.cpp' -o -name '*.h' \) -exec grep -l COMPILE_WITHOUT_FOO {} \; | xargs touch
That will look through your source code for the macro COMPILE_WITHOUT_FOO, and "touch" each file, which will update the timestamp. Then when you run make, those files will recompile.
If you have ack installed, you can simplify this command:
ack -l --cpp COMPILE_WITHOUT_FOO | xargs touch
I don't believe that it is possible to determine automagically. Preprocessor directives don't get compiled into anything. Generally speaking, I expect to do a full recompile if I depend on a define. DEBUG being a familiar example.
I don't think there is a right way to do it. If you can't do it the right way, then the dumbest way possible is probably your best option: a text search for COMPILE_WITHOUT_FOO, creating dependencies that way. I would classify this as a shenanigan, and if you are writing shared code I would recommend seeking pretty significant buy-in from your coworkers.
CMake has some facilities that can make this easier. You would create a custom target to do this. You may trade problems here though, maintaining a list of files that depend on your symbol. Your text search could generate that file if it changed though. I've used similar techniques checking whether I needed to rebuild static data repositories based on wget timestamps.
Cheetah is another tool which may be useful.
If it were me, I think I'd do full rebuilds.
Your problem seems tailor-made to treat it with autoconf and autoheader, writing the values of the variables into a config.h file. If that's not possible, consider reading the "-D" directives from a file and writing the flags into that file.
Under all circumstances, you have to avoid builds that depend on environment variables only. You have no way of telling when the environment changed. There is a definite need to store the variables in a file. The cleanest way would be autoconf, autoheader and one source tree with multiple build trees; the second-cleanest way is re-configure-ing for each switch of compile context; and the third-cleanest way is a file containing all mutable compiler switches, on which all objects dependent on these switches depend themselves.
When you choose to implement the third way, remember not to update this file unnecessarily, e.g. by constructing it in a temporary location and copying it conditionally on diff, and then make rules will be capable of conditionally rebuilding your files depending on flags.
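A minimal sketch of that conditional copy, assuming the flags are produced on standard input. The function name write_if_changed and the file name build_flags.txt in the usage comment are hypothetical. The target keeps its old timestamp whenever the content is identical, so make sees no reason to rebuild.

```shell
#!/bin/sh
# Write standard input to the file named by $1, but only if the content
# differs from what is already there. Illustrative helper.
write_if_changed() {
    target=$1
    tmp=$(mktemp) || return 1
    cat > "$tmp"
    if cmp -s "$tmp" "$target" 2>/dev/null; then
        :   # identical: leave the target and its timestamp alone
    else
        # different or missing: rewrite the target, which updates its timestamp
        cat "$tmp" > "$target"
    fi
    rm -f "$tmp"
}

# Example: objects that use the macro would list build_flags.txt as a prerequisite.
#   printf '%s\n' "-DCOMPILE_WITHOUT_FOO=$COMPILE_WITHOUT_FOO" | write_if_changed build_flags.txt
```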
One way to do this is to store each #define's previous value in a file, and use conditionals in your makefile to force update that file whenever the current value doesn't match the previous. Any files which depend on that macro would include the file as a dependency.
Here is an example. It will update file.o if either file.c changed or the variable COMPILE_WITHOUT_FOO is different from last time. It uses $(shell ) to compare the current value with the value stored in the file envvars/COMPILE_WITHOUT_FOO. If they are different, then it creates a command for that file which depends on force, which is always updated.
file.o: file.c envvars/COMPILE_WITHOUT_FOO
	gcc -DCOMPILE_WITHOUT_FOO=$(COMPILE_WITHOUT_FOO) -c $< -o $@
ifneq ($(strip $(shell cat envvars/COMPILE_WITHOUT_FOO 2> /dev/null)), $(strip $(COMPILE_WITHOUT_FOO)))
force: ;
envvars/COMPILE_WITHOUT_FOO: force
	echo "$(COMPILE_WITHOUT_FOO)" > envvars/COMPILE_WITHOUT_FOO
endif
If you want to support having macros undefined, you will need to use the ifdef or ifndef conditionals, and have some indication in the file that the value was undefined the last time it was run.
Jay pointed out that "make triggers on date time stamps on files".
Theoretically, you could have your main makefile, call it m1, include variables from a second makefile called m2. m2 would contain a list of all the preprocessor flags.
You could have a make rule for your program depend on m2 being up-to-date.
The rule for making m2 would be to import all the environment variables (and thus the #define values).
The trick would be that the rule for making m2 would detect whether there was a diff from the previous version. If so, it would enable a variable that would force a "make all" and/or make clean for the main target. Otherwise, it would just update the timestamp on m2 and not trigger a full remake.
Finally, the rule for the normal target (make all) would source in the preprocessor directives from m2 and apply them as required.
This sounds easy/possible in theory, but in practice this type of stuff is much harder to get working in GNU Make. I'm sure it can be done though.
make triggers on date time stamps on files. A dependent file being newer than what depends on it triggers it to recompile. You'll have to put your definition for each option in a separate .h file and ensure that those dependencies are represented in the makefile. Then if you change an option the files dependent on it would be recompiled automatically.
If it takes into account include files that include files you won't have to change the structure of the source. You could include a "BuildSettings.h" file that included all the individual settings files.
The only tough problem would be if you made it smart enough to parse the include guards. I've seen problems with compilation because of include file name collisions and order of include directory searches.
Now that you mention it I should check and see if my IDE is smart enough to automatically create those dependencies for me. Sounds like an excellent thing to add to an IDE.

Can I have one makefile to build a hierarchical project?

I have several hundred files in a non-flat directory structure. My Makefile lists each sourcefile, which, given the size of the project and the fact that there are multiple developers on the project, can create annoyances when we forget to put a new one in or take out the old ones. I'd like to generalize my Makefile so that make can simply build all .cpp and .h files without me having to specify all the filenames, given some generic rules for different types of files.
My question: given a large number of files in a directory with lots of subfolders, how do I tell make to build them all without having to specify each and every subfolder as part of the path? And how do I make it so that I can do this with only one Makefile in the root directory?
EDIT: this almost answers my question, but it requires that you specify all filenames :\
I'm sure a pure-gmake solution is possible, but using an external command to modify the makefile, or generate an external one (which you include in your makefile) is probably much simpler.
Something along the lines of:
all: myprog
find_sources:
	zsh -c 'for x in **/*.cpp; echo "myprog: ${x/.cpp/.o}" >> deps.mk'
include deps.mk
and run
make find_sources && make
note: the exact zsh line probably needs some escaping to work in a make file, e.g. $$ instead of $. It can also be replaced with bash + find.
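A find-based variant might look like the sketch below, run from the project root. It keeps the myprog target name from the example above; the function name gen_deps is invented, and file names containing spaces are not handled (make itself copes poorly with them).

```shell
#!/bin/sh
# Print one "myprog: path/to/file.o" line for every .cpp file found below
# the current directory. Illustrative helper.
gen_deps() {
    find . -type f -name '*.cpp' |
        sed -e 's|^\./||' -e 's|\.cpp$|.o|' -e 's|^|myprog: |' |
        LC_ALL=C sort
}

# Example: regenerate the include file from scratch, then build.
#   gen_deps > deps.mk && make
```

Writing the file with > rather than appending with >> avoids accumulating duplicate lines across runs.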
One way that would be platform independent (I mean independent from shell being in Windows or Linux) is this:
DIRS = relative/path1\
       relative/path2
dd = absolute/path/to/subdirectories
all:
	@$(foreach dir, $(DIRS), $(MAKE) -C $(dd)$(dir) build -f ../../Makefile ;)
build:
	... build here
note that spaces and also the semicolon are important here, also it is important to specify the absolute paths, and also specify the path to the appropriate Makefile at the end (in this case I am using only one Makefile on grandparent folder)
But there is a better approach too which involves PHONY targets, it better shows the progress and errors and stops the build if one folder has problem instead of proceeding to other targets:
.PHONY: subdirs $(DIRS)
subdirs: $(DIRS)
$(DIRS):
	$(MAKE) -C $@ build -f ../../Makefile
all : prepare subdirs
	...
build :
	... build here
Again I am using only one Makefile here that is supposed to be applicable to all sub-projects. For each sub-project in the grandchild folder, the target "build" is created using one Makefile in the root.
I would start by using a combination of the wildcard function:
http://www.gnu.org/software/make/manual/make.html#Wildcard-Function
VPATH/vpath
http://www.gnu.org/software/make/manual/make.html#Selective-Search
and the file functions
http://www.gnu.org/software/make/manual/make.html#File-Name-Functions
For exclusion (i.e. backups, as Jonathan Leffler mentioned), use a separate folder not in the vpath for backups, and use good implicit rules.
You will still need to define which folders to look in, but not each file in them.
I'm of two minds on this one. On one hand, if your Make system compiles and links everything it finds, you'll find out in a hurry if someone has left conflicting junk in the source directories. On the other hand, non-conflicting junk will proliferate and you'll have no easy way of distinguishing it from the live code...
I think it depends on a lot of things specific to your shop, such as your source control system and whether you plan to ever have another project with an overlapping code base. That said, if you really want to compile every source file below a given directory and then link them all, I'd suggest simple recursion: to make objects, compile all source files here, add the resultant objects (with full paths) to a list in the top source directory, recurse into all directories here. To link, use the list.