Best practice for dependencies on #defines? - c++

Is there a best practice for supporting dependencies on C/C++ preprocessor flags like -DCOMPILE_WITHOUT_FOO? Here's my problem:
> setenv COMPILE_WITHOUT_FOO
> make <Make system reads environment, sets -DCOMPILE_WITHOUT_FOO>
<Compiles nothing, since no source file has changed>
What I would like to do is have all files that rely on #ifdef statements get recompiled:
> setenv COMPILE_WITHOUT_FOO
> make
g++ FileWithIfdefFoo.cpp
What I do not want to is have to recompile everything if the value of COMPILE_WITHOUT_FOO has not changed.
I have a primitive Python script working (see below) that basically writes a header file FooDefines.h and then diffs it to see if anything is different. If it is, it replaces FooDefines.h and then the conventional source file dependency takes over. The define is not passed on the command line with -D. The disadvantage is that I now have to include FooDefines.h in any source file that uses the #ifdef, and also I have a new, dynamically generated header file for every #ifdef. If there's a tool to do this, or a way to avoid using the preprocessor, I'm all ears.
import os, sys
def makeDefineFile(filename, text):
tmpDefineFile = "/tmp/%s%s"%(os.getenv("USER"),filename) #Use os.tempnam?
existingDefineFile = filename
output = open(tmpDefineFile,'w')
output.write(text)
output.close()
status = os.system("diff -q %s %s"%(tmpDefineFile, existingDefineFile))
def checkStatus(status):
failed = False
if os.WIFEXITED(status):
#Check return code
returnCode = os.WEXITSTATUS(status)
failed = returnCode != 0
else:
#Caught a signal, coredump, etc.
failed = True
return failed,status
#If we failed for any reason (file didn't exist, different, etc.)
if checkStatus(status)[0]:
#Copy our tmp into the new file
status = os.system("cp %s %s"%(tmpDefineFile, existingDefineFile))
failed,status = checkStatus(status)
print failed, status
if failed:
print "ERROR: Could not update define in makeDefine.py"
sys.exit(status)

This is certainly not the nicest approach, but it would work:
find . -name '*cpp' -o -name '*h' -exec grep -l COMPILE_WITHOUT_FOO {} \; | xargs touch
That will look through your source code for the macro COMPILE_WITHOUT_FOO, and "touch" each file, which will update the timestamp. Then when you run make, those files will recompile.
If you have ack installed, you can simplify this command:
ack -l --cpp COMPILE_WITHOUT_FOO | xargs touch

I don't believe that it is possible to determine automagically. Preprocessor directives don't get compiled into anything. Generally speaking, I expect to do a full recompile if I depend on a define. DEBUG being a familiar example.
I don't think there is a right way to do it. If you can't do it the right way, then the dumbest way possible is probably the your best option. A text search for COMPILE_WITH_FOO and create dependencies that way. I would classify this as a shenanigan and if you are writing shared code I would recommend seeking pretty significant buy in from your coworkers.
CMake has some facilities that can make this easier. You would create a custom target to do this. You may trade problems here though, maintaining a list of files that depend on your symbol. Your text search could generate that file if it changed though. I've used similar techniques checking whether I needed to rebuild static data repositories based on wget timestamps.
Cheetah is another tool which may be useful.
If it were me, I think I'd do full rebuilds.

Your problem seems tailor-made to treat it with autoconf and autoheader, writing the values of the variables into a config.h file. If that's not possible, consider reading the "-D" directives from a file and writing the flags into that file.
Under all circumstances, you have to avoid builds that depend on environment variables only. You have no way of telling when the environment changed. There is a definitive need to store the variables in a file, the cleanest way would be by autoconf, autoheader and a source and multiple build trees; the second-cleanest way by re-configure-ing for each switch of compile context; and the third-cleanest way a file containing all mutable compiler switches on which all objects dependant on these switches depend themselves.
When you choose to implement the third way, remember not to update this file unnecessarily, e.g. by constructing it in a temporary location and copying it conditionally on diff, and then make rules will be capable of conditionally rebuilding your files depending on flags.

One way to do this is to store each #define's previous value in a file, and use conditionals in your makefile to force update that file whenever the current value doesn't match the previous. Any files which depend on that macro would include the file as a dependency.
Here is an example. It will update file.o if either file.c changed or the variable COMPILE_WITHOUT_FOO is different from last time. It uses $(shell ) to compare the current value with the value stored in the file envvars/COMPILE_WITHOUT_FOO. If they are different, then it creates a command for that file which depends on force, which is always updated.
file.o: file.c envvars/COMPILE_WITHOUT_FOO
gcc -DCOMPILE_WITHOUT_FOO=$(COMPILE_WITHOUT_FOO) $< -o $#
ifneq ($(strip $(shell cat envvars/COMPILE_WITHOUT_FOO 2> /dev/null)), $(strip $(COMPILE_WITHOUT_FOO)))
force: ;
envvars/COMPILE_WITHOUT_FOO: force
echo "$(COMPILE_WITHOUT_FOO)" > envvars/COMPILE_WITHOUT_FOO
endif
If you want to support having macros undefined, you will need to use the ifdef or ifndef conditionals, and have some indication in the file that the value was undefined the last time it was run.

Jay pointed out that "make triggers on date time stamps on files".
Theoretically, you could have your main makefile, call it m1, include variables from a second makefile called m2. m2 would contain a list of all the preprocessor flags.
You could have a make rule for your program depend on m2 being up-to-date.
the rule for making m2 would be to import all the environment variables ( and thus the #include directives ).
the trick would be, the rule for making m2 would detect if there was a diff from the previous version. If so, it would enable a variable that would force a "make all" and/or make clean for the main target. otherwise, it would just update the timestamp on m2 and not trigger a full remake.
finally, the rule for the normal target (make all ) would source in the preprocessor directives from m2 and apply them as required.
this sounds easy/possible in theory, but in practice GNU Make is much harder to get this type of stuff to work. I'm sure it can be done though.

make triggers on date time stamps on files. A dependent file being newer than what depends on it triggers it to recompile. You'll have to put your definition for each option in a separate .h file and ensure that those dependencies are represented in the makefile. Then if you change an option the files dependent on it would be recompiled automatically.
If it takes into account include files that include files you won't have to change the structure of the source. You could include a "BuildSettings.h" file that included all the individual settings files.
The only tough problem would be if you made it smart enough to parse the include guards. I've seen problems with compilation because of include file name collisions and order of include directory searches.
Now that you mention it I should check and see if my IDE is smart enough to automatically create those dependencies for me. Sounds like an excellent thing to add to an IDE.

Related

CMake+git: check file sha rather than timestamp

As far as I know, CMake checks the time stamp of a source file to detect if it is outdated and needs to be rebuild (and with it, all files including it). When switching branches in a large git repository, this can causes problems.
Let's say I have one source folder and two build directories (build1 and build2), which correspond to two different branches (branch1 and branch2)
project
+-- src
+-- branch1_build
+-- branch2_build
Say my two branches have few differences, in few files; mostly, they only differ for some configuration option, all encapsulated in a config.h file, generated by the CONFIGURE_FILE command in cmake. The source files for the two config.h files (the config.h.in, as it is often called) are different. For instance, one branch introduces a new subfolder, which can be activated with a config-time option, which gets put in config.h.in with something like #cmakedefine HAVE_NEW_FEATURE_FOLDER. In such a scenario, when switching branches in the source folder, this happens: cmake recognizes that something changed in the config.h.in file, so it runs again; by running again, it generates a new config.h file; since config.h has a new time stamp, all files that includes it (directly or indirectly) end up being recompiled.
Now, if I alternatively switch between branch1 and branch2 in the source folder (cause I'm working on both branches every day), two consecutive make commands issued in the same build folder (either branch1_build or branch2_build) will trigger a full recompilation, since, although config.h has not changed in content, its time stamp has changed, so cmake flags it has changed.
My question is: what options do I have to avoid this? Or, better phrased, how can I avoid recompiling a source-build tree pair that is in fact unchanged since the last build, while also minimizing the changes to the source code?
The only solution I can think of is to execute CONFIGURE_FILE on config.h.in, with output config.h.tmp; compare config.h.tmp with config.h, and, only if different, copy config.h.tmp to config.h. However, this seems clumsy, and overcomplicated. I hoped cmake already had a mechanism for this, perhaps hidden under some options/variations of CONFIGURE_FILE...
Assuming this is not yet possible, I was wondering how complicated it would be for cmake to check the sha (rather than the timestamp) of a particular file when deciding whether it is outdated or not, and comparing it with the sha of a previous build (yes, the word outdated has date in it, but let's not get into enlish vocabulary discussions here). I imagine this is more expensive, so I would think that, if possible at all, this behavior should not be the default, and the user should use sparingly this feature, by explicitly tagging a file as check_sha_not_time kind of file. In the example above, the user would tag config.h as check_sha_not_time, and avoid recompilation of pretty much the whole library.
Note 0: I know little of how cmake internally works, so my suggestion of using sha rather than timestamp could be completely crazy and/or impossible given cmake implementation. I apologize for that. But that's why one asks things here, cause he/she doesn't know, right?
Note 1: I also tried using ccache, but unsuccessfully. Perhaps I need to use some particular flag or configuration option in ccache to trigger this capability.
Note 2: I want to avoid duplicating the source folder.

Setting and using path to data directory with GNU AutoTools

I am trying to use GNU AutoTools for my C++ project. I have written configure.ac, makefile.am etc. I have some files that are used by the program during execution e.g. template files, XML schema etc. So, I install/copy these files along the executable, for which I use something like:
abcdir = $(bindir)/../data/abc/
abc_DATA = ../data/knowledge/abc.cc
Now it copies the file correctly and My program installation structure looks somethings as follows:
<installation_dir>/bin/<executableFile>
<installation_dir>/data/abc/abc.cc
Now the problem is that in the source code I actually use these files (abc.cc etc.) and for that I need path of where these files resides to open them. One solution is to define (using AC_DEFINE) some variable e.g. _ABC_PATH_ that points to the path of installation but how to do that exactly?. OR is there any better way to do that. For example, in source code, I do something like:
...
ifstream input(<path-to-abc-folder> + "abc.cc"); // how to find <path-to-abc-folder>?
..
The AC_DEFINE solution is fine in principle, but requires shell-like variable expansion to take place. That is, _ABC_PATH_ would expand to "${bindir}/../data/abs", not /data/abc.
One way is to define the path via a -D flag, which is expanded by make:
myprogram_CPPFLAGS += -D_ABC_PATH='\"${abcdir}\"'
which works fine in principle, but you have to make include config.status in the dependencies of myprogram.
If you have a number of such substitution variables, you should roll out a paths.h file that is
generated by automake with a rule like:
paths.h : $(srcdir)/paths.h.in config.status
sed -e 's:#ABC_PATH#:${abcdir}:' $< > $#
As a side-note, you do know about ${prefix} and ${datarootdir} and friends, don't you? If not, better read them up; ${bindir}/.. is not necessarily equal to ${prefix} if the user did set ${exec_prefix}.

Does "make" know how to search sub-dirs for include files?

This is a question for experienced C/C++ developpers.
I have zero knowledge of compiling C programs with "make", and need to modify an existing application, ie. change its "config" and "makefile" files.
The .h files that the application needs are not located in a single-level directory, but rather, they are spread in multiple sub-directories.
In order for cc to find all the required include files, can I just add a single "-I" switch to point cc to the top-level directory and expect it to search all sub-dirs recursively, or must I add several "-I" switches to list all the sub-dirs explicitely, eg. -I/usr/src/myapp/includes/1 -I/usr/src/myapp/includes/2, etc.?
Thank you.
This question appears to be about the C compiler driver, rather than make. Assuming you are using GCC, then you need to list each directory you want searched:
gcc -I/foo -I/foo/bar myprog.c
This is actually a compiler switch, unrelated to make itself.
The compiler will search for include files in the built-in system dirs, and then in the paths you provide with the -I switch. However, no automatic sub-directory traversal is made.
For example, if you have
#include "my/path/to/file.h"
and you give -I a/directory as a parameter, the compiler will look for a/directory/my/path/to/file.h.
If the makefiles are written in the usual way, the line that invokes the compiler will use a couple of variables that allow you to customize the details, e.g. not
gcc (...)
but
$(CC) $(CFLAGS) (...)
and if this is the case, and you're lucky, you don't even need to edit any of the makefiles; instead you can invoke make like this
make CFLAGS='-I /absolute-path/to/wherever'
to incorporate your special options into the compiler invocation.
Also check whether the Makefiles aren't generated by something else (usually, a script in the top directory called
configure
which will have options of its own to control what goes into them).
everyone answered your question correctly. but something to consider when you get to setup your own source tree.... a leaf node should only look 2 places for headers, in its own directory or up the tree. once people start going across to peers and down the tree, the build system will get gnarly, but what also happens is folks with start using private interfaces when they should be using public interfaces

Help with rake dependency mapping

I'm writing a Rakefile for a C++ project. I want it to identify #includes automatically, forcing the rebuilding of object files that depend on changed source files. I have a working solution, but I think it can be better. I'm looking for suggestions for:
Suggestions for improving my function
Libraries, gems, or tools that do the work for me
Links to cool C++ Rakefiles that I should check out that do similar things
Here's what I have so far. It's a function that returns the list of dependencies given a source file. I feed in the source file for a given object file, and I want a list of files that will force me to rebuild my object file.
def find_deps( file )
deps = Array.new
# Find all include statements
cmd = "grep -r -h -E \"#include\" #{file}"
includes = `#{cmd}`
includes.each do |line|
dep = line[ /\.\/(\w+\/)*\w+\.(cpp|h|hpp)/ ]
unless dep.nil?
deps << dep # Add the dependency to the list
deps += find_deps( dep )
end
end
return deps
end
I should note that all of my includes look like this right now:
#include "./Path/From/Top/Level/To/My/File.h" // For top-level files like main.cpp
#include "../../../Path/From/Top/To/My/File.h" // Otherwise
Note that I'm using double quotes for includes within my project and angle brackets for external library includes. I'm open to suggestions on alternative ways to do my include pathing that make my life easier.
Use the gcc command to generate a Make dependency list instead, and parse that:
g++ -M -MM -MF - inputfile.cpp
See man gcc or info gcc for details.
I'm sure there are different schools of thought with respect to what to put in #include directives. I advise against putting the whole path in your #includes. Instead, set up the proper include paths in your compile command (with -I). This makes it easier to relocate files in the future and more readable (in my opinion). It may sound minor, but the ability to reorganize as a project evolves is definitely valuable.
Using the preprocessor (see #greyfade) to generate the dependency list has the advantage that it will expand the header paths for you based on your include dirs.
Update: see also the Importing Dependencies section of the Rakefile doc for a library that reads the makefile dependency format.

Can I have one makefile to build a hierarchical project?

I have several hundred files in a non-flat directory structure. My Makefile lists each sourcefile, which, given the size of the project and the fact that there are multiple developers on the project, can create annoyances when we forget to put a new one in or take out the old ones. I'd like to generalize my Makefile so that make can simply build all .cpp and .h files without me having to specify all the filenames, given some generic rules for different types of files.
My question: given a large number of files in a directory with lots of subfolders, how do I tell make to build them all without having to specify each and every subfolder as part of the path? And how do I make it so that I can do this with only one Makefile in the root directory?
EDIT: this almost answers my question, but it requires that you specify all filenames :\
I'm sure a pure-gmake solution is possible, but using an external command to modify the makefile, or generate an external one (which you include in your makefile) is probably much simpler.
Something along the lines of:
all: myprog
find_sources:
zsh -c 'for x in **/*.cpp; echo "myprog: ${x/.cpp/.o}" >> deps.mk'
include deps.mk
and run
make find_sources && make
note: the exact zsh line probably needs some escaping to work in a make file, e.g. $$ instead of $. It can also be replaced with bash + find.
One way that would be platform independent (I mean independent from shell being in Windows or Linux) is this:
DIRS = relative/path1\
relative/path2
dd = absolute/path/to/subdirectories
all:
#$(foreach dir, $(DIRS), $(MAKE) -C $(dd)$(dir) build -f ../../Makefile ;)
build:
... build here
note that spaces and also the semicolon are important here, also it is important to specify the absolute paths, and also specify the path to the appropriate Makefile at the end (in this case I am using only one Makefile on grandparent folder)
But there is a better approach too which involves PHONY targets, it better shows the progress and errors and stops the build if one folder has problem instead of proceeding to other targets:
.PHONY: subdirs $(DIRS)
subdirs: $(DIRS)
$(DIRS):
$(MAKE) -C $# build -f ../../Makefile
all : prepare subdirs
...
build :
... build here
Again I am using only one Makefile here that is supposed to be applicable to all sub-projects. For each sub-project in the grandchild folder the target "build" is created usinf one Makefile in the root.
I would start by using a combination of the wildcard function:
http://www.gnu.org/software/make/manual/make.html#Wildcard-Function
VPATH/vpath
http://www.gnu.org/software/make/manual/make.html#Selective-Search
and the file functions
http://www.gnu.org/software/make/manual/make.html#File-Name-Functions
For exclusion (ie: backups, as Jonathan Leffler mentioned), use a seperate folder not in the vpath for backups, and use good implicit rules.
You will still need to define which folders to do to, but not each file in them.
I'm of two minds on this one. On one hand, if your Make system compiles and links everything it finds, you'll find out in a hurry if someone has left conflicting junk in the source directories. On the other hand, non-conflicting junk will proliferate and you'll have no easy way of distinguishing it from the live code...
I think it depends on a lot of things specific to your shop, such as source source control system and whether you plan to ever have another project with an overlapping code base. That said, if you really want to compile every source file below a given directory and then link them all, I'd suggest simple recursion: to make objects, compile all source files here, add the resultant objects (with full paths) to a list in the top source directory, recurse into all directories here. To link, use the list.