Automake: rebuild source files if the Makefile changes - C++

When using an autoconf/automake build system, if the compiler flags or other variables in a Makefile.am (or at an even higher level, such as configure.ac) change, the C++ source files associated with that Makefile will not be automatically rebuilt. This matters especially because we use automake as part of a continuous build system that only recompiles as needed.
My thought was to include Makefile as a dependency for the .o files, which would theoretically solve the above issue. So, a couple of questions:
First, is it possible to add a rule like that? I would prefer not to have to add the custom rule to every single Makefile.am, so something that could be placed in a top-level file (like configure.ac) would be great.
Second, the downside to this approach is that in some cases a change to the Makefile will not actually affect compilation, so I will end up rebuilding when it is not really needed. I'm willing to live with this (or at least try it to see how painful it is) in exchange for a better guarantee that my builds are correct, but is there a better way to solve this problem? I believe clearmake solves it by saving the actual compiler command (along with other dependencies) and comparing the current command with the previous one to determine whether a file needs to be regenerated.
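For concreteness, the rule I have in mind would look something like this in a Makefile.am (a hypothetical sketch: myprog stands in for a real program name, $(myprog_OBJECTS) is the object list automake derives from myprog_SOURCES, and extra rules in Makefile.am are copied verbatim into the generated Makefile):

# Sketch: force every object of "myprog" to be rebuilt whenever the
# generated Makefile (and thus any flag it contains) changes.
$(myprog_OBJECTS): Makefile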

If you use ccache (./configure CXX='ccache g++', or just add ccache's g++ to the path), spurious rebuilds should be very cheap and still safe. Also make sure never to use the AM_MAINTAINER_MODE autoconf macro, which makes dependency tracking optional (conditional on the --enable-maintainer-mode flag).
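For example, either of the following enables ccache for an autoconf build (a sketch, assuming ccache is installed; /usr/lib/ccache is where Debian-style systems keep the masquerade symlinks, so adjust the path for your platform):

# Option 1: name ccache explicitly at configure time.
./configure CXX='ccache g++'
# Option 2: put ccache's compiler symlinks first in PATH (masquerade mode).
export PATH=/usr/lib/ccache:$PATH
./configure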

Related

How does ccache improve build speed?

I am using CMake to set up my project, and when I change a file in the project, I find that the generated build knows to recompile only the changed file and then relink everything together into the final executable/library.
I then read through the documentation for ccache, and what I don't understand is: what is the difference between ccache's approach (using a hash to check whether a file has changed and needs recompiling) and the default approach CMake uses (or whatever it is, other than CMake itself, that checks for file updates, but you know what I mean)? Maybe the PCH part is different, but CMake 3.18 now comes with PCH support, so does that mean the benefit ccache provides for PCH is no longer unique?
Consider the case where you switch to some older branch of your project - one that you did compile in the past and that ccache has cached, but that CMake sees as "almost all files have changed and must be recompiled". That's where you see a massive gain.
Another situation is where you have deleted your build directory (for some good reason) and now have to rebuild everything. ccache is also a huge help there.
Also, ccache is trivial to set up and is thereafter completely invisible/transparent, so there really is no reason not to use it. When it helps, it usually helps a lot; when it does not help, it doesn't hurt.
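For instance, with CMake the usual hookup is the compiler-launcher setting (available since CMake 3.4; a sketch, assuming ccache is on your PATH):

cmake -DCMAKE_C_COMPILER_LAUNCHER=ccache -DCMAKE_CXX_COMPILER_LAUNCHER=ccache ..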
cmake/gmake and ccache are not exclusive to each other. They are typically used together.
ccache comes into play when the entire source tree needs to be rebuilt for some reason. cmake/gmake rebuilds only changed files, but there are situations where the entire source tree needs to be recompiled. And if this happens repeatedly, ccache will wake up and short-circuit the compiler. C++ compilers are notorious for being slow, and this often helps quite a bit.
Just a couple of examples: suppose you need to switch repeatedly between building with and without optimizations. cmake/gmake won't help you when you edit the makefile and adjust the compilation flags: none of the source files actually changed, so cmake/gmake doesn't think there's anything to do, and you must explicitly make clean and recompile from scratch.
If you are doing this repeatedly, ccache will avoid having to run the compiler on the entire source code and will simply fetch the appropriate object modules instead of compiling the sources from scratch.
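A sketch of that scenario, assuming ccache fronts the compiler and your makefile takes its flags from CXXFLAGS (ccache hashes the compiler command line, so each flag combination gets its own cache entries):

make clean && make CXXFLAGS="-O0 -g"   # first time: cache misses, real compiles
make clean && make CXXFLAGS="-O2"      # first time: cache misses, real compiles
make clean && make CXXFLAGS="-O0 -g"   # now cache hits: objects come from ccache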
Another common situation is when you're running a script to prepare an installable package for your code. This typically involves using an implementation-specific tool to rebuild the source code, from scratch, into an installable package.

Is there a way to perform an atomic CMake build?

I'm considering reimplementing our build system (currently based on GNU Make) in CMake.
Disclaimer: this is more of a theoretical and "best practices" question. I don't know CMake in-depth. Also, please feel free to migrate the question to programmers if it's more on-topic there.
As far as I understand, the standard workflow for CMake is
cmake .
make
I suspect there may be problems of de-synchronization of CMake files and Makefiles.
So, during the usual development process you're supposed to run make, to avoid unnecessary regeneration of CMakeCache and the Makefiles and generally keep the process more straightforward. But then, if you add, say, a new source file to CMakeLists and run make, it'll be using the old CMakeCache and Makefiles and will not regenerate them automatically. I think this may cause major problems when used at scale, since in case something is not building as it should, you'll have to try a make clean, and then, if that doesn't help, you'll need to remove CMakeCache and regenerate everything (manually!).
If I'm not right about something of the above, please correct me.
I'd like to just do
awesome-cmake
and have it update everything that needs updating and build the project.
So, the question: is there a way to make an "atomic build" with CMake, so that it tracks all the required information and abstracts away the usage of make?
I think you have a couple of incorrect ideas here:
I suspect there may be problems of de-synchronization of CMake files and Makefiles.
Ultimately, CMake is all about producing correct Makefiles (or Visual Studio solution files, or XCode project files, or whatever). Unless you modify a generated Makefile by hand, there can be no synchronisation issue between CMake and the Makefile since CMake generates the Makefile.
But then, if you add, say, a new source file to CMakeLists and run make, it'll be using the old CMakeCache and Makefiles and will not regenerate them automatically.
Actually, the opposite is true: if you modify the CMakeLists.txt (e.g. adding a new source, changing a compiler flag, adding a new dependency) then running make will trigger a rerun of CMake automatically. CMake will read in its previously cached values (which includes any command line args previously given to CMake) and generate an updated Makefile.
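You can see this for yourself with the Makefile generator (a sketch of the sequence):

touch CMakeLists.txt   # simulate an edit to the CMake input
make                   # make notices CMakeLists.txt is newer, reruns cmake, then builds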
in case something is not building as it should, you'll have to try a make clean, and then, if that doesn't help, you'll need to remove CMakeCache and regenerate everything (manually!)
Yes, this would be a pretty normal workflow if something has gone wrong. However, things don't often get that bad in my experience.
So, the question: is there a way to make an "atomic build" with CMake, so that it tracks all the required information and abstracts away the usage of make?
Given that running make will cause CMake to "do the right thing", i.e. rerun if required, I guess that using make is as close to an "atomic build" as possible.
One thing to beware of here is the use of file(GLOB ...) or similar to generate a list of source files. From the docs:
We do not recommend using GLOB to collect a list of source files from your source tree. If no CMakeLists.txt file changes when a source is added or removed then the generated build system cannot know when to ask CMake to regenerate.
In other words, if you do use file(GLOB ...) to gather a list of sources, you need to get into the habit of rerunning CMake after adding/removing a file from your source tree; running make won't trigger a rerun of CMake in this situation.
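To illustrate the difference (a sketch; the target and file names are made up):

# Risky: adding src/new.cpp later changes no CMakeLists.txt,
# so running make will not trigger a rerun of CMake.
file(GLOB app_SOURCES "src/*.cpp")
add_executable(app ${app_SOURCES})

# Safer: list sources explicitly; editing this file is itself
# what tells the build system to rerun CMake.
add_executable(app src/main.cpp src/util.cpp)

(Newer CMake, 3.12 and later, also offers file(GLOB ... CONFIGURE_DEPENDS ...), which adds a check at build time, though the documentation still discourages relying on globbing.)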
The standard workflow for CMake is an out of source build
mkdir build
cd build
cmake ..
make

When should I delete object files before compiling?

In compile scripts I generally see a call to always delete object files before compiling. Does this slow down the build process? Is it really necessary with compilers that check if the object files are out of date when deciding to recompile them?
Sometimes if you revert a source file to a previous version, the .o will have a newer date than its source, and therefore won't be fed to the compiler. If you had a reason to revert the source file, you almost surely want the object rebuilt. Doing a clean build ensures you get what you think you're getting.
In a way, deleting existing object files defeats the purpose of having separate translation units in the first place. Your standard build environment should normally only rebuild those object files which are older than the corresponding source file. (You don't need to delete; you can just overwrite.)
If you have a decent revision control system, then even checking out an older version of a modified file will make the actual file on disk have a current timestamp, but indeed, if you're worried that something might be inconsistent, you can always clean up the entire build tree and start over. But as a matter of normal code writing, it would appear terribly wasteful to delete object files.
You should of course keep one set of object files for each set of build options (e.g. debug vs release). Some build environments allow you to have multiple output directories, others (like cmake) will just automatically rebuild everything if you change the global build settings, but that's something to watch out for, especially if you just add some #defines to the compiler flags in the middle of a build process.
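One way to keep the sets separate is one build tree per configuration (a sketch using out-of-source CMake builds; the directory names are arbitrary):

mkdir -p build/debug build/release
(cd build/debug   && cmake -DCMAKE_BUILD_TYPE=Debug   ../.. && make)
(cd build/release && cmake -DCMAKE_BUILD_TYPE=Release ../.. && make)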
1) Yes, and 2) no, but it's not the compiler that watches out for that; it's the build system (an IDE or a well-written makefile).

Automatic build ID

We're looking for a way to include some sort of build ID automatically in our builds. This needs to be portable (VC++, g++ on Linux and Mac) and automatic. VC++ is what matters most, since in the other environments we use custom Python build scripts so I can do whatever I want.
We use SVN, so we were looking at using the output of svnversion to write the revision to a header and include it. This has problems: if we put the file in SVN, it will appear as modified every time, and committing it would be superfluous, in a sense generating an infinite loop of ever-increasing revisions. If we don't put the file in SVN and just create it as a pre-build step, the sources wouldn't be complete, as they'd need the pre-build step or Makefile to generate that file.
We could also use __DATE__, but we can't guarantee that the file that uses __DATE__ (i.e. writes it to a log file) will be compiled when some other file is modified - unless we "touch" it, but then we'd cause the project to always be out of date. We could touch it as the pre-build step, so it would get touched only when the rest of the project is out of date, thus not causing a spurious compile; but if VC++ computes the dependencies before the pre-build step runs, this wouldn't work (the file with __DATE__ wouldn't get compiled).
Any interesting ideas?
We're using the output of svnversion, written to a header file and included. We omit the file from the repository and create it in a pre-build step; this has worked quite well for us. (I'm not sure why you object to using a pre-build step?)
We're currently using a Perl script to convert svnversion's output into a header file; I later found out that TortoiseSVN includes a subwcrev command (which has also been ported to Linux) that can do much of the same thing.
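The pre-build step itself can be tiny (a sketch in POSIX shell; version.h is the generated, uncommitted header, and the cmp guard avoids rewriting an unchanged file, which would cause spurious rebuilds):

echo "#define BUILD_REVISION \"$(svnversion -n .)\"" > version.h.tmp
if cmp -s version.h.tmp version.h; then rm version.h.tmp; else mv version.h.tmp version.h; fi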
If you don't like the idea of an include file that is not in source control yet is required for a build, consider a batch file or other build step that programmatically creates the file/include by calling svnversion within your build process.
Basically, GENERATE the file so you don't have an unversioned yet required file.
EDIT
Josh's subwcrev is probably the best idea.
Before that was implemented I wrote my own hacky tool to do the same thing - do replacement in a template file.
It could be as simple as:
% make CPPFLAGS="-DBUILD_NUMBER=`svnlook youngest /path/to/repo`"
I'd look at SvnRev. You can use it as a custom pre-build step in VS, or call it from a makefile, or whatever else you need to do, and it generates a header file that you can include in your other files that will give you what you need. There's good documentation on the site.
SubWCRev is another option, though the Linux port is newer, and I don't know that a Mac version exists. It's very useful on Windows for .NET (which I'm guessing isn't an issue for you, but I'm adding this for future reference), because it allows you to create a template file that can be used to generate, for example, the Properties file for a .NET assembly.
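The template mechanism works roughly like this (a sketch; the file names are made up). If version.h.in contains the line #define BUILD_REVISION "$WCREV$", then running

SubWCRev . version.h.in version.h

produces a version.h in which SubWCRev has replaced $WCREV$ with the working-copy revision (other keywords such as $WCDATE$ work the same way).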
Automatic builds can typically be full, clean builds. In that case, you start in a clean directory and there would be no issue with __DATE__ in any case. Otherwise, see Paul Beckinham's idea.
Why not tie a GUID to it? Almost every language has support for generating one, and if yours doesn't, there are a lot of algorithms for that around.
(Although, if you do use subversion, I personally like Josh's idea better!)

Is there any way to prevent Boost.Build from recursively scanning header files for #include directives?

Is there a way to limit the header files that Boost.Build recursively scans for #include directives to a particular directory or set of directories? I.e., I'd like it to recursively scan the header files within my project only. I know that the external dependencies are not going to change (and, being Boost and Qt, they're pretty big). I end up with around 50,000 targets in the dependency tree, which takes a while to process (resulting in a 1-2 minute build time even if no files have actually changed).
The only solution I've found so far is to take advantage of the INCLUDE environment variable (I'm using MSVC) - this means Boost.Build need not be informed of the include paths (via the <include> feature) and hence will not scan them. This seems a bit of a hack.
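Concretely, the hack looks something like this (a sketch for Windows/MSVC; the paths are made up):

rem Let cl.exe find the external headers through INCLUDE, so they never
rem appear as include paths that Boost.Build would scan:
set INCLUDE=C:\external\boost;C:\external\Qt\include;%INCLUDE%
bjam toolset=msvc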
I feel like I must be missing something obvious because I haven't been able to find other people experiencing similar problems, even though I ran into this almost immediately. The closest I've come is here.
Judging from the debug output (bjam -d 3), it also scans most of the header files more than once... I don't know whether this means they are added as dependencies more than once, but certainly the cost of loading each file and scanning its entire contents must add up?
If I could tell it not to bother scanning a particular directory or set of directories in which I can guarantee the header files are not going to change, that would be perfect.
This question was also posted on the Boost mailing list and we got an answer to it here: http://lists.boost.org/boost-build/2009/04/21734.php.
So it seems so far that the answer is that, at least out of the box, Boost.Build doesn't have this feature, and the solution is to customise Boost.Build to your needs, which makes a certain amount of sense.
However, I am still curious as to why this is not a more common problem for people. I see that caching the dependencies would reduce the time, but surely if we scan all external libraries we end up with a huge dependency tree, much of it redundant? When I'm working on a project, I'm not going to change the third party libraries very often at all, it seems a shame to pay for dependency checking on them.
You might want to check out alternative build tools like SCons.
SCons has a mode --implicit-cache where it caches implicit dependencies. That should help in the scenario you described.
Here's an extract from the man page.
--implicit-cache
Cache implicit dependencies. This causes scons to use the implicit (scanned) dependencies from the last time it was run instead of scanning the files for implicit dependencies. This can significantly speed up SCons, but with the following limitations:
scons will not detect changes to implicit dependency search paths (e.g. CPPPATH, LIBPATH) that would ordinarily cause different versions of same-named files to be used.
scons will miss changes in the implicit dependencies in cases where a new implicit dependency is added earlier in the implicit dependency search path (e.g. CPPPATH, LIBPATH) than a current implicit dependency with the same name.
--implicit-deps-changed
Forces SCons to ignore the cached implicit dependencies. This causes the implicit dependencies to be rescanned and recached. This implies --implicit-cache.
--implicit-deps-unchanged
Force SCons to ignore changes in the implicit dependencies. This causes cached implicit dependencies to always be used. This implies --implicit-cache.
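Typical invocations, matching the options above (a sketch):

scons --implicit-cache           # reuse the dependencies scanned on the last run
scons --implicit-deps-changed    # force a rescan and recache
scons --implicit-deps-unchanged  # trust the cache even if dependencies changed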