I am using cmake to set up my project, and when I change a file in the project, I've found that the build knows to recompile only the changed file and then relink everything together for the final executable/lib.
I then read through the documentation for ccache, and what I don't understand is: what is the difference between ccache's approach (which uses hash values to check whether a file has changed and needs recompiling) and the default approach that cmake uses (or whatever it is underneath cmake that actually checks for file updates, but you know what I mean)? Maybe the PCH part is different, but cmake 3.18 now comes with PCH support, so does that mean the benefit ccache provides on the PCH side is no longer unique?
Consider the case where you switch to some older branch of your project - that you did compile in the past and that ccache has cached, but CMake sees as "almost all files have changed and must be recompiled" - that's where you see a massive gain.
Another situation is where you have deleted your build directory (for some good reason) and now have to rebuild everything. ccache is also a huge help there.
Also, ccache is trivial to set up and is thenceforth completely invisible / transparent, so there really is no reason not to use it. When it helps, it usually helps a lot; when it doesn't help, it doesn't hurt.
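For example, with CMake a common way to enable it is through the compiler-launcher variables. A minimal sketch, assuming ccache is installed and on your PATH (works with the Makefile and Ninja generators):

    # Prefix every compiler invocation with ccache.
    cmake -S . -B build \
          -DCMAKE_C_COMPILER_LAUNCHER=ccache \
          -DCMAKE_CXX_COMPILER_LAUNCHER=ccache
    cmake --build build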
cmake/gmake and ccache are not mutually exclusive. They are typically used together.
ccache comes into play when the entire source tree needs to be rebuilt for some reason. cmake/gmake rebuilds only changed files, but there are situations where the entire source tree needs to be recompiled. And if this happens repeatedly, ccache will wake up and short-circuit the compiler. C++ compilers are notorious for being slow, and this often helps quite a bit.
Just a couple of examples: say you need to switch repeatedly between building with and without optimizations. cmake/gmake won't help you when you edit the makefile and adjust the compilation flags: none of the source files actually changed, so cmake/gmake doesn't think there's anything to do, and you must explicitly make clean and recompile from scratch.
If you are doing that repeatedly, ccache avoids having to run the compiler on the entire source tree, and simply fetches the appropriate object modules from its cache instead of compiling the sources from scratch.
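A rough illustration of that flag-flipping case, assuming a plain Makefile project whose compiler is invoked through ccache (e.g. via the setup sketched earlier):

    make clean
    make CXXFLAGS="-O0 -g"   # full rebuild; ccache stores every object it produces
    make clean
    make CXXFLAGS="-O2"      # full rebuild again; these objects are cached separately,
                             # because the compiler flags are part of ccache's hash
    make clean
    make CXXFLAGS="-O0 -g"   # "full rebuild", but ccache serves the objects from cache
    ccache -s                # show hit/miss statistics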
Another common situation is when you're running a script to prepare an installable package for your code. This typically involves using an implementation-specific tool to rebuild the source code, from scratch, into an installable package.
Related
Is there a way to avoid a full recompile after you check out a branch, do some editing, then check out the branch you were on previously?
It looks like the build system detects that files have been swapped around and demands a full recompilation, despite the fact that those are the same files that were compiled previously. Any way to avoid that?
UPD: I should probably point out that I am using the Visual C++ compiler.
You did not specify what kind of branch you are checking out. In case you are checking out one vastly different from your initial one, e.g. master vs gh-pages on GitHub, the timestamps on the source files will be newer than those of the corresponding binary files. In that case, the following should help:
1) If you are using a GNU make based build system, execute make -t. This marks all targets as up-to-date by setting their modification timestamps to the current time.
2) ccache can decrease the recompile time of exactly the same source code by orders of magnitude. At least on Linux it is also extremely easy to set up and use. A combined sketch of both suggestions follows below.
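A combined sketch of both suggestions, assuming a GNU make based build (the branch name is made up for illustration):

    git checkout feature-branch   # back to the branch you had already built
    make -t                       # "touch": mark all targets up to date without
                                  # running their recipes
    make                          # from here on, only files you actually edit
                                  # get recompiled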
I'm wondering if there's a macro or a simple way to let the compiler increment the major, minor, or revision number of my code each time I compile.
By the way, I'm using the ARM compiler and uVision from Keil.
Setting the version is not really a compiler topic. It should be done in connection with a source control / version control system like CVS, SVN, git, or others. Your build id should be tied to the contents of your source code repository, so that you get reproducible builds from checkouts of your version control system. Or, if your code is not yet committed to the repository, a dirty tag should be provided and compiled in, to give the user of the software a chance to see that this is not a controlled version.
Simply counting up a value in a variable can be done by a Makefile or in pre-/post-build steps, depending on the IDE you use. Sorry, I have no experience with Keil...
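As a sketch of tying the version to the repository state (assuming git and a make based build, which may or may not match a Keil project; the macro name is made up):

    # Derive an identifier from the repository; "-dirty" is appended automatically
    # when there are uncommitted changes in the working tree.
    GIT_VERSION := $(shell git describe --tags --always --dirty)

    # Pass it to the compiler so the code can embed it, e.g. in a version banner.
    CFLAGS += -DGIT_VERSION=\"$(GIT_VERSION)\"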
Define a post-build event that runs a small external program. The program has to modify a specific .h file. In that header file, define macros like VER_MAJOR, VER_MINOR, VER_BUILD. A date/time string can also be updated. I use this method and can control the version numbers as I wish.
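The generated header might look something like this (just a sketch; the macro names follow the ones above, the values and the timestamp macro are illustrative):

    /* version.h - rewritten by the post-build tool; do not edit by hand. */
    #ifndef VERSION_H
    #define VERSION_H

    #define VER_MAJOR     1
    #define VER_MINOR     4
    #define VER_BUILD     137                    /* incremented by the tool */
    #define VER_TIMESTAMP "2024-01-31 14:02"     /* optional date/time string */

    #endif /* VERSION_H */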
IMO, you do not need to do this, and especially not increment a number every time you compile your code.
Set the major and minor revision manually in a header file; you should not have to do this often.
The build number should only be related to the source control revision number (i.e. you should be able to build and rebuild any revision under source control).
Imagine you are a team of 5 developers, and everyone builds and rebuilds on their side: what is the actual build number?
Do they all update a header file?
Who is responsible for owning that header file?
Some compilers do support features like a "post-build" step, which runs a program of your choice after compiling, but that would be tricky if your program is built from multiple source files. Not all compilers support this, though.
For that reason, I wouldn't actually do this sort of thing via the compiler. I'd do it in the build script (e.g. makefile) or by configuring the build settings in your IDE.
Assuming you're using make or similar, add a target named something like setversion (you pick the name) that runs a program which modifies the header file specifying the components of your version number. Typing make setversion will then update your version numbers.
Optionally, that target can also - after updating the version numbers - do a make clean (i.e. delete all object files and executables) and make all (to recompile and link everything).
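A minimal sketch of such a target, assuming GNU make (bump_version is a hypothetical helper script that rewrites the version header; recipe lines must be indented with tabs):

    .PHONY: setversion
    setversion:
    	./bump_version version.h   # hypothetical script that updates the macros
    	$(MAKE) clean              # optional: force a full rebuild so every object
    	$(MAKE) all                # picks up the new version number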
I also suggest avoiding changing version numbers after each recompile. Imagine you're busily testing and debugging code and go through several rebuild cycles. Do you really want the version number updated every time you recompile even one source file? It can be done that way if you choose, but it will make every rebuild take longer (and, in projects with multiple source files, you will need to take care if you want to preserve the ability to do incremental builds).
In compile scripts I generally see a call to always delete object files before compiling. Does this slow down the build process? Is it really necessary with compilers that check if the object files are out of date when deciding to recompile them?
Sometimes, if you revert a source file to a previous version, the .o will have a newer date than its source, and therefore won't be fed to the compiler. If you had a reason to revert the source file, you almost surely want the object rebuilt. Doing a clean build ensures you get what you think you're getting.
In some way, deleting existing object files defeats the purpose of having separate translation units in the first place. Your standard build environment should normally only rebuild those object files which are older than the corresponding source file. (You don't need to delete, you can just overwrite.)
If you have a decent revision control system, then even checking out an older version of a modified file will make the actual file on disk have a current timestamp, but indeed, if you're worried that something might be inconsistent, you can always clean up the entire build tree and start over. But as a matter of normal code writing, it would appear terribly wasteful to delete object files.
You should of course keep one set of object files for each set of build options (e.g. debug vs release). Some build environments allow you to have multiple output directories, others (like cmake) will just automatically rebuild everything if you change the global build settings, but that's something to watch out for, especially if you just add some #defines to the compiler flags in the middle of a build process.
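One common way to keep those sets separate is one out-of-source build directory per configuration, e.g. with CMake (a sketch):

    # Separate build trees, so debug and release objects never overwrite each other.
    cmake -S . -B build-debug   -DCMAKE_BUILD_TYPE=Debug
    cmake -S . -B build-release -DCMAKE_BUILD_TYPE=Release

    cmake --build build-debug      # debug objects live only in build-debug/
    cmake --build build-release    # release objects live only in build-release/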
1) Yes, and 2) no, but it's not the compiler that watches out for that, it's the build system (an IDE or well-written makefile).
When using an autoconf/automake build system, if the compiler flags or other variables in a Makefile.am (or even at a higher level, like configure.ac) change, the C++ source files associated with that Makefile will not be automatically rebuilt. This becomes especially important because we use automake as part of a continuous build system that only recompiles as needed.
My thought was to include the Makefile as a dependency of the .o files, which would theoretically solve the above issue. So a couple of questions:
First, is it possible to add a rule like that? I would prefer to not have to add that custom rule to every single Makefile.am, so something that could be placed into a top-level file (like configure.ac) would be great.
Second, the downside to this approach is that in some cases the change to the Makefile did not actually affect the compilation so I will end up rebuilding when it is not really needed. I guess I'm willing to live with this (or at least try it to see how painful it is) to have a better guarantee that my builds will be correct, but is there a better way to solve this problem? I believe clearmake solves this by saving the actual compiler command (along with other dependencies) then comparing the current command with the previous to determine if a file needs to be regenerated.
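For what it's worth, the idea written out in plain (non-automake) make terms would be something like this (just a sketch; the object and file names are made up, and whether there is a cleaner automake-level way to express it is exactly what I'm asking):

    OBJS = foo.o bar.o

    # Make every object depend on the Makefile itself, so a regenerated Makefile
    # (new CXXFLAGS, etc.) forces a recompile of all objects.
    $(OBJS): Makefile

    %.o: %.cpp
    	$(CXX) $(CXXFLAGS) -c $< -o $@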
If you use ccache (./configure CXX='ccache g++', or just add ccache's g++ to the path), spurious rebuilds should be very cheap and still safe. Also make sure never to use the AM_MAINTAINER_MODE autoconf macro, which makes dependency tracking optional (conditional on the --enable-maintainer-mode flag).
I've heard that enabling Link-Time Code Generation (the /LTCG switch) can be a major optimization for large projects with lots of libraries to link together. My team is using it in the Release configuration of our solution, but the long compile-time is a real drag. One change to one file that no other file depends on triggers another 45 seconds of "Generating code...". Release is certainly much faster than Debug, but we might achieve the same speed-up by disabling LTCG and just leaving /O2 on.
Is it worth it to leave /LTCG enabled?
It is hard to say, because that depends mostly on your project - and of course the quality of the LTCG provided by VS2005 (which I don't have enough experience with to judge). In the end, you'll have to measure.
However, I wonder why the extra duration of the release build is such a problem for you. You should only hand out reproducible, stable, versioned binaries with reproducible or archived sources. I've rarely seen a reason for frequent, incremental release builds.
The recommended setup for a team is this:
Developers typically create only incremental debug builds on their machines. Building a release should be a complete build from source control to redistributable (binaries or even setup), with a new version number and labeling/archiving the sources. Only these should be given to in-house testers / clients.
Ideally, you would move the complete build to a separate machine, or maybe a virtual machine on a good PC. This gives you a stable environment for your builds (includes, 3rd party libraries, environment variables, etc.).
Ideally, these builds should be automated ("one click from source control to setup"), and should run daily.
It allows the linker to do the actual compilation of the code, and therefore it can do more optimization such as inlining.
If you don't use LTCG, the compiler is the only component in the build process that can inline a function, that is, replace a "call" to a function with the contents of the function itself, which is usually a lot faster. The compiler will only do so for functions where this actually yields an improvement.
It can therefore only inline functions whose body it can see. This means that if a function in a cpp file calls another function which is not implemented in the same cpp file (or in a header file that is included), the compiler doesn't have the actual body of that function and therefore cannot inline it.
But if you use LTCG, it's the linker that does the inlining, and it has all the functions in all of the cpp files of the entire project, minus referenced lib files that were not built with LTCG. This gives the linker (which effectively becomes the compiler) a lot more to work with.
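A tiny made-up example of that cross-file case (file and function names are purely illustrative):

    // math_utils.cpp
    int square(int x) { return x * x; }   // the body lives only in this translation unit

    // main.cpp
    int square(int x);                     // declaration only - no body visible here

    int main() {
        // Without LTCG the compiler must emit a real call here, because it cannot
        // see the body of square() while compiling main.cpp. With LTCG the linker
        // sees both translation units and can inline the call.
        return square(21);
    }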
But it also makes your build take longer, especially when making incremental changes. You might want to turn LTCG on only in your release build configuration.
Note that LTCG is not the same as profile-guided optimization.
I know the guys at Bungie used it for Halo3, the only con they mentioned was that it sometimes messed up their deterministic replay data.
Have you profiled your code and determined the need for this? We actually run our servers almost entirely in debug mode, but special-case a few files that profiled as performance critical. That's worked great, and has kept things debuggable when there are problems.
Not sure what kind of app you're making, but breaking up data structures to correspond to the way they were processed in code (for better cache coherency) was a much bigger win for us.
I've found the downsides are longer compile times and that the .obj files produced in that mode (with LTCG turned on) can be really massive. For example, .obj files that might otherwise be 200-500 KB were about 2-3 MB. It just so happened that compiling a bunch of projects in my chain led to a 2 GB folder, the bulk of which was .obj files.
I also don't see problems with extra compilation time using link-time code generation with the release build. I only build my release version once per day (overnight), and use the unit-test and debug builds during the day.