GCC Dependency Tracking: Is -M better than -MM? - c++

There are basically two options for tracking dependencies: -M and -MM. The difference is that -MM omits system headers and the headers they include.
My question: why would anyone want to use -M? It inflates the generated .d files drastically, since a system header usually pulls in a large set of other system headers. In addition, system headers cannot be built by make, so having them as dependencies yields no benefit. The only small benefit I can see is that, if a required system header is missing, make reports the missing header instead of gcc reporting it. But what is the benefit of that?
To sum up: I see no reason why -M would be useful at all. Am I missing something? What scenarios require one to use -M over -MM?
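For concreteness, given a main.c that includes <stdio.h> plus a local "util.h", the two options produce roughly the following (abridged; the exact system header list varies by platform):

$ gcc -MM main.c
main.o: main.c util.h
$ gcc -M main.c
main.o: main.c /usr/include/stdio.h /usr/include/features.h \
  ... util.h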

Most header files can't be "built" by make. They're listed as prerequisites so that if they change, then the source code that relies on them is rebuilt. For example, if you install security fix packages on your system and they modify one of the system headers you use, you may want to be sure all your code is rebuilt. These days the backward-compatibility of most base libraries is such that this is not really needed most of the time, I agree.
Also, if you're cross-compiling then your "system" header files are provided to you from the cross-target; these headers might be for an embedded system or similar, and may change (in non-backward-compatible ways) more often than a standard system.

Why would anyone want to use -M?
If the system headers change, you want a rebuild to operate accordingly. That is, if your code uses a header, and that header changes, your code should rebuild even if your code didn't change.
Listing headers as dependencies is rarely about 'building' those headers. System headers are no different.
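For context, the usual auto-dependency pattern is something like this generic Makefile fragment (a sketch, not from the question): -MD is the side-effect variant of -M that also tracks system headers, -MMD is the -MM variant that omits them, and -MP emits phony targets so a deleted header doesn't break the build. (The recipe line must be indented with a tab.)

%.o: %.c
	$(CC) $(CFLAGS) -MD -MP -c $< -o $@

-include $(wildcard *.d)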

There could be a number of reasons:
Rebuild only the necessary parts after a system library update. A rare thing, but someone may need it.
Get the full dependency list, for whatever reason - maybe just to draw a dependency graph?
Generate ctags for all potentially used header files.
...
Personally I use it for ctags (see the sketch below), so these are not all hypothetical examples.
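For instance, a pipeline along these lines (a sketch: it assumes Exuberant/Universal ctags, which reads a file list from stdin via -L -, and it only picks up .h headers - extend the grep pattern for .hpp and friends):

gcc -M main.c | tr -s ' \\' '\n' | grep '\.h$' | sort -u | ctags -a -L -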

Related

Clang precompiled headers - working with different /usr/include timestamps - perhaps by editing metadata?

I've been trying to address a compile-time problem. The infrastructure in question compiles multiple objects, each of which uses a multitude of stdlib/Boost headers. I've essentially hit a limit where simplifying the dependency tree is no longer worth the effort.
So, I tried precompiled headers - and it worked a treat! The problem I have now is fitting this into a large compute farm and CI. Specifically, not all machines were set up at the same time, so the timestamp of /usr/include/ often differs between them.
The flow we would like to have is:
build certain shared libraries first
precompile header
launch multiple jobs on different machines using the shared libraries (fine) and the precompiled header
The header is precompiled in the following way:
clang++ precompiled.hpp -o /<path>/precompiled.hpp.pch
When I use the precompiled header, depending on the timestamp of /usr/include/ on the given machine, I get the following metadata error:
fatal error: file '/usr/include/math.h' has been modified since the
precompiled header '//precompiled.hpp.pch' was built
It may sometimes be a different header too - e.g. assert.h is a common one.
So far I've tried the following:
changing -isysroot and using glibc - this exposed a variety of different problems (a can of worms I'd rather not open yet)
a hack: copying /usr/include/ elsewhere and specifying that copy earlier in the search path. Unfortunately, this doesn't work due to the use of include_next in some headers but not others, i.e. I can't consistently force the headers to be picked up from elsewhere and none from /usr/include
Any ideas on how to tackle this problem?
I am now considering an even worse hack - trying to edit the metadata of the precompiled header. Unfortunately, I couldn't find any API to easily query/edit the PCH.
Any ideas?
I have now managed to come to a solution (probably beneficial long-term anyway in terms of stability, even ignoring precompiled headers):
Specify --no-standard-includes -nostdinc++ -nostdlibinc. This ensures the include path is stripped of the directories baked in when gcc/clang was built. You may also want to clear CPATH= and CPLUS_INCLUDE_PATH=.
Reconstruct the path using a central location via CPLUS_INCLUDE_PATH. This means the headers that used to come from /usr/include always come from one central location, and the metadata check will pass. It should also help the stability of builds.
Link against the correct version of the standard library.
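A sketch of the resulting invocations (the /central/include location is hypothetical, and -include-pch is Clang's flag for consuming a PCH):

export CPATH=
export CPLUS_INCLUDE_PATH=/central/include
clang++ -nostdinc++ -nostdlibinc -x c++-header precompiled.hpp -o precompiled.hpp.pch
clang++ -nostdinc++ -nostdlibinc -include-pch precompiled.hpp.pch -c main.cpp -o main.o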

Strategy to omit unused boost src files while shipping source code

I'm using
#include <boost/numeric/ublas/matrix.hpp>
in fact, that's the only Boost file I've included. Now I want to ship the source code, and I was hoping not to have to include all the hundreds of MBs of boost_1_67_0.
How to deal with this issue?
This is simply something you would add to the list of build-dependencies of your C++ source code.
This kind of dependency could be made technically "bound" to your source code distribution via your version control system. In Git, for example, you could link to certain Boost libraries via a sub-module that links to their official git mirrors (github.com/boostorg as of this writing). When cloning your repository, it would then be an option to take in the Boost libraries at the same time.
Though, taking the size of the Boost headers into consideration, having them installed as a system-wide library might be less complicated. Tools like CMake can help you write the logic for header inclusion so you can support different header locations.
Of course, if what you seek is to create a fully isolated copy of your source code, the approach to bake all code into one massive header-file might be an option as well (but it should not be necessary).
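A minimal sketch of the sub-module approach (the extern/ layout is invented for the example, and note that uBLAS itself depends on several other Boost libraries, so a single sub-module is rarely self-contained):

git submodule add https://github.com/boostorg/ublas.git extern/ublas
git submodule update --init

You would then add each module's include/ directory to the compiler's search path with -I.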
You can preprocess the one header file you need, which will expand all its #includes:
c++ -E /usr/include/boost/numeric/ublas/matrix.hpp -o boost_numeric_ublas_matrix.hpp
Be aware though: this will expand even your system header files, so it assumes your users will build on the same platform. If they might compile on different platforms, you should simply omit the Boost code from your project and let the users install it themselves in whatever manner they choose.
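If you do go this route, adding -P (supported by both GCC and Clang) suppresses the #line markers in the preprocessed output, which makes the generated file considerably smaller and easier to read:

c++ -E -P /usr/include/boost/numeric/ublas/matrix.hpp -o boost_numeric_ublas_matrix.hpp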

uint32_t does not name a type

I have shared code, given to me, that compiles on one Linux system but not on a newer system. The error is that uint32_t does not name a type. I realize this is often fixed by including <cstdint> or <stdint.h>. The source code has neither of these includes, and I am looking for an option that doesn't require modifying it, due to internal business practices that I can't control. Since it compiles as-is on one machine, they don't want changes to the source code.
I am not sure if it matters but the older system uses gcc 4.1 while the newer one uses gcc 4.4. I could install different versions of gcc if needed, or add/install library/include files on the newer machine, I have full control of what is on that machine.
What are my options for trying to compile this code on my machine without modifying the source? I can provide other details if needed.
I am not sure if it matters but the older system uses gcc 4.1 while the newer one uses gcc 4.4
GCC's standard headers stopped pulling in <stdint.h> indirectly some time ago. You now have to include something to get it...
I realize that this is often fixed by including the <cstdint> or stdint.h. The source code has neither of these includes and I am trying to seek an option that doesn't require modifying due to internal business practices that I can't control...
I hope I am not splitting hairs... If you can't modify the source files, then are you allowed to modify the build system or configuration files, or the environment? If so, you can use a force-include to insert the file. See Include header files using command line option?
You can modify the Makefile to force-include stdint.h. If the build system honors CFLAGS or CXXFLAGS, then you can force-include it via those flags. Your last choice is probably to do something like export CC="gcc -include stdint.h".
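For example, with a make-based build that honors the conventional variables (your project's variable names may differ, and beware that setting a variable on the make command line overrides whatever the Makefile assigned to it):

make CFLAGS='-include stdint.h'      # C sources
make CXXFLAGS='-include stdint.h'    # C++ sources
export CC='gcc -include stdint.h'    # last resort, via the environment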
The reason I am splitting hairs is OpenSSL and FIPS. The OpenSSL source files for the FIPS Object Module are sequestered and cannot be modified. We have to fallback to modifying supporting scripts and the environment to get some things working as expected.
If you really don't want to amend the file, you could wrap it. Suppose it's called src.c; create a new file src1.c:
#include <stdint.h>
#include "src.c"
And then compile src1.c.
PS: The problem may arise because compilers include other headers in their header files. This can mean some symbols 'officially' defined in other headers are quietly defined when you include a header that isn't specified as including it.
It's an error to write a program relying on a symbol for which the appropriate header hasn't been included - but it's easy to do and difficult to spot.
A changing compiler or version sometimes reveals these quiet issues.
Unfortunately, you can't force your code to work on a newer compiler without modifying something.
If you are allowed to modify the build script and add source files to the project, you might be able to add another source file which, in turn, includes the affected file and the headers it really needs. Remove the affected source files from the build, add the new ones, and rebuild.
If your shared source is using macro magic (e.g. an #include SOME_MACRO, where SOME_MACRO can be defined on the command line), you might be able to get away with modifying build options (to define that macro for every compilation of each file). Apart from relying on modifying the build process, it also relies on a possible-but-less-than-usual usage of macros in your project.
It may be possible to modify the standard headers in your compiler/library installation - assuming you have sufficient access (administrative) to do so. The problem with this is that the problem will almost certainly re-emerge whenever an update/patch to the compiler/library is installed. Over time, this approach will lock the code into relying on an older and older compiler/library that has been superseded - no ability to benefit from compiler bug fixes, evolution of standards, etc. This also severely limits your ability to share the code, and ability of others to use it - anyone who receives the code needs to modify their compiler/library installation.
The basic fact, however, is that your shared code relies on a particular implementation (compiler/library) that exhibits non-standard behaviour. It has failed with an update of that implementation - which removed those non-standard occurrences - and it is likely to fail with other implementations (porting to different compilers in the future, etc.). The real technical solution is to modify the source and #include the needed headers correctly. The real business solution is to make a business case justifying such modifications, citing the inefficiency - which will grow over time - in terms of the cost and effort needed to maintain the shared code whenever it needs to be ported, or whenever a compiler is updated.
Look at the second-to-last line of code above your error: you'll find that everything above it terminates with a comma, and only the last entry should use a semicolon.

What are the pros and cons of specifying an include prefix in the source file versus in the search path parameter of the compiler?

When a C or C++ library comes with several headers, they are usually in a specific folder. For example, OpenCV provides cv.h and highgui.h in a opencv folder.
The most common way to include them is to add this opencv folder to the search path of the preprocessor (e.g. gcc -I/pathto/opencv) and simply include the headers by their filename in the source file (e.g. #include <cv.h>).
There is an alternative, as folders containing headers frequently sit alongside others (e.g. in /usr/include or some path common to the development team) in a parent folder. In this case, it is enough to specify the header folder in the source file (e.g. #include <opencv/cv.h>), provided the parent folder is already in the search path.
The only problem I can think of with this alternative is a system where all the headers sit in a single folder. On the other hand, this alternative prevents ambiguities (e.g. if two libraries both ship a vector.h header), makes it easier to set up another build system, and is probably more efficient with regard to the preprocessor's search for the header file.
Given this analysis, I would tend toward the alternative, but the vast majority of code I found on the internet uses the first. For example, Google returns around 218,000 results for "#include <cv.h>" versus 79,100 for "#include <opencv/cv.h>". Am I missing advantages of the common way, or disadvantages of the alternative?
My personal preference is to have <opencv/cv.h>.
Why? Because I am human, with a limited brain, and much more important things to do than remember that cv.h comes from the opencv library.
Therefore, even though the libraries I work with are always in dedicated folders, I structure them as:
<specific library folder>/include/<library name>/...
This helps me remember where those headers come from.
I have known people to say it was useless, and that IDEs would bring you straight to the file anyway... but:
I don't always use an IDE
I don't always want to open each include file merely to know which libraries this particular file is tied to
It also makes it much easier to organize the include list (and group related includes together).
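To illustrate, with a made-up /libs prefix:

/libs/opencv/include/opencv/cv.h
/libs/boost/include/boost/version.hpp

g++ -I/libs/opencv/include -I/libs/boost/include ...

#include <opencv/cv.h>         // clearly an OpenCV header
#include <boost/version.hpp>   // clearly a Boost header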
They have slightly different purposes, and I think they need to be used carefully.
Consider the fact that in the -I case it was /pathto/opencv. You're suggesting possibly #include <opencv/cv.h> but you'd never write #include </pathto/opencv/cv.h>. There's a reason for that, which is that you expect cv.h is always in a directory called opencv, because that's how it's always released, whereas pathto is just where that library's files happen to have been installed on your machine (or on your distribution, whatever).
Anything that could conceivably differ depending where your code is being compiled, should be in the include path, so that it can be configured without modifying your source. Anything that is guaranteed to be the same wherever that particular cv.h is used can appear in the source, but it doesn't have to, so we need to decide whether we want it there.
As you've already noticed, it's useful to have it there as a disambiguator, especially for a file with a two-character name, but if you think that someone might want to put cv.h in a different place, then you should leave it out. That's pretty much the trade-off you're making - the header should always be in an opencv directory, but is it worth it to you to rely on that as a guarantee, as the price of disambiguating?
I would guess the main problem with using #include <opencv/cv.h> is that you don't necessarily want to add an entire parent path.
Take an example where it has been installed to /usr/local: when using the full-path include form, you'll need to add -I/usr/local/include to your command line. That could have all kinds of side effects.
E.g. for a completely different application, someone may have installed the GNU iconv library there. Then suddenly your application, which also does #include <iconv.h>, grabs the headers from the standalone iconv library instead of the implementation in glibc.
Obviously these are problems that do crop up from time to time, but by including more specific directories, you can hopefully minimise them.
Reasons for the first version:
you are not always able to install the headers/library in a standard path. Sometimes you do not have root access.
you do not want to pollute the include path by adding the paths of all the libraries you have installed.
for example, you can have two versions of the same library installed (in your case OpenCV) and want some of your projects compiled with one library and some with the other. If you put both in the path, then you get a name clash. For me this is one of the main reasons (exactly for OpenCV, I have versions 1.x and 2.x installed, and some projects compiled with 1.x and some with 2.x).

Check for precompiled headers with autotools?

How can I check whether gcc precompiled headers are supported with autoconf? Is there a macro like AC_CHECK_GCH? The project I'm working on has a lot of templates and includes; I have tried writing a .h with the most commonly used includes and compiling it manually. It would be nice to integrate this with the rest of autotools.
It's not clear what you are hoping to accomplish. Are you hoping to distribute a precompiled header in your tarball? Doing so would be almost completely useless. (I actually think it would be completely useless, but I say "almost" because I might be missing something.)
The system headers on the target box (the machine on which your project is being built) are almost certainly different than the ones on your development box. If they are the same, then there's no need to be using autoconf.
If the user of your package happens to be using gcc and wants to use precompiled headers, then they will put the .gch files in the appropriate location and gcc will use them. You don't need to do anything in your package.
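That said, if you want configure to probe for .gch support anyway, I know of no stock AC_CHECK_GCH macro, but a hand-rolled shell check along these lines could be wrapped into one (a sketch):

echo 'int pch_probe;' > conftest.h
if $CXX -x c++-header conftest.h -o conftest.h.gch 2>/dev/null; then
  have_gch=yes
else
  have_gch=no
fi
rm -f conftest.h conftest.h.gch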