What files are actually included when compiling - c++

I have a very large code, a lot of which is legacy code.
I want to know which of all these files are taking part in the compilation.
The code is built with the GNU compilers and is mostly C/C++, with some parts in other languages.
Any advice will be highly appreciated.
Thanks,
Moshe.
I am compiling under linux with a mix of scripts/makefiles. I want to somehow 'wrap' this build with a tool which will give an output of all the source files used in the build, preferably with absolute path names.
What do you say?

If you want to show included headers then whether that's supported and how to do it depends on the compiler.
E.g.,
C:\test> (g++ --help --verbose 2>&1) | find "header"
-print-sysroot-headers-suffix Display the sysroot suffix used to find headers
--sysroot=<directory> Use <directory> as the root directory for headers
-H Print the name of header files as they are used
-MG Treat missing header files as generated files
-MM Like -M but ignore system header files
-MMD Like -MD but ignore system header files
-MP Generate phony targets for all headers
-Wsystem-headers Do not suppress warnings from system headers
-print-objc-runtime-info Generate C header of platform-specific features
-ftree-ch Enable loop header copying on trees
C:\test> (cl /? 2>&1) | find "include"
/FI<file> name forced include file /U<name> remove predefined macro
/u remove all predefined macros /I<dir> add to include search path
/nologo suppress copyright message /showIncludes show include file names
C:\test> _
In the above you can see the relevant options for respectively g++ and Visual C++.
Cheers & hth.,
– Alf

For a given compilation unit, e.g. foo.cpp, add the flags -E -g3 to the call of g++.
This gives you the preprocessed code, in which you can see exactly what was included.

Two options come to mind.
Parse the compilation log
Run a build, save the log, and then search in the log.
Find the files that are opened during the compilation time.
A way to do that might be to use a system tracing tool like strace or library tracing tool like ltrace and then look out for file open calls.
See also How can I detect file accesses in Linux?
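As a rough sketch of the tracing option: assuming the build was traced with something like strace -f -e trace=open,openat -o build.trace make, the Python below pulls the successfully opened source files out of the log. The function name opened_sources, the extension list and the sample log lines are illustrative assumptions, not output from a real build. Relative paths in a real trace are relative to the traced process's working directory, which this sketch does not try to resolve.

```python
import posixpath
import re

# A successful open()/openat() ends in "= <fd>"; failures end in "= -1 E...".
_OPEN = re.compile(r'\bopen(?:at)?\((?:[^,"]+,\s*)?"([^"]+)".*\)\s*=\s*(\d+)')

def opened_sources(strace_log, exts=(".c", ".cc", ".cpp", ".h", ".hpp")):
    """Return the sorted, de-duplicated source paths that were opened successfully."""
    found = set()
    for line in strace_log.splitlines():
        m = _OPEN.search(line)
        if m and m.group(1).endswith(tuple(exts)):
            found.add(posixpath.normpath(m.group(1)))
    return sorted(found)

# Hypothetical excerpt in strace's usual "pid syscall(args) = result" layout.
sample_trace = "\n".join([
    '1201 openat(AT_FDCWD, "/proj/src/main.cpp", O_RDONLY|O_NOCTTY) = 3',
    '1201 openat(AT_FDCWD, "/proj/include/missing.h", O_RDONLY|O_NOCTTY) = -1 ENOENT (No such file or directory)',
    '1201 openat(AT_FDCWD, "/proj/include/app.h", O_RDONLY|O_NOCTTY) = 4',
    '1201 openat(AT_FDCWD, "/usr/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 5',
])
print(opened_sources(sample_trace))
```

Failed lookups (the ENOENT line) are skipped on purpose, since the compiler probes several include directories before it finds a header.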

1. How do you build the application? I.e. what do you type at the terminal to build it?
2. Depending on your answer to (1), find the relevant program used for the build (e.g. make, scons, etc.)
3. Now find the input file(s) to that build program, like Makefile, SConstruct, etc.
4. Look into this build file and the other build files used by it to figure out which source files go into the build

Here is a technique that finds all include files using make.
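It is non-intrusive: you don't need to make any changes to files. Make will do all the work for you. Note that plain make -d still runs the recipes of out-of-date targets; add -n (make -n -d) if you don't want anything to actually be compiled.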
make -d
will run make and emit lots and lots of lines describing the inner processing of the make process. The most important is the consideration of dependencies.
Parsing the output it is easy to find the dependencies, and all other files.
Here is a Linux command line that gets a sorted list of directories that contain include files:
make -d | awk '/Prerequisite/ { if(match($2,".(.*)(/)(.*\\.h)",m)) { c[m[1]]++ ; } } END {for(d in c) print "\"" d "\",";} ' | sort
In this case the directories are quoted and a comma is added at the end, so the output is ready to be pasted into the Visual Studio Code (vscode) configuration file c_cpp_properties.json
Simple variations can produce the grand list of include dependencies, like so:
make -d | awk '/Prerequisite/ { if(match($2,".(.*\\.h)",m)) { c[m[1]]++ ; } } END {for(d in c) print d ;} ' | sort
This should also work with targets (e.g. make All)
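For readers without gawk (the three-argument match() used above is a gawk extension), here is a minimal Python sketch of the same filtering. It assumes the debug output contains lines of the form Prerequisite 'x' is older/newer than target 'y', which is how GNU make phrases them; the function names and the sample lines are made up for illustration.

```python
import posixpath
import re

# GNU make quotes names as 'name' (4.x) or `name' (older releases).
_PREREQ = re.compile(r"Prerequisite [`']([^']+)'")

def prerequisites(make_debug_output, suffix=".h"):
    """Sorted, unique prerequisite names ending in the given suffix."""
    names = set()
    for line in make_debug_output.splitlines():
        m = _PREREQ.search(line)
        if m and m.group(1).endswith(suffix):
            names.add(m.group(1))
    return sorted(names)

def include_dirs(make_debug_output):
    """Sorted, unique directories that contain at least one header prerequisite."""
    return sorted({posixpath.dirname(p) or "." for p in prerequisites(make_debug_output)})

# Hypothetical excerpt of `make -d` output.
sample_debug = "\n".join([
    "   Prerequisite 'src/main.c' is older than target 'main.o'.",
    "   Prerequisite 'include/app.h' is older than target 'main.o'.",
    "   Prerequisite 'include/net/sock.h' is newer than target 'sock.o'.",
    "   Prerequisite 'include/app.h' is older than target 'sock.o'.",
])
print(prerequisites(sample_debug))
print(include_dirs(sample_debug))
```

Headers only show up as prerequisites if the makefiles actually declare them (for example through generated .d files), so an empty result may just mean the dependency information is missing.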

Related

How To Get g++ to list paths to all #included files

I would like to have g++/gcc tell me the paths to everything non-system it is #include-ing in a C++ build. It turns out that is a tough search, as Google misinterprets it about ten different ways.
I want these filenames and paths so I can add them to the search path for Exuberant CTAGS. We have a huge project and if I use ctags on the whole thing it takes about half an hour to generate the tags file and nearly as long for the editor to do a look-up.
We use CMakeLists.txt to do the compiling. If there is a directive I can paste into the CMakeLists.txt, that would be extra wonderfulness.
I don't really need the default paths and filenames; Jonathan Wakely gave a good tool for that here. I think that pretty much covers the fact that this is a cross compile job. I don't need the cross-system files either.
Try gcc or g++ with the -H option (to the preprocessor part of it). From the doc:
-H
Print the name of each header file used, in addition to other normal activities. Each name is indented to show how deep in the ‘#include’ stack it is. Precompiled header files are also printed, even if they are found to be invalid; an invalid precompiled header file is printed with ‘...x’ and a valid one with ‘...!’ .
It tells you all the headers which are included. You may filter out (with grep -v or awk) those that you don't want.
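As a sketch of that filtering step in Python rather than grep or awk: the function below reads the text that -H prints (it goes to stderr) and keeps the depth and path of each header, dropping anything under a prefix you consider "system". The name parse_h_output, the default /usr/ prefix and the sample text are assumptions for illustration.

```python
import re

# Each -H line is a run of dots (the include depth), a space, then the path.
_H_LINE = re.compile(r"^(\.+) (.+)$")

def parse_h_output(stderr_text, exclude_prefixes=("/usr/",)):
    """Return (depth, path) pairs from -H output, skipping excluded prefixes."""
    result = []
    for line in stderr_text.splitlines():
        m = _H_LINE.match(line)
        if not m:
            continue  # diagnostics, the trailing include-guard notes, etc.
        depth, path = len(m.group(1)), m.group(2)
        if path.startswith(tuple(exclude_prefixes)):
            continue
        result.append((depth, path))
    return result

# Hypothetical -H output for a small translation unit.
sample_h = "\n".join([
    ". src/widget.h",
    ".. /usr/include/stdio.h",
    ".. src/util/log.h",
    "Multiple include guards may be useful for:",
    "src/widget.h",
])
print(parse_h_output(sample_h))
```

For a cross build the exclusion prefix would be the sysroot of the cross toolchain rather than /usr/.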
You could also consider developing your GCC plugin to register these headers somewhere (e.g. in your sqlite database), perhaps inspired by this draft report, or the CHARIOT or DECODER European projects. You could also consider using, or extending, the Clang static analyzer.
In contrast to the -M options suggested in Oliver Matthews' answer, -H does not tell you anything more than that (but it does list all the included files).
You need to invoke g++ with the -M option.
From the manual:
Instead of outputting the result of preprocessing, output a rule
suitable for make describing the dependencies of the main source file.
The preprocessor outputs one make rule containing the object file name
for that source file, a colon, and the names of all the included
files, including those coming from -include or -imacros command line
options.
It's worth reading the manual to consider the other -M sub options (-MM and -MF in particular may be of use).
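If you want to consume that rule from a script, a minimal Python sketch could look like this. It assumes the usual layout of the output (a target, a colon, the dependencies, with backslash-newline continuations) and paths without spaces or colons; parse_dep_rules and the sample rule are illustrative names, not part of GCC.

```python
def parse_dep_rules(text):
    """Map each make target to its list of dependencies."""
    rules = {}
    joined = text.replace("\\\n", " ")  # undo backslash-newline continuations
    for line in joined.splitlines():
        target, sep, deps = line.partition(":")
        if sep:
            rules[target.strip()] = deps.split()
    return rules

# Hypothetical output of: g++ -MM main.cpp
sample_rule = "main.o: main.cpp include/app.h \\\n include/util.h\n"
print(parse_dep_rules(sample_rule))
```

With -MM the system headers are already left out, so the dependency list is close to the list of project files wanted for ctags.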

Best practice for dependencies on #defines?

Is there a best practice for supporting dependencies on C/C++ preprocessor flags like -DCOMPILE_WITHOUT_FOO? Here's my problem:
> setenv COMPILE_WITHOUT_FOO
> make <Make system reads environment, sets -DCOMPILE_WITHOUT_FOO>
<Compiles nothing, since no source file has changed>
What I would like to do is have all files that rely on #ifdef statements get recompiled:
> setenv COMPILE_WITHOUT_FOO
> make
g++ FileWithIfdefFoo.cpp
What I do not want is to have to recompile everything if the value of COMPILE_WITHOUT_FOO has not changed.
I have a primitive Python script working (see below) that basically writes a header file FooDefines.h and then diffs it to see if anything is different. If it is, it replaces FooDefines.h and then the conventional source file dependency takes over. The define is not passed on the command line with -D. The disadvantage is that I now have to include FooDefines.h in any source file that uses the #ifdef, and also I have a new, dynamically generated header file for every #ifdef. If there's a tool to do this, or a way to avoid using the preprocessor, I'm all ears.
import os, sys

def checkStatus(status):
    # os.system returns a wait status; decode it into (failed, status)
    if os.WIFEXITED(status):
        # Check return code
        failed = os.WEXITSTATUS(status) != 0
    else:
        # Caught a signal, coredump, etc.
        failed = True
    return failed, status

def makeDefineFile(filename, text):
    tmpDefineFile = "/tmp/%s%s" % (os.getenv("USER"), filename)  # Use tempfile?
    existingDefineFile = filename
    with open(tmpDefineFile, 'w') as output:
        output.write(text)
    status = os.system("diff -q %s %s" % (tmpDefineFile, existingDefineFile))
    # If we failed for any reason (file didn't exist, different, etc.)
    if checkStatus(status)[0]:
        # Copy our tmp into the new file
        status = os.system("cp %s %s" % (tmpDefineFile, existingDefineFile))
        failed, status = checkStatus(status)
        print(failed, status)
        if failed:
            print("ERROR: Could not update define in makeDefine.py")
            sys.exit(status)
This is certainly not the nicest approach, but it would work:
find . \( -name '*.cpp' -o -name '*.h' \) -exec grep -l COMPILE_WITHOUT_FOO {} \; | xargs touch
That will look through your source code for the macro COMPILE_WITHOUT_FOO, and "touch" each file, which will update the timestamp. Then when you run make, those files will recompile.
If you have ack installed, you can simplify this command:
ack -l --cpp COMPILE_WITHOUT_FOO | xargs touch
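The same touch-on-match idea can be sketched in Python, which avoids quoting problems with unusual file names. This is only an illustration: touch_users_of is a made-up name, the extension list is an assumption, and the demo runs against a throwaway temporary directory rather than a real source tree.

```python
import os
import tempfile

def touch_users_of(macro, root=".", exts=(".cpp", ".h")):
    """Update the mtime of every source file under root that mentions macro."""
    touched = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith(tuple(exts)):
                continue
            path = os.path.join(dirpath, name)
            with open(path, errors="replace") as f:
                if macro in f.read():
                    os.utime(path, None)  # set mtime to "now", like touch
                    touched.append(path)
    return sorted(touched)

# Demo on a temporary tree with made-up files.
demo_root = tempfile.mkdtemp()
for name, body in [("a.cpp", "#ifdef COMPILE_WITHOUT_FOO\n#endif\n"),
                   ("b.cpp", "int main() { return 0; }\n"),
                   ("notes.txt", "COMPILE_WITHOUT_FOO\n")]:
    with open(os.path.join(demo_root, name), "w") as f:
        f.write(body)
    os.utime(os.path.join(demo_root, name), (0, 0))  # pretend every file is old
touched = touch_users_of("COMPILE_WITHOUT_FOO", demo_root)
print([os.path.basename(p) for p in touched])
```

Like the shell version, this is a plain text search, so it also touches files that only mention the macro in a comment.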
I don't believe that it is possible to determine automagically. Preprocessor directives don't get compiled into anything. Generally speaking, I expect to do a full recompile if I depend on a define. DEBUG being a familiar example.
I don't think there is a right way to do it. If you can't do it the right way, then the dumbest way possible is probably your best option: do a text search for COMPILE_WITHOUT_FOO and create dependencies that way. I would classify this as a shenanigan, and if you are writing shared code I would recommend seeking pretty significant buy-in from your coworkers.
CMake has some facilities that can make this easier. You would create a custom target to do this. You may trade problems here though, maintaining a list of files that depend on your symbol. Your text search could generate that file if it changed though. I've used similar techniques checking whether I needed to rebuild static data repositories based on wget timestamps.
Cheetah is another tool which may be useful.
If it were me, I think I'd do full rebuilds.
Your problem seems tailor-made to treat it with autoconf and autoheader, writing the values of the variables into a config.h file. If that's not possible, consider reading the "-D" directives from a file and writing the flags into that file.
Under all circumstances, you have to avoid builds that depend on environment variables only. You have no way of telling when the environment changed. There is a definite need to store the variables in a file. The cleanest way is autoconf, autoheader and a source tree with multiple build trees; the second-cleanest is re-running configure for each switch of compile context; and the third-cleanest is a file containing all mutable compiler switches, on which all objects that depend on these switches themselves depend.
When you choose to implement the third way, remember not to update this file unnecessarily, e.g. by constructing it in a temporary location and copying it conditionally on diff, and then make rules will be capable of conditionally rebuilding your files depending on flags.
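A minimal Python sketch of that "update only when different" step follows. Instead of writing a temporary file and diffing it, it compares the new text with the existing file in memory, which has the same effect on the timestamp. The name write_if_changed and the file name build_flags.txt are hypothetical.

```python
import os
import tempfile

def write_if_changed(path, text):
    """Write text to path only if the content differs; return True if written."""
    try:
        with open(path) as f:
            if f.read() == text:
                return False  # leave the timestamp alone so make sees no change
    except FileNotFoundError:
        pass  # first run: the file does not exist yet
    with open(path, "w") as f:
        f.write(text)
    return True

# Demo in a temporary directory with a made-up flags file.
flags_path = os.path.join(tempfile.mkdtemp(), "build_flags.txt")
first = write_if_changed(flags_path, "-DCOMPILE_WITHOUT_FOO\n")
second = write_if_changed(flags_path, "-DCOMPILE_WITHOUT_FOO\n")
third = write_if_changed(flags_path, "\n")
print(first, second, third)
```

Objects that depend on the switches would then list the flags file as a prerequisite, so they rebuild only after a call that returned True.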
One way to do this is to store each #define's previous value in a file, and use conditionals in your makefile to force update that file whenever the current value doesn't match the previous. Any files which depend on that macro would include the file as a dependency.
Here is an example. It will update file.o if either file.c changed or the variable COMPILE_WITHOUT_FOO is different from last time. It uses $(shell ) to compare the current value with the value stored in the file envvars/COMPILE_WITHOUT_FOO. If they are different, then it defines a rule for that file which depends on force, a target that is always considered updated.
file.o: file.c envvars/COMPILE_WITHOUT_FOO
	gcc -DCOMPILE_WITHOUT_FOO=$(COMPILE_WITHOUT_FOO) -c $< -o $@
ifneq ($(strip $(shell cat envvars/COMPILE_WITHOUT_FOO 2> /dev/null)), $(strip $(COMPILE_WITHOUT_FOO)))
force: ;
envvars/COMPILE_WITHOUT_FOO: force
	mkdir -p envvars
	echo "$(COMPILE_WITHOUT_FOO)" > envvars/COMPILE_WITHOUT_FOO
endif
If you want to support having macros undefined, you will need to use the ifdef or ifndef conditionals, and have some indication in the file that the value was undefined the last time it was run.
Jay pointed out that "make triggers on date time stamps on files".
Theoretically, you could have your main makefile, call it m1, include variables from a second makefile called m2. m2 would contain a list of all the preprocessor flags.
You could have a make rule for your program depend on m2 being up-to-date.
The rule for making m2 would import all the environment variables (and thus the preprocessor definitions).
The trick is that the rule for making m2 would detect whether there was a diff from the previous version. If so, it would set a variable that forces a "make all" and/or "make clean" for the main target. Otherwise, it would just update the timestamp on m2 and not trigger a full remake.
Finally, the rule for the normal target (make all) would source in the preprocessor directives from m2 and apply them as required.
This sounds easy/possible in theory, but in practice it is much harder to get this type of thing to work in GNU Make. I'm sure it can be done though.
make triggers on date time stamps on files. A dependent file being newer than what depends on it triggers it to recompile. You'll have to put your definition for each option in a separate .h file and ensure that those dependencies are represented in the makefile. Then if you change an option the files dependent on it would be recompiled automatically.
If it takes into account include files that include files you won't have to change the structure of the source. You could include a "BuildSettings.h" file that included all the individual settings files.
The only tough problem would be if you made it smart enough to parse the include guards. I've seen problems with compilation because of include file name collisions and order of include directory searches.
Now that you mention it I should check and see if my IDE is smart enough to automatically create those dependencies for me. Sounds like an excellent thing to add to an IDE.

Determine list of source files (*.[ch]) for a complex build with scons

Suppose you have a complex source tree for a C project, with lots of directories and lots of files. The scons build supports multiple targets (i386, sparc, powerpc) and multiple variants (debug, release). There's an SConstruct at the root (referencing various SConscript files) that does the right thing for all of these when called with arguments specifying target and variant, e.g. scons target=i386 variant=release.
Is there an easy way to determine which source files (*.c and *.h) each of these builds will use (they are all slightly different)? My theory is that scons needs to compute this file set anyway to know which files to compile and when to recompile. Can it provide this information?
What I do not want to do:
Log a verbose build and postprocess it (probably wouldn't tell *.h files anyway)
find . -name '*.[ch]' also prints unwanted files for unit testing and other cruft and is not target specific
Ideally I would like to do scons target=i386 variant=release printfileset and see the proper list of *.[ch] files. This list could then serve as input for further source file munging tools like doxygen.
There are a few questions all squashed together here:
You can prevent SCons from running the compiler using the --dry-run flag
You can get a dependency tree from SCons by using --debug=tree, or --tree=all flags, depending on which version you are running
Given a list of files, one per line, you can use grep to filter out only the things that are interesting for you.
When you put all of that together you end up with something like:
scons target=i386 variant=release printfileset -n --tree=all | egrep -i '^ .*\.(c|h|cpp|cxx|hpp|inl)$'
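If the list needs de-duplicating before it goes to a tool like doxygen, the grep step can be replaced by a few lines of Python. This sketch assumes the tree is drawn with "+-" in front of each node and "|" or spaces as indentation; tree_sources and the sample tree are invented for illustration.

```python
import re

# A tree line is indentation made of spaces and '|', then "+-", then the node.
_NODE = re.compile(r"^[ |]*\+-(.+)$")

def tree_sources(tree_output, exts=(".c", ".h")):
    """Sorted, unique tree nodes whose names end in one of exts."""
    files = set()
    for line in tree_output.splitlines():
        m = _NODE.match(line)
        if not m:
            continue  # build commands and "scons: ..." status lines
        node = m.group(1).strip("[]")  # --tree=prune shows repeats in brackets
        if node.endswith(tuple(exts)):
            files.add(node)
    return sorted(files)

# Hypothetical dependency tree for one small program.
sample_tree = "\n".join([
    "scons: Building targets ...",
    "+-.",
    "  +-SConstruct",
    "  +-main",
    "  | +-main.o",
    "  | | +-src/main.c",
    "  | | +-include/app.h",
    "  | | +-/usr/bin/gcc",
    "  +-src/main.c",
])
print(tree_sources(sample_tree))
```

Because the same source file appears once per node that depends on it, collecting into a set keeps the result short.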

Compiling C++Builder project on command line

Is there a way to compile a C++Builder project (a specific build configuration) from the command line?
Something like:
CommandToBuild ProjectNameToBuild BuildConfiguration ...
There are different ways of automating your builds in C++Builder (in my experience; I'm speaking about old C++Builder versions like 5 and 6).
You can manually call compilers - bcc32.exe (also dcc32.exe, brcc32.exe and tasm32.exe if you have to compile Delphi units, resource files or assembly language lines of code in your sources) and linker - ilink32.exe.
In this case, you will need to manually provide the necessary input files, paths, and keys as arguments for each stage of compilation and linking.
All data necessary for compilation and linking is stored in the project files and, fortunately, the C++Builder installation includes special utilities that automate this dirty work: they provide the necessary parameters to the compilers and linker and run them. Their names are bpr2mak.exe and make.exe.
First you run bpr2mak.exe, passing your project *.bpr or *.bpk file as a parameter. You get a *.mak file as output, which you then feed to make.exe, which finally builds your project.
Look at this simple cmd script:
@bpr2mak.exe YourProject.bpr
@ren YourProject.mak makefile
@make.exe
You can provide the real name of "YourProject.mak" as a parameter to make.exe, but the most straightforward way is to rename the *.mak file to "makefile", and then make.exe will find it.
To have different build options, you can do the following:
The first way: you can open your project in the IDE, edit the options and save it under a different project name in the same folder (usually there are two project files, for debug and release compile options). Then you can give your build script the different *.bpr files. This way looks simple, because it doesn't involve scripting, but the user will have to manually keep all project files consistent if something changes (forms or units added and so on).
The second way is to make a script which edits the project file or make file. You will have to parse files, find compiler and linker related lines and put in the necessary keys. You can do it even in a cmd script, but surely a specialised scripting language like Python is preferable.
Use:
msbuild project.cbproj /p:config=[build configuration]
More specifics can be found in Building a Project Using an MSBuild Command.
A little detail not mentioned.
Suppose you have external dependencies and that the .dll file does not initially exist in your folder
You will need to include the external dependencies in the ILINK32.CFG file.
This file is usually in the folder
C:\Program Files (x86)\Borland\CBuilder6\Bin\ilink32.cfg
(consider your installation location)
In this file, add the entry for your dependencies.
Example: A dependency for TeeChart, would look like this (consider the last parameter):
-L"C:\Program Files (x86)\Borland\CBuilder6\lib";"C:\Program Files (x86)\Borland\CBuilder6\lib\obj";"C:\Program Files (x86)\Borland\CBuilder6\lib\release";"C:\Program Files (x86)\Steema Software\TeeChart 805 for Builder 6\Builder6\Include\";"C:\Program Files (x86)\Steema Software\TeeChart 805 for Builder 6\Builder6\Lib\"
You will also need to pass the -f option to make.exe when compiling.
In cmd, do:
//first generate the file.mak
1 - bpr2mak.exe MyProject.bpr
//then compile the .mak
2 - make.exe -f MyProject.mak
You can also generate a temporary mak file with another name, as the answer above says, directly with bpr2mak
bpr2mak.exe MyProject.bpr -oMyTempMak.mak

Help with rake dependency mapping

I'm writing a Rakefile for a C++ project. I want it to identify #includes automatically, forcing the rebuilding of object files that depend on changed source files. I have a working solution, but I think it can be better. I'm looking for:
Suggestions for improving my function
Libraries, gems, or tools that do the work for me
Links to cool C++ Rakefiles that I should check out that do similar things
Here's what I have so far. It's a function that returns the list of dependencies given a source file. I feed in the source file for a given object file, and I want a list of files that will force me to rebuild my object file.
def find_deps( file )
  deps = Array.new
  # Find all include statements
  cmd = "grep -r -h -E \"#include\" #{file}"
  includes = `#{cmd}`
  # each_line, because String#each no longer exists in Ruby 1.9+
  includes.each_line do |line|
    dep = line[ /\.\/(\w+\/)*\w+\.(cpp|h|hpp)/ ]
    unless dep.nil?
      deps << dep # Add the dependency to the list
      deps += find_deps( dep )
    end
  end
  return deps
end
I should note that all of my includes look like this right now:
#include "./Path/From/Top/Level/To/My/File.h" // For top-level files like main.cpp
#include "../../../Path/From/Top/To/My/File.h" // Otherwise
Note that I'm using double quotes for includes within my project and angle brackets for external library includes. I'm open to suggestions on alternative ways to do my include pathing that make my life easier.
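For comparison, here is the same recursive scan sketched in Python with a guard against headers that include each other. It is not the Rakefile function above: it resolves each quoted include relative to the directory of the including file, ignores angle-bracket includes, and the file names in the demo are invented.

```python
import os
import re
import tempfile

# Only quoted includes are followed; <...> includes are treated as external.
_INCLUDE = re.compile(r'^\s*#\s*include\s+"([^"]+)"', re.MULTILINE)

def find_deps(path, seen=None):
    """Return the sorted set of files that path includes, directly or indirectly."""
    seen = set() if seen is None else seen
    with open(path, errors="replace") as f:
        text = f.read()
    for inc in _INCLUDE.findall(text):
        dep = os.path.normpath(os.path.join(os.path.dirname(path), inc))
        if dep not in seen and os.path.exists(dep):
            seen.add(dep)         # mark before recursing so cycles terminate
            find_deps(dep, seen)
    return sorted(seen)

# Demo on a temporary tree with a deliberate include cycle between a.h and b.h.
demo = tempfile.mkdtemp()
sources = {
    "main.cpp": '#include "./a.h"\n#include <vector>\n',
    "a.h": '#include "./b.h"\n',
    "b.h": '#include "./a.h"\n',
}
for name, body in sources.items():
    with open(os.path.join(demo, name), "w") as f:
        f.write(body)
deps = [os.path.basename(p) for p in find_deps(os.path.join(demo, "main.cpp"))]
print(deps)
```

A text scan like this ignores conditional compilation, so it can report headers that the preprocessor would actually skip.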
Use the gcc command to generate a Make dependency list instead, and parse that:
g++ -M -MM -MF - inputfile.cpp
See man gcc or info gcc for details.
I'm sure there are different schools of thought with respect to what to put in #include directives. I advise against putting the whole path in your #includes. Instead, set up the proper include paths in your compile command (with -I). This makes it easier to relocate files in the future and more readable (in my opinion). It may sound minor, but the ability to reorganize as a project evolves is definitely valuable.
Using the preprocessor (see @greyfade) to generate the dependency list has the advantage that it will expand the header paths for you based on your include dirs.
Update: see also the Importing Dependencies section of the Rakefile doc for a library that reads the makefile dependency format.